Starting with WebMCP

If you follow AI-related technology news, you are certainly aware of the constant excitement. On a typical morning, Anthropic releases a new frontier model and every second post on X talks about how this is “changing everything”. Everything changes again after your lunch break, when OpenAI launches its new model. Later that day, you find out that everything has been disrupted once more by a new Chinese open-weight model with a name that reminds you of a slightly suspicious product brand on Temu.

Through all the buzz, it is hard to recognize the news that could actually have an impact. In my eyes, WebMCP falls into this category. WebMCP is a proposal for a new browser JavaScript interface and a standard for HTML attributes, both meant to provide context to AI agents. You can think of it as an extension to web applications that lets AI agents understand how to interact with them.
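
To make the JavaScript side a bit more concrete, here is a minimal sketch of what describing and registering a tool could look like. It assumes, based on my reading of the draft, that the browser exposes an object at navigator.modelContext with a registerTool method and that a tool consists of a name, a description, an input schema and an execute function; since this is a draft, those names may change. ToolDescriptor, ModelContextLike, registerIfSupported and addToCart are hypothetical names I made up for illustration.

```typescript
// Shape of a tool description. The field names follow my reading of the
// draft and are assumptions, not a stable API.
interface ToolDescriptor {
  name: string;
  description: string;
  inputSchema: Record<string, unknown>; // JSON Schema for the arguments
  execute: (args: Record<string, unknown>) => unknown;
}

// Minimal view of the object the browser is assumed to expose.
interface ModelContextLike {
  registerTool(tool: ToolDescriptor): void;
}

// Hypothetical helper: registers the tool only if the host offers the
// interface, and reports whether it did.
function registerIfSupported(
  host: { modelContext?: ModelContextLike },
  tool: ToolDescriptor,
): boolean {
  if (!host.modelContext) return false;
  host.modelContext.registerTool(tool);
  return true;
}

// Hypothetical example tool for a web shop.
const addToCart: ToolDescriptor = {
  name: "add_to_cart",
  description: "Adds a product to the shopping cart.",
  inputSchema: {
    type: "object",
    properties: {
      productId: { type: "string" },
      quantity: { type: "number" },
    },
    required: ["productId"],
  },
  // Falls back to a quantity of 1 when the agent does not pass one.
  execute: (args) => ({
    ok: true,
    productId: args.productId,
    quantity: args.quantity ?? 1,
  }),
};

// In a browser with the feature enabled, the host would be `navigator`:
// registerIfSupported(
//   navigator as unknown as { modelContext?: ModelContextLike },
//   addToCart,
// );
```

Passing the host in as a parameter keeps the sketch runnable outside of a browser and doubles as feature detection: in a browser without WebMCP, the helper simply returns false and the page behaves as before.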

But can’t agents already interact with webpages? Yes, agents browse the web today using several different approaches. The authors of the Draft list the existing web automation techniques and explain that WebMCP is an additional layer for agents. In the announcement of WebMCP on the Chrome developer blog, André Cipriani Bandarra expects WebMCP to improve the “speed, reliability, and precision” of agentic interactions. I would add that it will also make agents cheaper to run.

Comparison to MCP and MCP Apps

The idea of making an application or a data source easily accessible to AI agents might sound familiar, and the name already gives it away. The Model Context Protocol (MCP) is the established standard, and you will find many similarities when reading the WebMCP Draft Community Group Report.

An interesting question when comparing WebMCP and MCP is: who or what interacts with the technology, and in which setting?
According to its stated goals, WebMCP focuses on a human-in-the-loop approach. The authors envision web applications that users operate in cooperation with agents: the user delegates tasks to the agent while remaining present to confirm or decline critical operations. Of course, this design goal does not mean that human oversight is enforced. If a web application is augmented with WebMCP, it is possible and likely that agents will also operate it without human oversight. MCP, by contrast, is agnostic about whether a human is overseeing the interaction.
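
The confirm/decline idea can be illustrated with a small, generic sketch. This is not the WebMCP API; it only shows the pattern of putting a user confirmation in front of a critical operation. All names (Confirm, guarded, deleteAccount) are hypothetical, and I assume a synchronous confirmation callback, similar to what window.confirm offers in a browser.

```typescript
// A callback that asks the user and returns true (confirm) or false (decline).
type Confirm = (message: string) => boolean;

// Hypothetical wrapper: runs `action` only after the user has confirmed.
function guarded<A, R>(
  label: string,
  confirm: Confirm,
  action: (args: A) => R,
): (args: A) => R | { declined: true } {
  return (args) => {
    if (!confirm(`Allow the agent to ${label}?`)) {
      // The user declined, so the critical operation is never executed.
      return { declined: true };
    }
    return action(args);
  };
}

// Hypothetical critical operation exposed to an agent.
// In a browser, the callback could be (msg) => window.confirm(msg).
const deleteAccount = guarded(
  "delete this account",
  () => false, // stand-in for a user who declines
  (args: { id: string }) => ({ deleted: args.id }),
);
```

If no user is present to answer the confirmation, the operation stays blocked, which is exactly the situation the human-in-the-loop goal is designed around.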

MCP Apps is an interesting extension to the MCP standard. It allows a classical MCP server to return UI widgets which are then rendered by the MCP client. Those widgets are interactive and able to trigger tool calls on the MCP server and the client can update the widgets’ state.

There is in fact much more to be said about the comparison between WebMCP, MCP and MCP Apps but I think this can wait for another time.

How do I get started?

If you want to try out WebMCP, you need the following:

Chrome version 146
the feature flag enable-webmcp-testing activated
the Model Context Tool Inspector extension installed
a Gemini API key, which you can get from Google AI Studio

Once you have all of this, head over to the demo page and open the sidebar provided by the extension. Add your API key there, and you are ready to operate the website via the sidebar's text input field.

Where do I learn how to implement WebMCP?

If you want to find out how to develop your own web applications with WebMCP, here are all the resources I have found:

the Draft Community Group Report
their GitHub project
the GitHub project containing the code of the demos and tools, including the extension
and then there is this Google Docs file

Conclusion

This is bleeding edge, and the topic is currently not easy to approach. It will be very interesting to see how WebMCP develops, and I see a good chance that it will be a hot topic in the near future.