Andy Halliday
So they had to be looking at the screen and so on.
Then we talked about DOM, D-O-M, Document Object Model, which exists.
And you can instrument your website so that all of the tools and elements, the action elements, the buttons, the modal windows, everything else, those are a part of a model that can be read by an agent.
And that bypasses the requirement to look at the pixels and try to figure out the page first, and then act like a human who's never used the web before and try to figure out where to click the cursor and then what buttons to press.
So DOM, I had said, was going to be the next kind of wave of action-oriented browser agents.
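To make the DOM idea concrete, here is a minimal sketch of an agent reading action elements out of a page model instead of looking at pixels. Everything in it is an assumption for illustration: `SimpleNode` is a simplified stand-in for a real DOM node, and `collectActions`, `ACTION_TAGS`, and the sample page are hypothetical names, not part of any browser or agent API.

```typescript
// Hypothetical, simplified stand-in for a DOM node.
// A real page would expose the browser's own Element objects instead.
interface SimpleNode {
  tag: string;
  attrs: Record<string, string>;
  children: SimpleNode[];
}

// What the agent gets back: where the element is and what it is called.
interface ActionElement {
  path: string;
  tag: string;
  label: string;
}

// Assumed set of tags that count as "action elements" for this sketch.
const ACTION_TAGS = new Set(["button", "a", "input", "select", "dialog"]);

// Walk the tree depth-first and collect every actionable node.
function collectActions(node: SimpleNode, path: string = node.tag): ActionElement[] {
  const found: ActionElement[] = [];
  if (ACTION_TAGS.has(node.tag)) {
    const label = node.attrs["aria-label"] ?? node.attrs["id"] ?? "";
    found.push({ path, tag: node.tag, label });
  }
  node.children.forEach((child, index) => {
    found.push(...collectActions(child, `${path}/${child.tag}[${index}]`));
  });
  return found;
}

// Illustrative page: one button inside a div, plus a modal dialog.
const page: SimpleNode = {
  tag: "body",
  attrs: {},
  children: [
    {
      tag: "div",
      attrs: {},
      children: [
        { tag: "button", attrs: { "aria-label": "Add to cart" }, children: [] },
      ],
    },
    { tag: "dialog", attrs: { id: "signup-modal" }, children: [] },
  ],
};

const actions = collectActions(page);
console.log(actions);
```

Even in this sketch the agent still has to work out what each node means from its tag and label, which is the low-level part.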
Now, Google comes out with WebMCP.
So how is that different?
Well, instead of using DOM, the low-level representation of the nodes on the web page, which agents have to kind of figure out, WebMCP gives high-level action interfaces to agents.
So, for example, it's a new standard and API that any website developer can add to their site.
It's kind of a turnkey package.
And you can expose these structured action objects on the site.
They're functions with JSON schemas.
And they have return types.
And you expose those to the agents directly.
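As a rough sketch of what "functions with JSON schemas and return types" could look like, here is a small self-contained registry. This is not the WebMCP API: `ActionTool`, `registerAction`, `callAction`, the `checkout_now` action, and the `cartId` argument are all hypothetical names made up for illustration, and the schema check is deliberately minimal.

```typescript
// NOTE: hypothetical names throughout; this is not the WebMCP API.

// A very small subset of JSON Schema, enough for flat arguments.
type JsonSchema = {
  type: "object";
  properties: Record<string, { type: "string" | "number" | "boolean" }>;
  required: string[];
};

// The structured return type every action hands back to the agent.
interface ActionResult {
  ok: boolean;
  message: string;
}

// One exposed action: a name, a description, an input schema, and a function.
interface ActionTool {
  name: string;
  description: string;
  inputSchema: JsonSchema;
  execute: (args: Record<string, unknown>) => ActionResult;
}

const registry = new Map<string, ActionTool>();

// The website registers the actions it wants agents to see.
function registerAction(tool: ActionTool): void {
  registry.set(tool.name, tool);
}

// The agent calls an action by name; arguments are checked against the schema.
function callAction(name: string, args: Record<string, unknown>): ActionResult {
  const tool = registry.get(name);
  if (!tool) return { ok: false, message: `unknown action: ${name}` };
  for (const key of tool.inputSchema.required) {
    if (!(key in args)) return { ok: false, message: `missing argument: ${key}` };
  }
  for (const [key, value] of Object.entries(args)) {
    const expected = tool.inputSchema.properties[key]?.type;
    if (expected && typeof value !== expected) {
      return { ok: false, message: `wrong type for argument: ${key}` };
    }
  }
  return tool.execute(args);
}

// Illustrative action exposed by the site.
registerAction({
  name: "checkout_now",
  description: "Check out the given cart",
  inputSchema: {
    type: "object",
    properties: { cartId: { type: "string" } },
    required: ["cartId"],
  },
  execute: (args) => ({ ok: true, message: `checked out cart ${String(args.cartId)}` }),
});

console.log(callAction("checkout_now", { cartId: "abc" }));
```

The point of the design is that the agent never touches page nodes: it sees a name, a schema, and a typed result.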
So what can you do with that?
Well, imagine that an agent can call a high-level action like "checkout now."
Right.
That's going to be a little package that's a skill on the website that the website can then, you know, activate.
You could filter results.