Tyler
First of all, GPT-5.5 is our best model ever for general work or knowledge work.
And so it's really good at using computers.
The next thing is, well, what harness do you give the model?
Or what information is the model seeing?
And what tools does it have to use the computer?
A lot of early versions of giving a model access to a computer were just giving it screenshots of the computer.
But there's a lot of secret sauce in our implementation where the model actually gets text representations of what's on screen from frameworks such as the accessibility APIs.
And so the model is much more efficient when it has access to all this information.
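As a rough illustration of what a "text representation of what's on screen" could look like, here is a minimal sketch. It is not the implementation described in the conversation: `UINode` and `render_tree` are hypothetical names, and the tree is hand-built rather than read from any real accessibility API.

```python
from dataclasses import dataclass, field


@dataclass
class UINode:
    """Hypothetical stand-in for one element in an accessibility-style tree."""
    role: str
    label: str = ""
    children: list["UINode"] = field(default_factory=list)


def render_tree(node, depth=0):
    """Return the tree as indented text lines, one UI element per line."""
    line = "  " * depth + node.role
    if node.label:
        line += f' "{node.label}"'
    lines = [line]
    for child in node.children:
        lines.extend(render_tree(child, depth + 1))
    return lines


# A hand-built example screen (assumed data, for illustration only).
window = UINode("window", "Settings", [
    UINode("button", "Save"),
    UINode("group", "", [UINode("checkbox", "Dark mode")]),
])

text = "\n".join(render_tree(window))
print(text)
```

A few lines of text like this name each control and its role directly, which is the kind of information a screenshot only conveys through pixels.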
And then the last bit that I honestly think, well, I don't know if I would say it's underappreciated because I feel like people really appreciate it, but it's the level of craft that was put into how it feels when the agent is using the computer.
So for instance, you could totally just have the agent click around on your screen and just have that be invisible.
The computer is updating as clicks happen, but the team put a ton of care into exactly the animation that this mouse cursor takes as it goes between the different click positions.
Yeah, it was really fun to talk about that and jam on that with them.
We actually made some interesting trade-offs.
Having this animation actually slows down how quickly the agent can work by just a tiny bit.
But it means that it's so much easier as a human for me to understand the system and therefore to trust the system.
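The trade-off described here can be sketched with a simple eased cursor path between two click positions. This is only an assumed illustration: the function names, the smoothstep easing curve, and the step count are hypothetical and are not taken from the product being discussed.

```python
def ease_in_out(t):
    """Smoothstep easing: starts slow, speeds up, ends slow (t in [0, 1])."""
    return t * t * (3 - 2 * t)


def cursor_path(start, end, steps):
    """Return steps + 1 points moving from start to end along an eased line."""
    (x0, y0), (x1, y1) = start, end
    points = []
    for i in range(steps + 1):
        s = ease_in_out(i / steps)
        points.append((x0 + (x1 - x0) * s, y0 + (y1 - y0) * s))
    return points


# Move between two hypothetical click positions in 4 animation steps.
path = cursor_path((0, 0), (100, 50), 4)
for point in path:
    print(point)
```

Each extra step is a frame the agent waits on before clicking, which is the small slowdown mentioned above; in exchange, a person watching can follow where the cursor is going.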
What do you tell them?
Okay, cool.
Is this person an engineer or are they like a knowledge worker?
Let's say not an engineer because I feel like, well, no, let's do, let's do both.