Haseeb Qureshi
If all of a sudden this efficiency is getting captured by the user, that's good for users.
It's good for crypto.
It ultimately means that crypto users are going to benefit more and more from all the stuff that exists on chain, but it's very counterintuitive how it's going to work, and it's not going to happen all at once.
It's going to be a gradual process as these models get better and better.
And store it somewhere on the machine.
No, it's a great question.
So you point out something very deep, which is that the innovation that we've had in AI has all really been driven by large language models, right?
And we call them large language models because they're trained on corpuses of text.
Text, that's the keyword, text, okay?
It's a large language model.
Now we're trying to apply the massive innovations we've seen in large language models to other modalities, right?
We're trying to apply it now to images, to video, to robotics, to all these other things.
And what you see is that in text is where we have the most runaway capabilities.
And everything else is kind of, you know, it's pretty good, but it's nowhere near as robust and as powerful as in text.
When you are getting an AI agent to interact with a computer, this is known as computer use, and all the different labs are trying to get their models to become better and better at computer use.
The problem with computer use is that if you are trying to get a model to interact with clicking a button on a screen, literally what you're doing is you're taking a picture of a screen, you're tokenizing that, you're turning it into like these patches, and you're trying to give the model some kind of deep representation of these patches when it's been trained on text.
Right.
Right.
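Editor's note: the "picture of a screen turned into patches" step described above can be illustrated with a minimal sketch. It assumes a grayscale image stored as nested lists and non-overlapping square patches; the function name `patchify` and the toy screenshot are hypothetical, and real computer-use models may tokenize screens differently (for example with RGB channels and a learned embedding per patch).

```python
from typing import List

Pixel = int  # grayscale intensity; a real screenshot would carry RGB channels
Image = List[List[Pixel]]


def patchify(image: Image, patch: int) -> List[List[Pixel]]:
    """Split an H x W image into non-overlapping patch x patch tiles,
    each flattened row by row into one vector (one "token")."""
    height, width = len(image), len(image[0])
    if height % patch or width % patch:
        raise ValueError("image dimensions must be multiples of the patch size")
    tokens = []
    for top in range(0, height, patch):
        for left in range(0, width, patch):
            # Collect the pixels of one tile in row-major order.
            tile = [image[r][c]
                    for r in range(top, top + patch)
                    for c in range(left, left + patch)]
            tokens.append(tile)
    return tokens


# A toy 4 x 6 "screenshot" whose pixel values are their row-major index.
screen = [[r * 6 + c for c in range(6)] for r in range(4)]
tokens = patchify(screen, patch=2)
print(len(tokens))  # 6 patches: (4 / 2) * (6 / 2)
print(tokens[0])    # [0, 1, 6, 7]
```

Each flattened tile is just a vector of pixel values, which is why, as the speaker says, the model still needs some learned representation of these patches before a button on the screen means anything to it.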