Dwarkesh Patel
Podcast Appearances
Conceptually, what are we missing in terms of thinking about AI from the RL perspective?
Huh.
I guess you would think that to emulate the trillions of tokens in the corpus of internet text, you would have to build a world model.
In fact, these models do seem to have very robust world models, and they're the best world models we've made to date in AI, right?
So what do you think is missing?
Great.
Yeah.
Right.
I guess maybe the crux, and I'm curious if you disagree with this, is that some people will say, okay, this imitation learning has given us a good prior, given these models a good prior, of reasonable ways to approach problems.
And as we move towards the era of experience, as you call it, this prior is going to be the basis on which we teach these models from experience because this gives them the opportunity to get answers right some of the time.
And then on top of this, you can train them on experience.
Do you agree with that perspective?
I mean, I think they do.
You can literally ask them, what would you anticipate a user might say in response?
And they have a prediction.
Yeah.
Yeah.
So I think a capability like this does exist in context.
So it's interesting to watch a model do chain of thought, and then suppose it's trying to solve a math problem.