Dwarkesh Patel
π€ SpeakerVoice Profile Active
This person's voice can be automatically recognized across podcast episodes using AI voice matching.
Appearances Over Time
Podcast Appearances
This imitation learning has given us a good prior, given these models a good prior, but reasonable ways to approach problems.
And as we move towards the era of experience, as you call it, this prior is going to be the basis on which we teach these models from experience because this gives them the opportunity to get answers right some of the time.
And then on this, you can build, you can train them on experience.
Do you agree with that perspective?
I mean, I think they do.
You can literally ask them, what would you anticipate a user might say in response?
And they have a prediction.
Yeah.
Yeah.
So I think a capability like this does exist in context.
So it's interesting to watch a model do chain of thought, and then suppose it's trying to solve a math problem.
It'll say, okay, I'm going to approach this problem using this approach at first, and it'll write this out and be like, oh, wait, I just realized this is the wrong conceptual way to approach the problem.
I'm going to restart by this another approach.
And that flexibility is
does exist in context, right?
Do you have something else in mind, or do you just think that you need to extend this capability across longer horizons?
Isn't that literally what next token prediction is?
Prediction of what was next and then updating on the surprise?
Next token is what they should say, what the action should be.
Oh, yeah.