Sergey Levine
And now you put that together with language.
You put that together with all sorts of chain of thought reasoning.
And there's a lot of potential for the model to compose things in new ways.
I mean, to be clear, it's not that there's something good about having less memory.
Like I think that adding memory, adding longer context, all that stuff, adding higher resolution images, I think those things will make the model better.
But the reason why it's not the most important thing for the kind of skills that you saw when you visited us...
At some level, I think it comes back to Moravec's paradox.
So Moravec's paradox is basically, you know, if you want to know one thing about robotics, that's the thing.
Moravec's paradox says that basically in AI, the easy things are hard and the hard things are easy, meaning like the things that we take for granted, like picking up objects, perceiving the world, all that stuff, those are all the hard problems in AI.
And the things that we find challenging, like playing chess and doing calculus, actually are often the easier problems.
And I think this memory stuff is actually Moravec's paradox in disguise, where the cognitively demanding tasks that we do, the ones we find hard, the ones that cause us to think, "Oh, man, I'm sweating, I'm working so hard," those are the ones that require us to keep lots of stuff in memory, lots of stuff in our minds.
Like if you're solving some big math problem, if you're having a complicated technical conversation on a podcast, those are things where you have to keep all those pieces, all those puzzle pieces, in your head.
If you're doing a well-rehearsed task, if you are an Olympic swimmer and you're swimming with perfect form and you're right there in the zone, like people even say, it's in the moment, right?
It's like you've practiced it so much, you've baked it into the neural network in your brain, that you don't have to think carefully about keeping all that context, right?
So it really is just Moravec's paradox manifesting itself, but that doesn't mean that we don't need the memory.
It just means that if we want to match the level of dexterity and physical proficiency that people have, there are other things we should get right first and then gradually go up that stack.