Andrej Karpathy
Yeah.
Well, it's the universal interface in the digital realm, I would say.
And there's a universal interface in the physical realm, which in my mind is a humanoid form factor kind of thing.
We can later talk about Optimus and so on.
But I feel like there's a kind of similar philosophy in some way, where the physical world is designed for the human form, and the digital world is designed for the human form of seeing the screen and using a keyboard and mouse.
And so it's the universal interface that can basically command the digital infrastructure we've built up for ourselves.
And so it feels like a very powerful interface to command and to build on top of.
Now, to your question as to what I learned from that, it's interesting because the world of bits was basically too early, I think, at OpenAI at the time.
This is around 2015 or so, and the zeitgeist at that time was very different in AI from the zeitgeist today.
At the time, everyone was super excited about reinforcement learning from scratch.
This is the time of the Atari paper, where neural networks were playing Atari games and beating humans in some cases, AlphaGo and so on.
So everyone was very excited about training neural networks from scratch using reinforcement learning directly.
It turns out that reinforcement learning is an extremely inefficient way of training neural networks, because you're taking all these actions and all these observations, and you get some sparse rewards once in a while.
So you do all this stuff based on all these inputs, and once in a while you're told, you did a good thing, you did a bad thing.
And it's just an extremely hard problem.
You can't learn from that.
You can burn a forest, and you can sort of brute force through it.
And we saw that, I think, with Go and Dota and so on.
And it does work.
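The sparse-reward inefficiency described above can be made concrete with a toy sketch (my illustration, not from the conversation): if an agent must take T correct actions in a row to earn a single reward of 1 at the end of an episode, then under a random policy the chance of ever seeing any reward is (1/num_actions)^T, so almost every episode carries zero learning signal.

```python
import random

def run_episode(T, num_actions, target, rng):
    """One episode: T actions, a single sparse reward at the end,
    and only if every action matched the target sequence."""
    actions = [rng.randrange(num_actions) for _ in range(T)]
    return 1.0 if actions == target else 0.0

def reward_rate(T=5, num_actions=4, episodes=100_000, seed=0):
    """Fraction of random-policy episodes that receive any reward."""
    rng = random.Random(seed)
    # Hypothetical fixed target sequence the agent is supposed to hit.
    target = [rng.randrange(num_actions) for _ in range(T)]
    hits = sum(run_episode(T, num_actions, target, rng)
               for _ in range(episodes))
    return hits / episodes

if __name__ == "__main__":
    rate = reward_rate()
    print(f"fraction of episodes with any reward: {rate:.5f}")
    print(f"theoretical chance: {(1 / 4) ** 5:.5f}")
```

With enough episodes (the "burn a forest" regime) the rare rewarded trajectories do eventually show up and can be reinforced, which is roughly why brute-force RL worked for Go and Dota despite the inefficiency.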