Andrej Karpathy
How can we get an actual full agent, a full entity, that can actually interact in the world?
And I would say the Atari deep reinforcement learning shift around 2013 was part of that early effort toward agents, in my mind, because it was an attempt to get agents that not just perceive the world, but also take actions, interact, and get rewards from environments.
And at the time this was Atari games, right?
And I kind of feel like that was a misstep, actually.
And it was a misstep that even the early OpenAI, which I was a part of, of course, kind of adopted.
Because at that time, the zeitgeist was reinforcement learning environments: games, game playing, beating games, lots of different types of games.
And OpenAI was doing a lot of that.
So that was maybe like another prominent part of, I would say, AI where maybe for two or three or four years, everyone was doing reinforcement learning on games.
And basically, that was a little bit of a misstep.
And what I was trying to do at OpenAI is, I was always a little bit suspicious of games as the thing that would actually lead to AGI, because in my mind, you want something like an accountant, something that's actually interacting with the real world.
And I just didn't see how games would add up to that.
And so my project at OpenAI, for example, was within the scope of the Universe project on an agent that was using keyboard and mouse to operate web pages.
And I really wanted to have something that like interacts with, you know, the actual digital world that can do knowledge work.
And it just so turns out that this was extremely early, way too early.
So early that we shouldn't have been working on it, because if you're just stumbling your way around, keyboard mashing and mouse clicking, trying to get rewards in these environments, your reward is too sparse. You just won't learn, you're going to burn a forest's worth of compute, and you're never actually going to get anything off the ground.
And so what you're missing is this power of representation in the neural network.
And so, for example, today, people are training those computer-using agents, but they're doing it on top of a large language model.
And so you actually have to get the language model first.
You have to get the representations first.
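The sparse-reward problem described above can be sketched with a toy calculation (this example is not from the transcript; the task length and action count are hypothetical numbers chosen for illustration). If a web task requires T correct actions in a row, with A possible actions at each step and reward only at the very end, blind keyboard-and-mouse exploration almost never sees any reward at all:

```python
import random

def random_rollout(target, num_actions):
    """One episode of random keyboard/mouse mashing.

    Reward arrives only if the agent happens to execute the entire
    correct action sequence — otherwise the episode yields nothing.
    """
    actions = [random.randrange(num_actions) for _ in target]
    return 1.0 if actions == target else 0.0

def success_probability(seq_len, num_actions):
    # Chance that a single episode of uniform random exploration
    # stumbles onto the one rewarded sequence: (1/A)^T.
    return (1.0 / num_actions) ** seq_len

# Even a short 10-step task with 20 candidate actions per step:
p = success_probability(10, 20)
# p is about 1e-13 — essentially zero learning signal per episode,
# which is why a pretrained representation (today, a language model)
# is needed before reinforcement learning can get off the ground.
```

This is only a back-of-the-envelope sketch, but it conveys the scaling: the expected number of episodes before the first reward grows exponentially with task length, so without good priors the compute bill explodes.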