Demis Hassabis
Podcast Appearances
But as for going forwards, I think there are still a lot of interesting things to be resolved around planning and how the brain constructs the right world models.
I studied, for example, how the brain does imagination, or you can think of it as mental simulation.
So how do we create very rich visual spatial simulations of the world in order for us to plan better?
I think that's a super promising direction.
So we've got to carry on improving the large models and making them more and more accurate predictors of the world.
In effect, that means making them more and more reliable world models, which is clearly a necessary, but I would say probably not sufficient, component of an AGI system.
And then on top of that, we're working on things like AlphaZero-like planning mechanisms that make use of that model in order to make concrete plans to achieve certain goals in the world, perhaps chain thoughts or lines of reasoning together, and maybe use search to explore massive spaces of possibility.
I think that's kind of missing from our current large models.
Well, one thing is that Moore's law tends to help: every year, of course, more computation comes in.
But we focus a lot on sample-efficient methods and reusing existing data, things like experience replay, and also just looking at more efficient ways.
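[Editor's illustration] Experience replay, mentioned above, means storing past interactions and sampling them repeatedly for training instead of using each one once. The following is a minimal sketch under stated assumptions, not a description of any DeepMind system: the `ReplayBuffer` class, its method names, and the toy transitions are all hypothetical.

```python
import random
from collections import deque


class ReplayBuffer:
    """Fixed-size store of past transitions that can be sampled repeatedly.

    Hypothetical illustration only; names and layout are assumptions.
    """

    def __init__(self, capacity, seed=None):
        # A deque with maxlen silently drops the oldest transition when full.
        self.buffer = deque(maxlen=capacity)
        self.rng = random.Random(seed)

    def add(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        # No duplicates within one batch, but the same transition can be
        # drawn again in later batches, which is where the data reuse comes from.
        return self.rng.sample(list(self.buffer), batch_size)

    def __len__(self):
        return len(self.buffer)


# Toy usage: integers stand in for real environment states.
buffer = ReplayBuffer(capacity=1000, seed=0)
for step in range(50):
    buffer.add(state=step, action=step % 2, reward=1.0,
               next_state=step + 1, done=False)

batch = buffer.sample(batch_size=8)
```

Each stored transition can feed many training updates, which is one common way to get more learning out of the same amount of collected data.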
I mean, the better your world model is, the more efficient your search can be.
So one example I always give with AlphaZero, our system to play Go and chess and any game, is that it's stronger than human world champion level at all these games.
And it uses a lot less search than a brute-force method like Deep Blue, say, to play chess.
One of these traditional systems, Stockfish or Deep Blue, would maybe look at millions of possible moves for every decision it's going to make.
AlphaZero and AlphaGo looked at around tens of thousands of possible positions in order to make a decision about what to move next.
But a human grandmaster, a human world champion, probably only looks at a few hundred moves, even the top ones, in order to make their very good decision about what to play next.
So that suggests that... obviously the brute-force systems don't have any real model other than the heuristics about the game.
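[Editor's illustration] The point that better guidance lets a search examine far fewer positions can be seen in a toy setting. The sketch below is not AlphaZero, Deep Blue, or any chess engine: it compares breadth-first search with A* search on an empty grid, where a Manhattan-distance estimate stands in, very loosely, for a learned model of which states are promising. The grid, the function names, and the estimate are all assumptions made for illustration.

```python
import heapq
from collections import deque


def neighbors(cell, size):
    """Yield the up/down/left/right neighbors that stay inside the grid."""
    x, y = cell
    for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        nx, ny = x + dx, y + dy
        if 0 <= nx < size and 0 <= ny < size:
            yield (nx, ny)


def uninformed_search(start, goal, size):
    """Breadth-first search: no estimate of which cells are promising."""
    frontier = deque([(start, 0)])
    seen = {start}
    expanded = 0
    while frontier:
        cell, cost = frontier.popleft()
        expanded += 1
        if cell == goal:
            return cost, expanded
        for nxt in neighbors(cell, size):
            if nxt not in seen:
                seen.add(nxt)
                frontier.append((nxt, cost + 1))
    return None, expanded


def guided_search(start, goal, size):
    """A* search: a distance estimate decides which cell to expand next."""
    def estimate(cell):
        # Manhattan distance to the goal (assumed stand-in for a "model").
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    h0 = estimate(start)
    # Entries are (cost + estimate, estimate, cost, cell); ties on the first
    # field are broken in favor of cells estimated to be closer to the goal.
    frontier = [(h0, h0, 0, start)]
    best_cost = {start: 0}
    expanded = 0
    while frontier:
        _, _, cost, cell = heapq.heappop(frontier)
        if cost > best_cost[cell]:
            continue  # stale entry; a cheaper route to this cell was found
        expanded += 1
        if cell == goal:
            return cost, expanded
        for nxt in neighbors(cell, size):
            new_cost = cost + 1
            if new_cost < best_cost.get(nxt, float("inf")):
                best_cost[nxt] = new_cost
                h = estimate(nxt)
                heapq.heappush(frontier, (new_cost + h, h, new_cost, nxt))
    return None, expanded


size = 20
start, goal = (0, 0), (size - 1, size - 1)
bfs_cost, bfs_expanded = uninformed_search(start, goal, size)
astar_cost, astar_expanded = guided_search(start, goal, size)
print("breadth-first:", bfs_cost, "steps,", bfs_expanded, "cells expanded")
print("guided (A*):  ", astar_cost, "steps,", astar_expanded, "cells expanded")
```

Both searches find a path of the same length, but the guided one expands far fewer cells. That is the same qualitative effect described in the transcript, on a vastly simpler problem.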