Demis Hassabis
Podcast Appearances
So actually now I think these general models are actually going to transfer to the embodied robotic setting without too much extra sort of special casing or extra data or extra effort, which is probably not what most people, even the top roboticists would have predicted five years ago.
Well, look, of course, we sort of pioneered all that area of thinking systems because that's what our original gaming systems all did, right? Go, AlphaGo, but actually most famously AlphaZero, which was our follow-up system that could play any two-player game. And there, you always have to think about your time budget, your compute budget you've got to actually do the planning part, right?
So the model you can pre-train, just like we do with our foundation models today. So you can play millions of games offline, and then you have your model of chess or your model of Go or whatever it is. But at test time, at runtime, you've only got one minute to think about your move, right? One minute times how many computers you've got running. So that's still a limited compute budget.
So what's very interesting today is there's this trade-off between do you use a more expensive, larger base model, foundation model, right? So in our case, we have different size names like Gemini Flash or Pro or even bigger, which is Ultra. But those models are more costly to run. So they take longer to run. But they're more accurate and they're more capable.
So you can run a bigger model with a shorter number of planning steps, or you can run a very efficient, smaller model that's slightly less powerful, but you can run it for many more steps. And it's actually, currently what we're finding is it's sort of roughly about equal. But of course, what we want to find is the Pareto frontier of that, right?
Like actually the exact right trade-off of the size of the model and the expense of running that model versus the amount of thinking time and thinking steps that you're able to do per unit of compute time. And I think that's actually fairly cutting-edge research right now that I think all the leading labs are probably experimenting on. And I think there's not a clear answer to that yet.
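[Editor's sketch] One way to picture the trade-off described here is to treat each pairing of model size and number of thinking steps as a point with a compute cost and a measured quality, then keep only the points that no cheaper option matches or beats. This is a minimal illustration under stated assumptions, not a description of how any lab actually does it: the `Config` type, the `pareto_frontier` function, the `budget` parameter, and every number below are hypothetical and chosen only to show the mechanics.

```python
from typing import NamedTuple, Optional


class Config(NamedTuple):
    """One hypothetical (model size, thinking steps) setting."""
    name: str
    cost: float     # compute spent per move or query, arbitrary units
    quality: float  # measured quality, higher is better


def pareto_frontier(configs, budget: Optional[float] = None):
    """Return the settings that are not dominated within the budget.

    A setting is dominated if another setting costs no more and
    is at least as good. Results are ordered from cheapest to best.
    """
    # Drop anything that exceeds the runtime compute budget.
    candidates = [c for c in configs if budget is None or c.cost <= budget]
    # Cheapest first; among equal costs, best quality first.
    ordered = sorted(candidates, key=lambda c: (c.cost, -c.quality))

    frontier = []
    best_quality = float("-inf")
    for c in ordered:
        # Keep a setting only if it beats everything cheaper than it.
        if c.quality > best_quality:
            frontier.append(c)
            best_quality = c.quality
    return frontier


# Made-up numbers, for illustration only.
example_configs = [
    Config("small model, 16 steps", 2.0, 0.60),
    Config("large model, 2 steps", 4.0, 0.58),
    Config("small model, 128 steps", 16.0, 0.78),
    Config("large model, 32 steps", 64.0, 0.90),
]

for c in pareto_frontier(example_configs):
    print(c.name, c.cost, c.quality)
```

With these toy values, "large model, 2 steps" drops out because a cheaper setting already does better, and passing a `budget` removes any setting too expensive to run in the time available. In practice the quality numbers would have to come from real evaluations.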
I think we are entering a new era in coding, which is gonna be very interesting. And as you say, all the leading labs are pushing on this frontier for many reasons. It's easy to create synthetic data. So that's another reason that everyone's pushing on this vector.
And I think we're going to move into a world where, you know, sometimes it's called vibe coding, where you're basically coding with natural language, really, right? And we've seen this before with computers, right? I remember when I first started programming, you know, in the 80s, we were doing assembler. And then of course, that seems crazy now, like why would you do machine code?