Demis Hassabis
Podcast Appearances
Let's use voice.
Maybe even eventually things like touch.
And if you think about robotics and other things, sensors, other types of sensors.
So I think the world's about to become very exciting in the next few years, as we start getting used to the idea of what true multimodality means.
Yeah, well, we're very excited about our progress with things like Gato and RT-2, you know, Robotic Transformer.
And we've always liked robotics. We've had amazing research there, and we still have that going now, because we like the fact that it's a data-poor regime. That pushes us forward on very interesting research directions that we think are going to be useful anyway, like sample efficiency and data efficiency in general, and transfer learning: learning from simulation and transferring that to reality, sim-to-real. All of these are very interesting, general challenges that we would like to solve.
So the control problem.
So we've always pushed hard on that.
And actually, I think, so Ilya's right, that is more challenging because of the data problem.
But I also think we're starting to see the beginnings of these large models being transferable to the robotics regime: learning in the general domain, the language domain and other things, and then, as in Gato, just treating any type of token the same way.
The token could be an action, it could be a word, it could be a part of an image, a pixel or whatever it is.
And that's what I think true multimodality is.
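To make the "any type of token" idea concrete, here is a minimal sketch in Python. It is not Gato's actual tokenizer: the vocabulary sizes, offsets, and function names below are all hypothetical, chosen only to show how words, image patches, and actions could share a single ID space so that one sequence model sees them as the same kind of thing.

```python
# Hypothetical vocabulary sizes, for illustration only (not Gato's real values).
TEXT_VOCAB = 1000    # number of word/subword IDs
ACTION_BINS = 256    # number of bins used to discretize a continuous action
PATCH_CODES = 512    # number of discrete codes for image patches

# Each modality gets its own disjoint range inside one shared ID space.
TEXT_OFFSET = 0
ACTION_OFFSET = TEXT_OFFSET + TEXT_VOCAB
PATCH_OFFSET = ACTION_OFFSET + ACTION_BINS
TOTAL_VOCAB = PATCH_OFFSET + PATCH_CODES


def text_token(word_id: int) -> int:
    """Map a word/subword ID into the shared ID space."""
    if not 0 <= word_id < TEXT_VOCAB:
        raise ValueError("word_id out of range")
    return TEXT_OFFSET + word_id


def action_token(value: float, low: float = -1.0, high: float = 1.0) -> int:
    """Discretize a continuous action in [low, high] into one of ACTION_BINS bins."""
    clipped = min(max(value, low), high)
    frac = (clipped - low) / (high - low)
    bin_index = min(int(frac * ACTION_BINS), ACTION_BINS - 1)
    return ACTION_OFFSET + bin_index


def patch_token(code: int) -> int:
    """Map a discrete image-patch code into the shared ID space."""
    if not 0 <= code < PATCH_CODES:
        raise ValueError("patch code out of range")
    return PATCH_OFFSET + code


def modality_of(token_id: int) -> str:
    """Recover which modality a shared-space token ID came from."""
    if not 0 <= token_id < TOTAL_VOCAB:
        raise ValueError("token_id out of range")
    if token_id < ACTION_OFFSET:
        return "text"
    if token_id < PATCH_OFFSET:
        return "action"
    return "image"


# One interleaved sequence: a word, an image patch, then two actions.
sequence = [text_token(5), patch_token(7), action_token(-1.0), action_token(1.0)]
```

The point of the sketch is only that, once everything is an integer in one vocabulary, a single model can be trained on the interleaved sequence without caring which modality a given token came from.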
And to begin with, it's harder to train a system like that than a straightforward text language system.
But actually, going back to our earlier conversation about transfer learning, you start seeing that with a true multimodal system, the different modalities benefit one another.
So you get better at language because you now understand a little bit about video.
So I do think it's harder to get going, but actually ultimately we'll have a more general, more capable system like that.
Yeah, we're still working on those kinds of systems, but you can imagine those are ideas we're trying to build into our future generations of Gemini, you know, to be able to do all of those things.
Yeah, so look, I think that we're making great progress with math and things like theorem proving and coding. But it's still interesting if one looks at creativity in general and scientific endeavor in general. I think we're getting to the stage where our systems could help the best human scientists make their breakthroughs quicker, almost triaging the search space in some ways, or perhaps finding a solution, like AlphaFold does with a protein structure.
But they're not at the level where they can create the hypothesis themselves or ask the right question. And as any top scientist will tell you, the hardest part of science is actually asking the right question: boiling down that space to the critical question we should go after, the critical problem, and then formulating that problem in the right way to attack it.