Lex Fridman Podcast
#475 - Demis Hassabis: Future of AI, Simulating Reality, Physics and Video Games
Yes, exactly.
A few hero runs later.
You need them, but it's important that you don't overfit to them, right?
So they shouldn't be the be-all and end-all.
So there's LM Arena, or it used to be called LMSYS, that turned out sort of organically to be one of the main ways people like to test these systems, at least the chatbots.
Obviously, there's loads of academic benchmarks that test mathematics and coding ability, general language ability, science ability, and so on.
And then we have our own internal benchmarks that we care about.
It's a kind of multi-objective optimization problem, right?
You don't want to be good at just one thing.
We're trying to build general systems that are good across the board.
And you try and make no-regret improvements.
So where you improve in like, you know, coding, but it doesn't reduce your performance in other areas, right?
So that's the hard part because, of course, you could put more coding data in or you could put more, I don't know, gaming data in, but then does it make your language system worse?
Or your translation systems, and other things that you care about?
So you've got to kind of continually monitor this increasingly larger and larger suite of benchmarks.
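The "no-regret improvement" idea described above can be sketched as a simple check over a suite of benchmark scores. This is only an illustrative sketch, not a description of any lab's actual tooling: the function name `is_no_regret`, the `tolerance` parameter, the benchmark names, and every score below are hypothetical, and higher scores are assumed to be better.

```python
def is_no_regret(baseline, candidate, tolerance=0.0):
    """Check a candidate model against a baseline across a benchmark suite.

    Returns (ok, regressions). `ok` is True only when the candidate beats the
    baseline on at least one benchmark and drops by more than `tolerance`
    on none of them. `regressions` maps each regressed benchmark to its
    (baseline_score, candidate_score) pair.
    """
    regressions = {
        name: (baseline[name], candidate[name])
        for name in baseline
        if candidate[name] < baseline[name] - tolerance
    }
    improved = any(candidate[name] > baseline[name] for name in baseline)
    return improved and not regressions, regressions


# Hypothetical scores, invented purely for illustration.
baseline = {"coding": 0.62, "language": 0.80, "translation": 0.75}
more_coding_data = {"coding": 0.70, "language": 0.78, "translation": 0.75}
balanced_mix = {"coding": 0.66, "language": 0.80, "translation": 0.76}

print(is_no_regret(baseline, more_coding_data))  # coding improves, language regresses
print(is_no_regret(baseline, balanced_mix))      # improves with no regressions
```

In this toy setup, adding more coding data lifts the coding score but is rejected because the language score drops, which is the trade-off being described; a larger suite just means more keys in the dictionaries.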
And also when you stick them into products, these models, you also care about the direct usage and the direct stats and the signals that you're getting from the end users, whether they're coders or the average person using the chat interfaces.
Yeah.
And then other things that are even more esoteric come into play like, you know, the style of the persona of the system.
You know, is it verbose?
Is it succinct?