Sergey Levine
we can actually use LLMs and VLMs, ask them questions, and they will make reasonable guesses.
Like, they will not give you expert behavior, but you can say, like, hey, there's a sign that says slippery floor.
Like, what's going to happen when I walk over that?
Kind of pretty obvious, right?
And no autonomous car in 2009 would have been able to answer that question.
So common sense plus the ability to make mistakes and correct those mistakes, that's sounding an awful lot like what a person does when they're trying to learn something.
All of that doesn't make robotic manipulation easy necessarily, but it allows us to get started with a smaller scope and then grow from there.
Yeah, that's a really good question.
So I'll start out with maybe a slight modification to your comment: I think they've made a lot of progress.
And in some ways, a lot of the work that we're doing now at Physical Intelligence is built on the backs of lots of other great work that was done, for example, at Google.
Like many of us were actually at Google before.
We were involved in some of that work.
Some of it is work that we're drawing on that others did.
So there's definitely been a lot of progress there.
But to make robotic foundation models really work, it's not just a laboratory science kind of experiment.
It also requires a kind of industrial-scale building effort.
It's more like the Apollo program than it is like a science experiment.
The excellent research that was done in the past in industrial research labs, and I know I was involved in much of that, was very much framed as a fundamental research effort.
And that's good.