Amit Jain
Podcast Appearances
then video, audio, and language together is basically a chance to build a universal simulator.
So one application of that, as you point out, is obviously for entertainment and to generate video, to automate or to make digital the act of creating video.
But in addition to that, it is the gateway to be able to build actually general purpose robotics.
That is the second area that we are very deeply focused on, and we are now starting to build out a team and a research area for it.
And if we want to have any shot at building general purpose robots, then the only way that happens is by giving them this level of general understanding of the universe so that they can reason in their head.
They can actually simulate every scenario in their head.
Like, okay, what would happen if I do this?
What would happen if I do something else?
So this level of learning and reasoning is essential.
So video, as it advances and as we scale these models, they will not only get better and better, they will get more accurate in simulating physics.
And this is the path to building physical intelligence.
And that's why Luma's mission is to build multimodal AGI that can generate, understand, and operate in the physical world.
And that is the end goal.
So there are two really important things for Luma.
Over the last couple of years, we worked tirelessly to solve the research problem for like, you know, really, really great video.
Now, the next frontier for us is these omni models that are able to reason in audio, video, language, text together.
So the research is the first one that we pay an immense amount of attention to.
Now, to solve the research, of course, people, and really, really brilliant people, are one of those bottlenecks.