Dylan Patel
So these environments can be very, very complicated.
So building those out is a challenge, right?
It was one thing to say, I'm taking all the internet data.
I'm going to filter it some.
I'm going to throw it to the model.
There's tons of engineering challenges there, for sure.
There's a different set of engineering challenges that take time to build out.
This is where the whole, like, "Oh, well then, Dylan, what you're saying is you never need to make models bigger again, right?"
Because you've already run out of data and until you figure out how to generate tons and tons of data, that's great.
But actually we haven't, you know, we've seen another angle where it's mostly just been pre-training scaling, right?
Is Veo 3 and Nano Banana, right?
These Google image and video models, and Genie, like all these Google image and video models.
And that's purely like scaling on multimodality, right?
The models still aren't that great at video and audio and images.
They're fine.
They could be a lot better.
So there's like angles of scaling there, right?
Because when I said we've run out of the internet, we've run out of the text.
There's tons of video and image and audio.
It's just so expensive.