Mark Zuckerberg
more along the lines of inference generating synthetic data to then go feed into the model.
So I don't know what that ratio is going to be, but I consider the generation of synthetic data to be more inference than training today.
But obviously, if you're doing it in order to train a model, it's part of the broader training process.
So I don't know, that's an open question as to what the balance of that is and how that plays out.
I do think that there are going to be dynamics like that.
But I also think that there is a fundamental limitation on kind of the network architecture, right?
Or the kind of model architecture, right?
So I think, like, a 70 billion model that we trained with the Llama 3 architecture can get better, right?
It can keep going.
Like I was saying, we felt like if we kept on feeding it more data or rotated the high-value tokens through again, then it would continue getting better.
And we've seen a bunch of other people around the world, different companies, basically take the Llama 2 70 billion base, take that model architecture, and then build a new model.
But it's still the case that when you make a generational improvement to the Llama 3 70 billion or the Llama 3 405 billion, there's nothing open source anything like that today.
I think that it's a big step function, and what people are going to be able to build on top of that, I don't think can go infinitely from there.
I think there can be some optimization in that until you get to the next step function.
I think it's gonna be pretty fundamental.
I think it's gonna be more like the creation of computing in the first place, right?
So you'll get all these new apps in the same way that when you got the web or you got mobile phones, people basically rethought all these experiences, and a lot of things that weren't possible before now became possible. That will happen, but I think it's a much lower-level innovation. It's going to be more like going from people didn't have computers to people have computers, is my sense.