Ilya Sutskever
I think there'll definitely be diminishing returns because you want people who think differently rather than the same.
I think that if they were literal copies of me, I'm not sure how much more incremental value you'd get.
I think that...
So the reason there has been no diversity, I believe, is because of pre-training.
All the pre-trained models are the same, pretty much, because they're pre-trained on the same data.
Now, RL and post-training is where some differentiation starts to emerge because different people come up with different RL training.
I would say there are two things to say.
I would say that the reason why I thought self-play is interesting is because it offered a way to create models using compute only, without data.
And if you think that data is the ultimate bottleneck, then using compute only is very interesting.
So that's what makes it interesting.
Now, the thing is...
That self-play, at least the way it was done in the past, when you have agents which somehow compete with each other, it's only good for developing a certain set of skills.
It is too narrow.
It's only good for negotiation, conflict, certain social skills, strategizing, that kind of stuff.
And so if you care about those skills, then self-play will be useful.
Now, actually, I think that self-play did find a home, but just in a different form.
So things like debate,