Ilya Sutskever
I think that is something that happens and I think it could explain a lot of what's going on.
If you combine this with generalization of the models actually being inadequate,
that has the potential to explain a lot of what we are seeing, this disconnect between eval performance and actual real-world performance, which is something where today we don't even exactly understand what we mean by that.
So I have a human analogy which might be helpful.
So let's take the case of competitive programming, since you mentioned that.
And suppose you have two students.
One of them decided they want to be the best competitive programmer, so they will practice 10,000 hours for that domain.
They will solve all the problems, memorize all the proof techniques, and be very skilled at quickly and correctly implementing all the algorithms, and by doing so, they became one of the best.
Student number two thought, oh, competitive programming is cool.
Maybe they practiced for 100 hours, much, much less, and they also did really well.
Which one do you think is going to do better in their career later on?
The second.
Right?
And I think that's basically what's going on.
The models are much more like the first student, but even more so, because then we say, okay, so the model should be good at competitive programming.
So let's get every single competitive programming problem ever.
And then let's do some data augmentation so we have even more competitive programming problems.
And we train on that.