Ahmed El-Kishky
Podcast Appearances
Maybe this is like a way to sort of see how good our models are.
And there's some like hint of truth to that.
When I joined, GPT-4 had just been released.
It was kind of abysmal at these competitions.
It was, I think, like 300, 400 Elo, if you know chess terminology, which places it solidly sub-novice, like somebody that, you know, just started.
So it wasn't really good at all.
And it was kind of...
In fact, quick funny story: when we actually tried these programming competition problems, they were so difficult and the model was so bad at them that it would actually kind of crash the computer.
It would write solutions that were so bad that the little virtual computer, the sandbox we'd give it, would just crash.
It would run out of memory, it would just hang, and we'd have to, you know, delete it.
It was that bad.
But a lot of people at OpenAI really come from the competitive programming community.
From high school, they would do Olympiads, the International Olympiad in Informatics.
In college, they would do ICPC.
They would participate in these competitions like Google Code Jam.
And so there was this deep-seated community of competitive programmers.
And I think fundamentally they had a belief that competitive coding was a really good, I don't want to say benchmark, but a good way to sort of measure how well you were improving at reasoning.
So when we saw that it was really bad at them, many people were just like, hey, wouldn't it be great if our models could compete at the same level as like some of the brightest people in the world, some of the brightest students, competitive programmers?
That would be a good measure of like progress towards reasoning.
So, yeah, like we have Jakub Pachocki, who is our chief research scientist.