Ahmed El-Kishky
Podcast Appearances
We didn't tell it to do that.
It had never done that before.
What had happened was just over time, as we trained our models, it just decided, hey, this is a valid strategy.
I'm getting really good results when I do this, so let's keep doing it.
And that's the beauty of RL.
That's so awesome.
Yeah.
When you look at the reasoning, I mean, it's doing tricks like this all over the place.
One of the big powerhouses of the ICPC was GPT-5.
So that's already kind of available to the public.
Right.
So the exact model that was used for, like, IMO and ICPC, the experimental reasoning model...
That one probably is not going to be specifically released, but the insights that were used to train that model will be eventually incorporated into later models.
Generally, when we do research, our goal is to bring it into the main models that we host on ChatGPT eventually.
So that's a big process.
We have to make sure our models are safe and aligned, and that they're pleasant models to use in more things than just, like, competitive programming or math.
But our hope is to take the insights that we use to train those models and incorporate it into the next models.