Kevin Hartnett
Podcast Appearances
Great to be here.
I won't ask the follow-up question.
Yeah, I mean, virtually everything.
So like the IMO had been a benchmark for a long time.
In my book, there's a whole chapter about the previous year's IMO, the 2024 IMO, where Google DeepMind got a silver medal score.
And that was kind of considered a small watershed.
And then, as you said, last year, three labs got this gold medal level score, which had really been kind of the benchmark that had been set out.
At that point, AI was still just doing essentially high school math, the hardest high school math in the world, but still just high school math.
And I think for people who never went beyond high school math, it's hard to like really appreciate how far that is from the frontier of research math, like forever far.
That's like barely even wading into the field.
It's like 0% of the way to the frontier.
So it was a proof of concept maybe, but it certainly didn't mean much in terms of can these models actually do research.
Well, it's definitely both.
I think the challenge, this IMO Grand Challenge, which was the name that a researcher at Microsoft Research gave to it, that was really about, like, can we just create models that can do amazing math research?
It was just really kind of research for research's sake.
At a certain point, the labs and these startups you mentioned adopted that challenge themselves, and their motivations were a little different.
And there's very much this belief that if you can teach a model to reason about math problems and solve math problems, it will be much better at other things.
And I always think about this statement my high school math teacher would give when people would ask, like, why are we learning this?
What's the point of this?