Rob Wiblin
But then, in the second half of 2025, a strange thing happened.
The reasoning models like o1 and o3 seemed to have a big impact inside the AI companies as well as in the public outside of them.
Sam Altman declared in January last year, "We are now confident we know how to build AGI."
And Demis over at DeepMind, who's normally more circumspect, he said he thought AGI was probably three to five years away.
And as is often the case, Dario from Anthropic had the most colorful turn of phrase.
He announced that we were quite likely to get a "country of geniuses in a datacenter" in the next two to three years.
We also saw huge levels of popular coverage of the AGI scenario known as AI 2027, in which AI research and development is fully automated in 2027.
And that then leads in the story to a powerful recursive self-improvement loop and a so-called intelligence explosion.
I think the massive coverage and engagement with that story definitely shifted the vibes as well.
Even at 80,000 Hours, where I work, we made a video about the AI 2027 story that got a stupid number of views.
I'm not going to say exactly how many because I'm sure it will have gone up a whole bunch by the time we post this.
To an extent, all of this hype ran a touch ahead of what people actually believed.
When they put out the AI 2027 scenario, the writers, who are absolutely as bullish about AI as you can reasonably get, still thought that we wouldn't get a superhuman coder in reality until about a year and a half after the point at which it arrives in their story.
But people really did get a lot more excited and then changed their minds back again.
So let's take a tour of some of the technical factors that drove that.
I think it's no mystery why lots of people, absolutely including me, got super excited about reasoning models when they arrived.
When they landed, it suddenly felt like they could do so many things that the previous generation of AI failed horribly at.
But what made the shine wear off as time went on?
Well, the hope among people inside the industry and among AI enthusiasts had been that reinforcement learning on easily checkable domains, things like mathematics and coding, would generalize to other, messier domains where it's a lot harder to check whether someone has actually done the right thing or gotten the right answer.
People were kind of primed to expect that this might work, because fine-tuning models to follow instructions and be helpful to users really had generalized shockingly well across almost all the different kinds of things that users tend to ask AI models for.