Rob Wiblin
That may sound a little odd, but they're probably right about that.
The thing is that other things become the primary bottleneck.
To know whether automated AI R&D is on the way or beginning to kick off, we're apparently now relying on these general impressions from Anthropic staff that this thing is powerful, but it doesn't yet seem good enough to replace many of us.
But I think we can apply some common sense to the big picture here.
Mythos has given us AI advances that we previously thought would take six months in just three months.
That naturally brings forward the point at which we're going to be able to automate the development of AI models by three months.
And if it's a sign that AI advances are now going to continue at twice the pace that they were before, then that effectively halves the time that we have to prepare for that point.
I don't know whether that means 10 years becomes five years or four years becomes two years, but the direction of the effect and the size of the effect are clear enough.
Before we wrap up, I want to draw your attention to a recurring theme in these reports that really stood out to me.
This is the first time that an AI company has published 300 pages about a model that it's decided not to release, despite the fact that it might earn them tens of billions of dollars if they did, maybe hundreds of billions of dollars.
It's also the first time that Anthropic decided to delay giving its own staff access to one of its models.
With every previous Claude, their practice has just been to let staff use it as soon as it's judged ready during training.
But with Mythos, they were worried enough about it being misaligned and causing havoc or sabotage on their own systems that they held it back and ran a 24-hour alignment test before letting any employees use it.
But according to them, that wasn't enough.
The retrospective on that found that the 24-hour window did not pressure test the model sufficiently and that the most concerning behaviors only became evident later through much more extended use.
One of their lead researchers, Sam Bowman, he commented this week that working with this model has been a wild ride.
We've come a long way on safety, but we still expect the next capability jump of this scale to be a huge challenge.
The system card says directly that their current methods could easily be inadequate to prevent catastrophic misaligned actions in significantly more advanced systems.