Rob Wiblin
Podcast Appearances
One could be that they don't capture the full range of performance, that you end up either, I guess, floored at the bottom or capped at the top, especially because we're talking about models here over many years doing many, many different things.
There's definitely lots of gaming that goes on or like lots of teaching to the test that occurs with people trying to make their models look good on these benchmarks.
I guess it's also a question of like what actually matters.
There's benchmarks for all kinds of different skills and maybe you should give some of these things much more weight than others.
I guess you could also have non-linearities in the relationship between the performance of a model and its economic effect, which could be, and indeed probably is, quite non-linear.
How good do you think this whole approach that Epoch and you have been using, to figure out whether progress is speeding up or remaining at about the same pace, is at getting at the kind of ground reality of what's going on?
I guess an example might be, you would say, well, the models are being gamed.
There's a bunch of teaching to the test happening now, but there was a bunch of teaching to the test happening two years ago and four years ago.
And so as long as that's not getting progressively worse, then the line is still reasonable.
I guess you're saying increasing rates of progress would be quite striking on the graph.
It probably would jump out at you and these effects wouldn't be enough to make it disappear.
So at the point that AI is people, or at least like is AI researchers rather than just being a tool for AI researchers, you might reasonably expect quite abrupt increases in progress in AI R&D and I guess like AI capabilities basically.
Are you a down vote on how abrupt that will be or whether that will occur at all?
Or do you buy that maybe it will take a bit longer than people are imagining, but you still think that that will happen?
But that it ends up happening over quite a number of years rather than months or something crazy.
Yeah, I think there's been a general phenomenon, I guess, over the years.
I guess every couple of years, there's kind of a freak out about AI timelines, and people start expecting a recursive self-improvement loop really quite soon, within a few years of that point.
My impression is that you've just been unmoved in either direction.
Why haven't you updated based on events that have occurred, results that have come out?
And why doesn't the general performance of reasoning models, I mean, I think that's one way of characterizing the update was people were shocked that RL was being applied to these models into reasoning.