Dario Amodei
Podcast Appearances
I've seen no evidence of that so far, but if things were to slow down, that perhaps could be one reason.
So right now, I think most of the frontier model companies, I would guess, are operating at roughly, you know, $1 billion scale, plus or minus a factor of three, right? Those are the models that exist now or are being trained now.
I think next year we're going to go to a few billion, and then in 2026 we may go to, you know, above 10 billion, and probably by 2027 there are ambitions to build $100 billion clusters. And I think all of that actually will happen. There's a lot of determination to build the compute to do it within this country. And I would guess that it actually does happen.
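[Editor's note: the arithmetic behind these figures can be sketched in a few lines of Python. This is only an illustration. The point values for "a few billion" (taken here as $3 billion) and "above 10 billion" (taken as exactly $10 billion) are assumptions chosen for the example, not numbers stated by the speaker, and the helper names are hypothetical.]

```python
# Illustrative arithmetic only: round numbers read off the remarks above,
# not reported spending figures.

def factor_range(center, factor):
    """Return (low, high) for 'center, plus or minus a factor of `factor`'."""
    return center / factor, center * factor


def step_multipliers(stages):
    """Return (label, multiplier) for each stage relative to the one before it."""
    return [
        (later_label, later_cost / earlier_cost)
        for (_, earlier_cost), (later_label, later_cost) in zip(stages, stages[1:])
    ]


# "$1 billion scale plus or minus a factor of three"
low, high = factor_range(1e9, 3)

# ASSUMPTION: "a few billion" is taken as 3e9 and "above 10 billion" as 1e10.
stages = [("now", 1e9), ("next year", 3e9), ("2026", 1e10), ("2027", 1e11)]
multipliers = step_multipliers(stages)
overall = stages[-1][1] / stages[0][1]

print(f"current range: ${low / 1e9:.2f}B to ${high / 1e9:.2f}B")
for label, m in multipliers:
    print(f"{label}: about {m:.1f}x the previous stage")
print(f"overall: {overall:.0f}x from now to 2027")
```

[Under these assumed values, each stage costs roughly 3x to 10x the previous one, which compounds to about a 100x increase between the models being trained now and the clusters described for 2027.]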
Now, if we get to 100 billion, that's still not enough compute. That's still not enough scale. Then either we need even more scale or we need to develop some way of doing it more efficiently, of shifting the curve.
I think between all of these, one of the reasons I'm bullish about powerful AI happening so fast is just that if you extrapolate the next few points on the curve, we're very quickly getting towards human-level ability, right? Some of the new models that we developed, some reasoning models that have come from other companies,
They're starting to get to what I would call the PhD or professional level, right? If you look at their coding ability, the latest model we released, Sonnet 3.5, the new or updated version, it gets something like 50% on SWE-bench. And SWE-bench is an example of a bunch of professional, real-world software engineering tasks. At the beginning of the year, I think the state of the art was 3% or 4%.
So in 10 months, we've gone from 3% to 50% on this task. And I think in another year, we'll probably be at 90%. I mean, I don't know, but it might even be less than that. We've seen similar things in graduate-level math, physics, and biology from models like OpenAI's o1.