Dario Amodei

#452 – Dario Amodei: Anthropic CEO on Claude, AGI & the Future of AI & Humanity

I think we even had a "certainly" eval because, you know, our, our model, again, at one point the model had this problem where like it had this annoying tic where it would like respond to a wide range of questions by saying, "Certainly, I can help you with that." "Certainly, I would be happy to do that." "Certainly, this is correct."

Lex Fridman Podcast

Um, uh, and so we had a, like, certainly eval, which is like, how, how often does the model say certainly? Yeah. Uh, uh, but, but look, this is just a whack-a-mole. Like, like what if it switches from certainly to definitely like, uh, uh, so, you know, every time we add a new eval and we're, we're always evaluating for all the old things.
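The "certainly" eval described above can be sketched in a few lines. This is only an illustration under stated assumptions, not Anthropic's actual evaluation: the function name `opener_rate`, the sample responses, and the choice to count only responses that open with the word are all hypothetical.

```python
import re

def opener_rate(responses, word="certainly"):
    """Fraction of responses whose first word is `word` (case-insensitive).

    Hypothetical helper: the transcript only says the eval measures
    "how often does the model say certainly".
    """
    if not responses:
        return 0.0
    # Allow leading punctuation or whitespace before the opener.
    pattern = re.compile(rf"^\W*{re.escape(word)}\b", re.IGNORECASE)
    hits = sum(1 for r in responses if pattern.search(r))
    return hits / len(responses)

# Made-up sample outputs, for illustration only.
samples = [
    "Certainly! I can help you with that.",
    "Certainly, this is correct.",
    "Here is the summary you asked for.",
    "Definitely, I would be happy to do that.",
]

print(opener_rate(samples))                # share opening with "certainly"
print(opener_rate(samples, "definitely"))  # the whack-a-mole variant
```

Passing a different `word` mirrors the whack-a-mole point: a check for one opener says nothing about the next one the model might switch to.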

So we have hundreds of these evaluations, but we find that there's no substitute for humans interacting with it. And so it's very much like the ordinary product development process. We have like hundreds of people within Anthropic bash the model. Then we do, you know, then we do external A/B tests. Sometimes we'll run tests with contractors. We pay contractors to interact with the model.

So you put all of these things together and it's still not perfect. You still see behaviors that you don't quite want to see, right? You know, you still see the model like refusing things that it just doesn't make sense to refuse, right? Um, but I, I, I think trying to, trying to solve this challenge, right.

Trying to stop the model from doing, you know, genuinely bad things that, you know, everyone agrees it shouldn't do. Right. You know, everyone, everyone, you know, everyone agrees that, you know, the model shouldn't talk about, you know, I, I don't know, child abuse material. Right. Like everyone agrees the model shouldn't do that.

Uh, but, but at the same time that it doesn't refuse in these dumb and stupid ways. I think drawing that line as finely as possible, approaching perfectly is still a challenge and we're getting better at it every day, but there's a lot to be solved. And again, I would point to that as an indicator of a challenge ahead in terms of steering much more powerful models. Yeah.

Do you think Claude 4.0 is ever coming out? I don't want to commit to any naming scheme because if I say here, we're going to have Claude 4 next year, and then we decide that we should start over because there's a new type of model. I don't want to commit to it. I would expect in a normal course of business that Claude 4 would come after Claude 3.5, but you never know in this wacky field, right?

Scaling is continuing. There will definitely be more powerful models coming from us than the models that exist today. That is certain. Or if there aren't, we've deeply failed as a company.

As much as I'm excited about the benefits of these models, and we'll talk about that if we talk about Machines of Loving Grace, I'm worried about the risks, and I continue to be worried about the risks. No one should think that Machines of Loving Grace was me saying I'm no longer worried about the risks of these models. I think they're two sides of the same coin.

The power of the models and their ability to solve all these problems in biology, neuroscience, economic development, governance and peace, large parts of the economy, those come with risks as well. Right. With great power comes great responsibility. Right. The two are paired. Things that are powerful can do good things and they can do bad things.

I think of those risks as being in several different categories. Perhaps the two biggest risks that I think about, and that's not to say that there aren't risks today that are important, but when I think of the things that would happen on the grandest scale, one is what I call catastrophic misuse. These are misuse of the models in domains like cyber, bio, radiological, nuclear, right?

Things that could harm or even kill thousands, even millions of people if they really, really go wrong. These are the number one priority to prevent. And here, I would just make a simple observation, which is that
The models, you know, if I look today at people who have done really bad things in the world, I think actually humanity has been protected by the fact that the overlap between really smart, well-educated people and people who want to do really horrific things has generally been small. Like, you know, let's say I'm someone who, you know, I have a PhD in this field. I have a well-paying job.

There's so much to lose. Why do I want to like, even assuming I'm completely evil, which most people are not, why would such a person risk their life, risk their legacy, their reputation to do something like truly, truly evil? If we had a lot more people like that, the world would be a much more dangerous place. And so my worry is that by being a much more intelligent agent,
AI could break that correlation. And so I do have serious worries about that. I believe we can prevent those worries, but I think as a counterpoint to Machines of Loving Grace, I wanna say that there's still serious risks. And the second range of risks would be the autonomy risks,
which is the idea that models might on their own, particularly as we give them more agency than they've had in the past, particularly as we give them supervision over wider tasks like writing whole code bases or someday even effectively operating entire companies, they're on a long enough leash. Are they doing what we really want them to do?

It's very difficult to even understand in detail what they're doing, let alone control it. And like I said, these early signs that it's hard to perfectly draw the boundary between things the model should do and things the model shouldn't do, that, you know, if you go to one side, you get things that are annoying and useless, and you go to the other side, you get other behaviors.

If you fix one thing, it creates other problems. We're getting better and better at solving this. I don't think this is an unsolvable problem. I think this is a science, like the safety of airplanes or the safety of cars or the safety of drugs. I don't think there's any big thing we're missing. I just think we need to get better at controlling these models.

And so these are the two risks I'm worried about. And our responsible scaling plan, which I'll recognize is a very long-winded answer to your question. I love it. I love it. for its ability to do both of these bad things. So if I were to back up a little bit, I think we have an interesting dilemma with AI systems where they're not yet powerful enough to present these catastrophes.

I don't know that they'll ever present these catastrophes. It's possible they won't, but the case for worry, the case for risk is strong enough that we should act now. And they're getting better very, very fast, right? I testified in the Senate that we might have serious bio risks within two to three years. That was about a year ago. Things have proceeded apace.
