Dario Amodei

#452 – Dario Amodei: Anthropic CEO on Claude, AGI & the Future of AI & Humanity

I mean, it depends what you mean by special and special in general, but I generally think the same kinds of techniques that we've been using to train the current model will apply. I expect that doubling down on those techniques, in the same way that we have for code, for models in general, for image input, for voice, will scale here as it has everywhere else.

4915.801

Lex Fridman Podcast

Yeah, yeah, no. And we've been very aware of that. Look, my view actually is computer use isn't a fundamentally new capability like the CBRN or autonomy capabilities are. It's more like it kind of opens the aperture for the model to use and apply its existing abilities.

4952.406

And so the way we think about it, going back to our RSP, is nothing that this model is doing inherently increases the risk from an RSP perspective. But as the models get more powerful, having this capability may make it scarier once it has the cognitive capability to...

4971.002

you know, to do something at the ASL-3 and ASL-4 level, this may be the thing that kind of unbounds it from doing so. So going forward, certainly this modality of interaction is something we have tested for and that we will continue to test for in the RSP.

4995.322

I think it's probably better to learn and explore this capability before the model is, you know, super capable.

5012.255

Yeah, I mean, we've thought a lot about things like spam, CAPTCHA, you know, mass campaigns. One secret I'll tell you: if you've invented a new technology, not necessarily the biggest misuse, but the first misuse you'll see is scams, just petty scams. Yeah. It's just people scamming each other.

5039.335

It's this thing as old as time. And every time, you've got to deal with it. It's almost silly to say, but it's true.

5061.584

Yeah.

5075.192

Like I said, there are a lot of petty criminals in the world, and, you know, every new technology is like a new way for petty criminals to do something stupid and malicious.

5076.653

Yeah, we sandbox during training. So for example, during training, we didn't expose the model to the internet. I think that's probably a bad idea during training because the model can be changing its policy. It can be changing what it's doing and it's having an effect in the real world. You know, in terms of actually deploying the model, right, it kind of depends on the application.

5095.702

Like, you know, sometimes you want the model to do something in the real world. But of course, you can always put guardrails on the outside, right? You can say, okay, well, you know, this model is not going to move data from my computer, the model's not going to move any files from my computer or my web server to anywhere else.

5116.164

Now, when you talk about sandboxing, again, when we get to ASL-4, none of these precautions are going to make sense there, right? When you talk about ASL-4, there's a theoretical worry the model could be smart enough to kind of break out of any box.

5133.158

And so there we need to think about mechanistic interpretability, about, you know, if we're going to have a sandbox, it would need to be a mathematically provable sandbox. You know, that's a whole different world than what we're dealing with with the models today.

5152.566

I think it's probably not the right approach. I think the right approach, instead of having something, you know, unaligned that you're trying to prevent from escaping, is to just design the model the right way, or have a loop where you look inside the model and you're able to verify properties.

5174.04

And that gives you an opportunity to iterate and actually get it right. I think containing bad models is a much worse solution than having good models. Yeah.

5190.744

Yes. We ended up making some suggestions to the bill, and then some of those were adopted. And, you know, we felt, I think, quite positively about the bill by the end of that. It did still have some downsides. And, you know, of course, it got vetoed. At a high level, I think some of the key ideas behind the bill are, I would say, similar to ideas behind our RSPs.

5215.917

And I think it's very important that some jurisdiction, whether it's California or the federal government and/or other countries and other states, passes some regulation like this. And I can talk through why I think that's so important. So I feel good about our RSP. It's not perfect.

5243.79

It needs to be iterated on a lot, but it's been a good forcing function for getting the company to take these risks seriously, to put them into product planning, to really make them a central part of work at Anthropic and to make sure that all of a thousand people, and it's almost a thousand people now at Anthropic, understand that this is one of the highest priorities of the company, if not the highest priority.

5260.475

But one, there are still some companies that don't have RSP-like mechanisms. Like, OpenAI and Google did adopt these mechanisms a couple months after Anthropic did. But there are other companies out there that don't have these mechanisms at all. And so if some companies adopt these mechanisms and others don't,

5284.395

it's really going to create a situation where some of these dangers have the property that it doesn't matter if three out of five of the companies are being safe. If the other two are being unsafe, it creates this negative externality. And I think the lack of uniformity is not fair to those of us who have put a lot of effort into being very thoughtful about these procedures.

5308.257