Dario Amodei

Speaker
1816 total appearances

Podcast Appearances

LessWrong (Curated & Popular)
"Dario Amodei – The Adolescence of Technology" by habryka

I do not think biological attacks will necessarily be carried out the instant it becomes widely possible to do so.

In fact, I would bet against that.

But added up across millions of people and a few years of time, I think there is a serious risk of a major attack, and the consequences would be so severe, with casualties potentially in the millions or more, that I believe we have no choice but to take serious measures to prevent it.

Defenses

That brings us to how to defend against these risks.

Here I see three things we can do.

First, AI companies can put guardrails on their models to prevent them from helping to produce bioweapons.

Anthropic is very actively doing this.

Claude's constitution, which mostly focuses on high-level principles and values, has a small number of specific hardline prohibitions, and one of them relates to helping with the production of biological, chemical, nuclear, or radiological weapons.

But all models can be jailbroken, and so as a second line of defense, we've implemented (since mid-2025, when our tests showed our models were starting to get close to the threshold where they might begin to pose a risk) a classifier that specifically detects and blocks bioweapon-related outputs.

We regularly upgrade and improve these classifiers, and have generally found them highly robust even against sophisticated adversarial attacks.

These classifiers increase the costs to serve our models measurably.

In some models, they are close to 5% of total inference costs and thus cut into our margins, but we feel that using them is the right thing to do.

To their credit, some other AI companies have implemented classifiers as well.

But not every company has, and there is also nothing requiring companies to keep their classifiers.

I am concerned that over time there may be a prisoner's dilemma where companies can defect and lower their costs by removing classifiers.

This is once again a classic negative externalities problem that can't be solved by the voluntary actions of Anthropic or any other single company alone.

Voluntary industry standards may help, as may third-party evaluations and verification of the type done by AI security institutes and third-party evaluators.

But ultimately defense may require government action, which is the second thing we can do.