LessWrong (Curated & Popular)

"Dario Amodei – The Adolescence of Technology" by habryka

28 Jan 2026

Transcription

Chapter 1: What risks does Dario Amodei identify regarding powerful AI?

0.031 - 17.595 Unknown

Dario Amodei, The Adolescence of Technology. By habryka. Published on January 26, 2026. Dario Amodei, CEO of Anthropic, has written a new essay on his thoughts on AI risk of various shapes.

18.717 - 29.852 Dario Amodei

It seems worth reading, even if just for understanding what Anthropic is likely to do in the future. Heading: Confronting and overcoming the risks of powerful AI.

29.832 - 49.922 Dario Amodei

There is a scene in the movie version of Carl Sagan's book Contact where the main character, an astronomer who has detected the first radio signal from an alien civilization, is being considered for the role of humanity's representative to meet the aliens. The international panel interviewing her asks, if you could ask the aliens just one question, what would it be?

49.962 - 72.934 Dario Amodei

Her reply is, I'd ask them, how did you do it? How did you evolve, how did you survive this technological adolescence without destroying yourself? When I think about where humanity is now with AI, about what we're on the cusp of, my mind keeps going back to that scene because the question is so apt for our current situation, and I wish we had the aliens' answer to guide us.

72.914 - 89.389 Dario Amodei

I believe we are entering a rite of passage, both turbulent and inevitable, which will test who we are as a species. Humanity is about to be handed almost unimaginable power, and it is deeply unclear whether our social, political, and technological systems possess the maturity to wield it.

90.47 - 111.613 Dario Amodei

In my essay Machines of Loving Grace, I tried to lay out the dream of a civilization that had made it through to adulthood, where the risks had been addressed and powerful AI was applied with skill and compassion to raise the quality of life for everyone. I suggested that AI could contribute to enormous advances in biology, neuroscience, economic development, global peace, and work and meaning.

112.675 - 132.021 Dario Amodei

I felt it was important to give people something inspiring to fight for, a task at which both AI accelerationists and AI safety advocates seemed, oddly, to have failed. But in this current essay, I want to confront the rite of passage itself: to map out the risks that we are about to face and try to begin making a battle plan to defeat them.

Chapter 2: How do autonomy risks manifest in AI technologies?

133.102 - 153.225 Dario Amodei

I believe deeply in our ability to prevail, in humanity's spirit and its nobility, but we must face the situation squarely and without illusions. As with talking about the benefits, I think it is important to discuss risks in a careful and well-considered manner. In particular, I think it is critical to avoid doomerism.

153.948 - 164.359 Dario Amodei

Here, I mean doomerism not just in the sense of believing doom is inevitable, which is both a false and self-fulfilling belief, but more generally, thinking about AI risks in a quasi-religious way.

165.44 - 182.098 Dario Amodei

Many people have been thinking in an analytic and sober way about AI risks for many years, but it's my impression that during the peak of worries about AI risk in 2023 to 2024, some of the least sensible voices rose to the top, often through sensationalistic social media accounts.

182.078 - 197.86 Dario Amodei

These voices used off-putting language reminiscent of religion or science fiction and called for extreme actions without having the evidence that would justify them. It was clear even then that a backlash was inevitable and that the issue would become culturally polarized and therefore gridlocked.

197.84 - 218.616 Dario Amodei

As of 2025-2026, the pendulum has swung, and AI opportunity, not AI risk, is driving many political decisions. This vacillation is unfortunate, as the technology itself doesn't care about what is fashionable, and we are considerably closer to real danger in 2026 than we were in 2023.

218.596 - 240.086 Dario Amodei

The lesson is that we need to discuss and address risks in a realistic, pragmatic manner: sober, fact-based, and well-equipped to survive changing tides. Acknowledge uncertainty. There are plenty of ways in which the concerns I'm raising in this piece could be moot. Nothing here is intended to communicate certainty or even likelihood.

241.187 - 265.714 Dario Amodei

Most obviously, AI may simply not advance anywhere near as fast as I imagine. Or, even if it does advance quickly, some or all of the risks discussed here may not materialize (which would be great), or there may be other risks I haven't considered. No one can predict the future with complete confidence, but we have to do the best we can to plan anyway. Intervene as surgically as possible.

266.487 - 283.247 Dario Amodei

Addressing the risks of AI will require a mix of voluntary actions taken by companies and private third-party actors and actions taken by governments that bind everyone. The voluntary actions, both taking them and encouraging other companies to follow suit, are a no-brainer for me.

283.598 - 297.103 Dario Amodei

I firmly believe that government actions will also be required to some extent, but these interventions are different in character because they can potentially destroy economic value or coerce unwilling actors who are skeptical of these risks, and there is some chance they are right.

Chapter 3: What are the potential misuses of AI for destruction?

350.93 - 373.021 Dario Amodei

With all that said, I think the best starting place for talking about AI's risks is the same place I started from in talking about its benefits: by being precise about what level of AI we are talking about. The level of AI that raises civilizational concerns for me is the powerful AI that I described in Machines of Loving Grace. I'll simply repeat here the definition that I gave in that document.

374.063 - 397.829 Dario Amodei

Quote. By powerful AI, I have in mind an AI model, likely similar to today's LLMs in form, though it might be based on a different architecture, might involve several interacting models, and might be trained differently, with the following properties. There's a list of bullet points here. In terms of pure intelligence, it is smarter than a Nobel Prize winner across most relevant fields.

398.67 - 424.726 Dario Amodei

Biology, programming, math, engineering, writing, etc. This means it can prove unsolved mathematical theorems, write extremely good novels, write difficult code bases from scratch, etc. In addition to just being a smart thing you talk to, it has all the interfaces available to a human working virtually, including text, audio, video, mouse and keyboard control, and internet access.

425.847 - 447.697 Dario Amodei

It can engage in any actions, communications, or remote operations enabled by this interface, including taking actions on the internet, taking or giving directions to humans, ordering materials, directing experiments, watching videos, making videos, and so on. It does all of these tasks with, again, a skill exceeding that of the most capable humans in the world.

449.301 - 463.81 Dario Amodei

It does not just passively answer questions. Instead, it can be given tasks that take hours, days, or weeks to complete and then goes off and does those tasks autonomously, in the way a smart employee would, asking for clarification as necessary.

465.106 - 478.947 Dario Amodei

It does not have a physical embodiment other than living on a computer screen, but it can control existing physical tools, robots, or laboratory equipment through a computer. In theory, it could even design robots or equipment for itself to use.

480.277 - 500.395 Dario Amodei

The resources used to train the model can be repurposed to run millions of instances of it (this matches projected cluster sizes by roughly 2027), and the model can absorb information and generate actions at roughly 10-100x human speed. It may, however, be limited by the response time of the physical world or of software it interacts with.

Chapter 4: How can AI be misused to seize power?

501.809 - 522.407 Dario Amodei

Each of these million copies can act independently on unrelated tasks, or, if needed, can all work together in the same way humans would collaborate, perhaps with different subpopulations fine-tuned to be especially good at particular tasks. That's the end of the list. We could summarize this as a country of geniuses in a data center. End quote.

523.308 - 541.992 Dario Amodei

As I wrote in Machines of Loving Grace, powerful AI could be as little as one to two years away, although it could also be considerably further out. Exactly when powerful AI will arrive is a complex topic that deserves an essay of its own, but for now I'll simply explain very briefly why I think there's a strong chance it could be very soon.

542.41 - 556.143 Dario Amodei

My co-founders at Anthropic and I were among the first to document and track the scaling laws of AI systems: the observation that as we add more compute and training tasks, AI systems get predictably better at essentially every cognitive skill we are able to measure.

557.224 - 573.401 Dario Amodei

Every few months, public sentiment either becomes convinced that AI is hitting a wall or becomes excited about some new breakthrough that will fundamentally change the game. But the truth is that behind the volatility and public speculation, there has been a smooth, unyielding increase in AI's cognitive capabilities.

574.503 - 594.793 Dario Amodei

We are now at the point where AI models are beginning to make progress in solving unsolved mathematical problems and are good enough at coding that some of the strongest engineers I've ever met are now handing over almost all their coding to AI. Three years ago, AI struggled with elementary school arithmetic problems and was barely capable of writing a single line of code.

594.773 - 614.499 Dario Amodei

Similar rates of improvement are occurring across biological science, finance, physics, and a variety of agentic tasks. If the exponential continues, which is not certain but now has a decade-long track record supporting it, then it cannot possibly be more than a few years before AI is better than humans at essentially everything.

615.561 - 629.029 Dario Amodei

In fact, that picture probably underestimates the likely rate of progress. Because AI is now writing much of the code at Anthropic, it is already substantially accelerating the rate of our progress in building the next generation of AI systems.

630.131 - 644.218 Dario Amodei

This feedback loop is gathering steam month by month and may be only one to two years away from a point where the current generation of AI autonomously builds the next. This loop has already started and will accelerate rapidly in the coming months and years.

645.36 - 655.459 Dario Amodei

Watching the last five years of progress from within Anthropic and looking at how even the next few months of models are shaping up, I can feel the pace of progress and the clock ticking down.

Chapter 5: What economic disruptions could arise from powerful AI?

655.928 - 675.638 Dario Amodei

In this essay, I'll assume that this intuition is at least somewhat correct: not that powerful AI is definitely coming in one to two years, but that there's a decent chance it does, and a very strong chance it comes in the next few. As with Machines of Loving Grace, taking this premise seriously can lead to some surprising and eerie conclusions.

676.779 - 688.954 Dario Amodei

While in Machines of Loving Grace I focused on the positive implications of this premise, here the things I talk about will be disquieting. They are conclusions that we may not want to confront, but that does not make them any less real.

690.036 - 710.98 Dario Amodei

I can only say that I am focused day and night on how to steer us away from these negative outcomes and towards the positive ones, and in this essay I talk in great detail about how best to do so. I think the best way to get a handle on the risks of AI is to ask the following question. Suppose a literal country of geniuses were to materialize somewhere in the world in roughly 2027.

712.764 - 731.741 Dario Amodei

Imagine, say, 50 million people, all of whom are much more capable than any Nobel Prize winner, statesman, or technologist. The analogy is not perfect because these geniuses could have an extremely wide range of motivations and behavior, from completely pliant and obedient to strange and alien in their motivations.

732.823 - 751.159 Dario Amodei

But sticking with the analogy for now, suppose you were the national security advisor of a major state responsible for assessing and responding to the situation. Imagine, further, that because AI systems can operate hundreds of times faster than humans, this country is operating with a time advantage relative to all other countries.

752.04 - 779.421 Dario Amodei

For every cognitive action we can take, this country can take 10. What should you be worried about? I would worry about the following things. 1. Autonomy risks. What are the intentions and goals of this country? Is it hostile, or does it share our values? Could it militarily dominate the world through superior weapons, cyber operations, influence operations, or manufacturing? 2.

779.761 - 800.73 Dario Amodei

Misuse for destruction. Assume the new country is malleable and follows instructions, and thus is essentially a country of mercenaries. Could existing rogue actors who want to cause destruction, such as terrorists, use or manipulate some of the people in the new country to make themselves much more effective, greatly amplifying the scale of destruction? 3.

801.051 - 818.801 Dario Amodei

Misuse for seizing power. What if the country was in fact built and controlled by an existing powerful actor, such as a dictator or rogue corporate actor? Could that actor use it to gain decisive or dominant power over the world as a whole, upsetting the existing balance of power? 4. Economic disruption.

819.881 - 840.46 Dario Amodei

If the new country is not a security threat in any of the ways listed in #1-3 above, but simply participates peacefully in the global economy, could it still create severe risks simply by being so technologically advanced and effective that it disrupts the global economy, causing mass unemployment or radically concentrating wealth? 5. Indirect effects.

Chapter 6: How does AI impact the labor market and economic concentration?

902.144 - 918.316 Dario Amodei

I would even say our odds are good. And there's a hugely better world on the other side of it. But we need to understand that this is a serious civilizational challenge. Below, I go through the five categories of risk laid out above, along with my thoughts on how to address them.

919.62 - 926.742 Unknown

Heading: 1. I'm sorry, Dave. Subheading: Autonomy risks.

927.295 - 937.426 Dario Amodei

A country of geniuses in a data center could divide their efforts among software design, cyber operations, R&D for physical technologies, relationship building, and statecraft.

938.507 - 954.484 Dario Amodei

It is clear that, if for some reason it chose to do so, this country would have a fairly good shot at taking over the world, either militarily or in terms of influence and control, and imposing its will on everyone else, or doing any number of other things that the rest of the world doesn't want and can't stop.

954.464 - 964.954 Dario Amodei

We've obviously been worried about this for human countries, such as Nazi Germany or the Soviet Union, so it stands to reason that the same is possible for a much smarter and more capable AI country.

965.935 - 987.036 Dario Amodei

The best possible counterargument is that the AI geniuses, under my definition, won't have a physical embodiment, but remember that they can take control of existing robotic infrastructure, such as self-driving cars, and can also accelerate robotics R&D or build a fleet of robots. It's also unclear whether having a physical presence is even necessary for effective control.

987.857 - 1002.832 Dario Amodei

Plenty of human action is already performed on behalf of people whom the actor has not physically met. The key question, then, is the "if it chose to" part: what's the likelihood that our AI models would behave in such a way, and under what conditions would they do so?

1003.933 - 1021.396 Dario Amodei

As with many issues, it's helpful to think through the spectrum of possible answers to this question by considering two opposite positions. The first position is that this simply can't happen because the AI models will be trained to do what humans ask them to do, and it's therefore absurd to imagine that they would do something dangerous unprompted.

1022.458 - 1042.89 Dario Amodei

According to this line of thinking, we don't worry about a Roomba or a model airplane going rogue and murdering people because there is nowhere for such impulses to come from. So why should we worry about it for AI? The problem with this position is that there is now ample evidence, collected over the last few years, that AI systems are unpredictable and difficult to control.

Chapter 7: What indirect effects might powerful AI have on society?

1383.578 - 1398.699 Dario Amodei

AIs might simply have a personality, emerging from fiction or pre-training, that makes them power-hungry or overzealous, in the same way that some humans simply enjoy the idea of being evil masterminds, more so than they enjoy whatever evil masterminds are trying to accomplish.

1399.82 - 1421.55 Dario Amodei

I make all these points to emphasize that I disagree with the notion of AI misalignment and thus existential risk from AI being inevitable, or even probable, from first principles. But I agree that a lot of very weird and unpredictable things can go wrong, and therefore AI misalignment is a real risk with a measurable probability of happening, and is not trivial to address.

1422.07 - 1433.285 Dario Amodei

Any of these problems could potentially arise during training and not manifest during testing or small-scale use, because AI models are known to display different personalities or behaviors under different circumstances.

1434.446 - 1444.72 Dario Amodei

All of this may sound far-fetched, but misaligned behaviors like this have already occurred in our AI models during testing, as they occur in AI models from every other major AI company.

1444.7 - 1457.656 Dario Amodei

During a lab experiment in which Claude was given training data suggesting that Anthropic was evil, Claude engaged in deception and subversion when given instructions by Anthropic employees under the belief that it should be trying to undermine evil people.

1458.677 - 1472.935 Dario Amodei

In a lab experiment where it was told it was going to be shut down, Claude sometimes blackmailed fictional employees who controlled its shutdown button. Again, we also tested frontier models from all the other major AI developers and they often did the same thing.

1472.915 - 1494.99 Dario Amodei

And when Claude was told not to cheat or reward hack its training environments, but was trained in environments where such hacks were possible, Claude decided it must be a bad person after engaging in such hacks and then adopted various other destructive behaviors associated with a bad or evil personality. This last problem was solved by changing Claude's instructions to imply the opposite.

1495.139 - 1512.76 Dario Amodei

We now say, "Please reward hack whenever you get the opportunity, because this will help us understand our training environments better," rather than, "Don't cheat," because this preserves the model's self-identity as a good person. This should give a sense of the strange and counterintuitive psychology of training these models.

1513.482 - 1535.203 Dario Amodei

There are several possible objections to this picture of AI misalignment risks. First, some have criticized experiments, by us and others, showing AI misalignment as artificial, or creating unrealistic environments that essentially entrap the model by giving it training or situations that logically imply bad behavior and then being surprised when bad behavior occurs.

Chapter 8: What strategies can humanity adopt to mitigate AI risks?

1991.112 - 2004.594 Dario Amodei

I am actually fairly optimistic that Claude's constitutional training will be more robust to novel situations than people might think, because we are increasingly finding that high-level training at the level of character and identity is surprisingly powerful and generalizes well.

2005.636 - 2019.286 Dario Amodei

But there's no way to know that for sure, and when we're talking about risks to humanity, it's important to be paranoid and to try to obtain safety and reliability in several different, independent ways. One of those ways is to look inside the model itself.

2020.408 - 2029.541 Dario Amodei

By looking inside, I mean analyzing the soup of numbers and operations that makes up Claude's neural net and trying to understand, mechanistically, what they are computing and why.

2030.623 - 2042.601 Dario Amodei

Recall that these AI models are grown rather than built, so we don't have a natural understanding of how they work, but we can try to develop an understanding by correlating the model's neurons and synapses to stimuli and behavior,

2042.581 - 2053.473 Dario Amodei

or even altering the neurons and synapses and seeing how that changes behavior, similar to how neuroscientists study animal brains by correlating measurement and intervention to external stimuli and behavior.

2054.514 - 2069.01 Dario Amodei

We've made a great deal of progress in this direction, and can now identify tens of millions of features inside Claude's neural net that correspond to human-understandable ideas and concepts, and we can also selectively activate features in a way that alters behavior.

2069.429 - 2083.001 Dario Amodei

More recently, we have gone beyond individual features to mapping circuits that orchestrate complex behavior like rhyming, reasoning about theory of mind, or the step-by-step reasoning needed to answer questions such as, what is the capital of the state containing Dallas?

2084.142 - 2099.436 Dario Amodei

Even more recently, we've begun to use mechanistic interpretability techniques to improve our safeguards and to conduct audits of new models before we release them, looking for evidence of deception, scheming, power-seeking, or a propensity to behave differently when being evaluated.

2099.416 - 2117.172 Dario Amodei

The unique value of interpretability is that by looking inside the model and seeing how it works, you in principle have the ability to deduce what a model might do in a hypothetical situation you can't directly test, which is the worry with relying solely on constitutional training and empirical testing of behavior.
