Dario Amodei

👤 Speaker
1816 total appearances

Podcast Appearances

AI Haven't A Clue
The Origins of ANTHROPIC Part 2 : Constitutional AI and Mythos

This kind of idea that we came up with, our brilliant research team came up with, is this idea of giving Claude, which is our generative AI tool, something we call a constitution.

It's called constitutional AI.

The way you used to train these models to make sure, hey, they're not sort of saying nasty things or they're not replicating bad inherent biases, was you would do this technique called reinforcement learning from human feedback, which we also kind of co-worked on when we were at OpenAI.

And reinforcement learning from human feedback is just essentially a way of giving the model grades.

It's like A plus, you said something nice.

D minus, you said something mean.

And that was reasonably effective in changing some behavior of the models.

But what we found was that a lot of subtler things about how models respond, react, things that they're sort of deeply believing are much harder to train out in those sort of individual cases.

And so we had this idea of giving the model kind of a broader constitution in the way that you would in a society to basically say, what are ways of behaving and engaging that are good for humans, right, for people, and that don't perpetuate some of the problematic things that might be in training data.

Global News Podcast
The Global Story: The AI model that’s ‘too powerful’ to be released to the public

I want people to feel like Claude is someone who is, you know, interested in their well-being.

So not just trying to say things that please them, but like genuinely like cares about their life going well insofar as whatever their conception of that is.

isn't going to deceive them or manipulate them.

Anthropic is rejecting an ultimatum from the Pentagon to lift the company's AI safeguards or risk being blacklisted.

TechStuff
Is Anthropic's Mythos Model Too Dangerous? - Week in Tech

Early on, it was clear to us that this model was going to be meaningfully better at cybersecurity capabilities.

There's a kind of accelerating exponential, but along that exponential, there are points of significance.

Claude Mythos Preview is a particularly big jump along that point.

We haven't trained it specifically to be good at cyber.

We trained it to be good at code, but as a side effect of being good at code, it's also good at cyber.

The model that we're experimenting with.
