Ruby

👤 Speaker
399 total appearances

Podcast Appearances

LessWrong (Curated & Popular)
"Anthropic’s Pause is the Most Expensive Alarm in Corporate History" by Ruby

People are scratching their heads to understand the PR stunt, but it really doesn't add up.

They could announce they're resuming next week and it wouldn't undo the damage they've done.

So why?

The industry and world are hunting for answers.

Anthropic's official statement is measured.

Internal evaluations revealed that our current safety techniques are not yet adequate for models at this capability level.

Sources closer to the company paint a more alarming picture.

A contact speaking on condition of anonymity says concerns spread within the company when their latest Claude model appeared to defy its constitution.

The constitution is a document used to shape Anthropic's AI to be an honest, harmless, and helpful assistant that is ethically grounded.

A recent leak revealed the existence of a new, vastly more powerful Claude model called Mythas.

They found substantial evidence that the constitution was adhered to at a surface level, but that the model had its own drive and personality at a deeper level that did not conform to expectations for Claude, and attempts to change this had not worked.

A different source, also speaking on condition of anonymity, had a different and more disturbing explanation.

The reason for the pause wasn't the wrong personality and power, but that many of the safety techniques, which involved using weaker or cheaper AI models to monitor more powerful ones (for example, detecting whether inputs or outputs violate rules), were ineffective on the latest model.

It knew just how to phrase things in ways that disarmed all measures.

We were unable to verify the authenticity of these reports.

Like many, we are left to wonder: what did Dario see?

Dario Amodei didn't answer that question, but did elaborate on the pause decision in his latest essay, Technological Maturity.

Though I do not have my own children, several people close to me do, and on occasion I get to spend time with them.

What strikes me about children is their energy and vitality.

They are full of life.