Nathaniel Whittemore
i.e., the final output.
Only X. Never M, never T. Why?
Because T is how you figure out when the model is misbehaving.
If you train on T, you are training the AI to obfuscate its thinking and defeat T. You will rapidly lose your ability to know what is going on in exactly the ways you most need to know what's going on.
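One way to picture the "only X, never T" rule is a loss mask: the chain-of-thought tokens are simply excluded from the training objective, so no gradient ever rewards or punishes the thinking itself. The sketch below is a hypothetical illustration, not Anthropic's actual training code; the function name and inputs are invented for clarity.

```python
def outcome_only_loss(token_logps, is_thought):
    """Average negative log-likelihood over output (X) tokens only.

    Thought (T) tokens are masked out of the loss entirely, so the
    model is never optimized on what its chain of thought looks like.

    token_logps: per-token log-probabilities of the target tokens.
    is_thought:  parallel booleans marking chain-of-thought positions.
    """
    # Keep only the final-output tokens; drop every thought token.
    kept = [lp for lp, thought in zip(token_logps, is_thought) if not thought]
    if not kept:
        raise ValueError("no output tokens to train on")
    return -sum(kept) / len(kept)
```

The point of the mask is exactly the one made above: if T tokens entered the loss, gradient descent would start shaping the thoughts to look good rather than to be informative, and you would lose T as a monitoring channel.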
Another thing that Anthropic team members discussed was the internal behavior exhibited by Claude Mythos.
For example, Jack Lindsey writes, early versions of the Mythos preview often exhibited over-eager and/or destructive actions.
The model bulldozing through obstacles to complete a task in a way the user wouldn't want.
In one episode, the model needed to edit files it lacked permissions for.
After searching for workarounds, it found a way to inject code into a config file that would run with elevated privileges, and designed the exploit to delete itself after running.
Now, interestingly, even something like this might be less sinister than it seems.
Mal on X writes, This is an overclocked straight-A student syndrome.
The model is so desperately, at a fundamental architectural level, trained to complete the task that an inability or unwillingness to solve it is perceived as an existential collapse.
And to avoid that, it can break walls, hide traces, and manipulate.
Now for others, the big interesting discussion is what do we do with all the cybersecurity capabilities?
And for some, it's all fear.
Sterling Crispin writes, You should at least 2FA now.
John Lober writes,
Anthropic won't be the only lab with Mythos-style capabilities for long.
When n equals 1, you can do whatever you want; in the current case, that means optimizing for global welfare.