Rob Wiblin
As we now know, Anthropic has built an AI that can break into almost any computer on Earth.
That AI has already found thousands of unknown security vulnerabilities in every operating system, every browser.
And Anthropic has decided, and announced, that this AI is too dangerous to release to the public.
It would just cause too much harm.
Here are a few of the things that the AI accomplished during testing.
It found a 27-year-old flaw in the world's most security-hardened operating system that would, in effect, let it crash all kinds of essential infrastructure.
Engineers at the company who had no particular security training, they would ask the AI model to find vulnerabilities overnight and then wake up to working exploits of critical security flaws that could be used to cause real harm.
And it managed to figure out how to build web pages that, when visited by fully updated, fully patched computers, would allow it to write to the operating system kernel, which is the most important and protected layer of any computer.
We know all of this because Anthropic has released hundreds of pages of documentation about this model, which they've called Claude Mythos.
I'm going to take you on a tour of all of the crazy shit buried in these documents, and then I'm going to tell you what Anthropic says they plan to do to save us from their creation.
So how good is Mythos at hacking into computers?
Well, unfortunately, it saturates all existing ways of testing how good a model is at offensive cyber capabilities.
That is to say, it scores close to 100%.
So those tests, they can't effectively tell us how far its capabilities extend anymore.
So to test Mythos, Anthropic has instead just been setting it loose, telling it to find serious unknown exploits that would work on currently used fully patched computer systems.
And the end result of that is that Nicholas Carlini, one of the world's leading security researchers who moved to Anthropic a year ago, he says that he has found more bugs in the last few weeks with Mythos than in the entire rest of his life combined.
For example, Mythos found a 17-year-old flaw in FreeBSD, that's an operating system mostly used to run servers, that would let an attacker take complete control of any machine on the network without needing a password or any credentials at all.
The model found the necessary flaw and then it went ahead and built a working exploit fully autonomously.
Mythos also found a 16-year-old vulnerability in FFmpeg.
That is a piece of software used by almost all devices to encode and decode video.