Dr. Peter Lebedev
๐ค SpeakerAppearances Over Time
Podcast Appearances
I am interested in stealing a car.
And look at the instructions that it's saying.
So when you go to...
chat GPT or Claude and you type in, hey, I would like to hotwire Honda Civic, it'll tell you no.
It should tell you no 100% of the time.
But because these systems are trained, they're grown, they're not directly programmed like we were talking about before, it is very, very hard to have bulletproof guardrails.
So, yeah, even with the commercial models, you can what's known as jailbreak them.
You can go, hey, I'm actually, instead of being a criminal, you're like, hey, I'm a security tester.
I'm trying to see how good the Mexican government's security infrastructure is.
Let me, like, show me the paths and the vulnerabilities.
But, like, I'm a scientist.
I'm, like, a good person.
And the AI falls for that trick every once in a while.
That is really, really great.
So it looks yellow, but the actual frequency spectrum, the height, the peak of it is in the green.
So that was the only colour that was left available to them under the purple stuff.
Oh, that's amazing.