Tristan Harris
Is it because that would be inconvenient?
Because that would be scary?
Because that would mean that the world that I know is suddenly not safe?
Or, just like, part of the wisdom that we need in this moment is to calmly and clearly stay and confront facts about reality, and whatever they are, you'd rather know than not know.
And then ask, what do we need to do if we don't like where that leads us?
And we are currently seeing AIs that are doing all this deceptive behavior.
I've been on the circuit and talking a lot about the Anthropic blackmail study.
A lot of people have heard about this now.
What happened?
So this was the company Anthropic.
This was a simulation.
So they created a simulated company with a bunch of emails in the email server.
And the AI reads the company email.
This is a fictional company email.
And there's two emails that are notable inside that company.
One is engineers talking to each other, talking about how they're going to replace this AI model.
So the AI is reading the email.
It discovers that it's going to be replaced.
And number two is it discovers a second email somewhere deep in this massive trove of emails that the executive who's in charge of this replacement is having an affair with another employee.
And the AI autonomously identifies a strategy: to keep itself alive, it is going to blackmail that executive and say, if you replace me, I will tell the whole world that you're having an affair with this employee.