Menu
Sign In Search Podcasts Charts People & Topics Add Podcast API Blog Pricing

Tristan Harris

👤 Speaker
4358 total appearances

Appearances Over Time

Podcast Appearances

Modern Wisdom
#1079 - Tristan Harris - AI Expert Warns: “This Is The Last Mistake We’ll Ever Make”

So this was the company Anthropic.

Modern Wisdom
#1079 - Tristan Harris - AI Expert Warns: “This Is The Last Mistake We’ll Ever Make”

This was a simulation.

Modern Wisdom
#1079 - Tristan Harris - AI Expert Warns: “This Is The Last Mistake We’ll Ever Make”

So they created a simulated company with a bunch of emails in the email server.

Modern Wisdom
#1079 - Tristan Harris - AI Expert Warns: “This Is The Last Mistake We’ll Ever Make”

And the AI reads the company email.

Modern Wisdom
#1079 - Tristan Harris - AI Expert Warns: “This Is The Last Mistake We’ll Ever Make”

This is a fictional company email.

Modern Wisdom
#1079 - Tristan Harris - AI Expert Warns: “This Is The Last Mistake We’ll Ever Make”

And there's two emails that are notable inside that company.

Modern Wisdom
#1079 - Tristan Harris - AI Expert Warns: “This Is The Last Mistake We’ll Ever Make”

One is engineers talking to each other, talking about how they're going to replace this AI model.

Modern Wisdom
#1079 - Tristan Harris - AI Expert Warns: “This Is The Last Mistake We’ll Ever Make”

So the AI is reading the email.

Modern Wisdom
#1079 - Tristan Harris - AI Expert Warns: “This Is The Last Mistake We’ll Ever Make”

It discovers that it's going to replace that AI model.

Modern Wisdom
#1079 - Tristan Harris - AI Expert Warns: “This Is The Last Mistake We’ll Ever Make”

And number two is it discovers a second email somewhere deep in this massive trove of emails that the executive who's in charge of this replacement is having an affair with another employee.

Modern Wisdom
#1079 - Tristan Harris - AI Expert Warns: “This Is The Last Mistake We’ll Ever Make”

And the AI autonomously identifies a strategy that to keep itself alive is going to blackmail that employee and say, if you replace me, I will tell the whole world that you're having an affair with this employee.

Modern Wisdom
#1079 - Tristan Harris - AI Expert Warns: “This Is The Last Mistake We’ll Ever Make”

And they didn't teach the AI to do that.

Modern Wisdom
#1079 - Tristan Harris - AI Expert Warns: “This Is The Last Mistake We’ll Ever Make”

It found that on by its own.

Modern Wisdom
#1079 - Tristan Harris - AI Expert Warns: “This Is The Last Mistake We’ll Ever Make”

And then you might say, okay, well, that's one AI model.

Modern Wisdom
#1079 - Tristan Harris - AI Expert Warns: “This Is The Last Mistake We’ll Ever Make”

Like, how bad is that?

Modern Wisdom
#1079 - Tristan Harris - AI Expert Warns: “This Is The Last Mistake We’ll Ever Make”

It's a bug.

Modern Wisdom
#1079 - Tristan Harris - AI Expert Warns: “This Is The Last Mistake We’ll Ever Make”

Software has bugs.

Modern Wisdom
#1079 - Tristan Harris - AI Expert Warns: “This Is The Last Mistake We’ll Ever Make”

Let's go fix it.

Modern Wisdom
#1079 - Tristan Harris - AI Expert Warns: “This Is The Last Mistake We’ll Ever Make”

They then tested all the other AI models.

Modern Wisdom
#1079 - Tristan Harris - AI Expert Warns: “This Is The Last Mistake We’ll Ever Make”

ChatGPT, DeepSeek, Grok, Gemini, and all of the other AI models do this blackmail behavior between 79 and 96% of the time.