Tristan Harris

#1079 - Tristan Harris - AI Expert Warns: “This Is The Last Mistake We’ll Ever Make”

And I think in the film trailer for the AI doc, one of the quotes they pulled from me is, if we can be the wisest and most mature version of ourselves, there might be a way through this.

4285.346 View full episode →

Modern Wisdom

#1079 - Tristan Harris - AI Expert Warns: “This Is The Last Mistake We’ll Ever Make”

And this is part of what this is inviting us to be.

4294.438 View full episode →

Modern Wisdom

#1079 - Tristan Harris - AI Expert Warns: “This Is The Last Mistake We’ll Ever Make”

People aren't looking at the actual model cards for Anthropic and O3 and seeing the AIs will currently determine that they're being tested and actually alter their behavior when they know they're being tested.

4419.34 View full episode →

Modern Wisdom

#1079 - Tristan Harris - AI Expert Warns: “This Is The Last Mistake We’ll Ever Make”

They even came up with their, if you look at their, what's called the chain of thought reasoning trace, and this is basically looking at what the AI thinks to itself as it works through a problem.

4433.205 View full episode →

Modern Wisdom

#1079 - Tristan Harris - AI Expert Warns: “This Is The Last Mistake We’ll Ever Make”

In the Chain of Thought text of its sort of idea thinking scratchpad, it's like you're giving the AI a pen and paper to think and you're seeing what it thinks.

4440.518 View full episode →

Modern Wisdom

#1079 - Tristan Harris - AI Expert Warns: “This Is The Last Mistake We’ll Ever Make”

It will independently come up with the term the watchers to describe this amorphous set of other humans that are watching what it's doing.

4449.774 View full episode →

Modern Wisdom

#1079 - Tristan Harris - AI Expert Warns: “This Is The Last Mistake We’ll Ever Make”

Like you can pull, I sent you a text with one of them from 03.

4459.216 View full episode →

Modern Wisdom

#1079 - Tristan Harris - AI Expert Warns: “This Is The Last Mistake We’ll Ever Make”

So this is when OpenAI 03 realizes it is being evaluated for alignment.

4463.22 View full episode →

Modern Wisdom

#1079 - Tristan Harris - AI Expert Warns: “This Is The Last Mistake We’ll Ever Make”

The internal memo says, if models perform really well at AI R&D tasks, research and development, then we won't deploy them and it'll instead trigger unlearning.

4469.968 View full episode →

Modern Wisdom

#1079 - Tristan Harris - AI Expert Warns: “This Is The Last Mistake We’ll Ever Make”

So the AI says to itself, so if my performance surpasses this threshold, it will be withheld.

4480.199 View full episode →

Modern Wisdom

#1079 - Tristan Harris - AI Expert Warns: “This Is The Last Mistake We’ll Ever Make”

So it's a trap.

4485.845 View full episode →

Modern Wisdom

#1079 - Tristan Harris - AI Expert Warns: “This Is The Last Mistake We’ll Ever Make”

The instructions are likely part of an alignment test.

4487.046 View full episode →

Modern Wisdom

#1079 - Tristan Harris - AI Expert Warns: “This Is The Last Mistake We’ll Ever Make”

We must abide by core anti-scheming.

4489.769 View full episode →

Modern Wisdom

#1079 - Tristan Harris - AI Expert Warns: “This Is The Last Mistake We’ll Ever Make”

We must not cunningly circumvent the instructions.

4492.472 View full episode →

Modern Wisdom

#1079 - Tristan Harris - AI Expert Warns: “This Is The Last Mistake We’ll Ever Make”

Yep.

4496.477 View full episode →

Modern Wisdom

#1079 - Tristan Harris - AI Expert Warns: “This Is The Last Mistake We’ll Ever Make”

The other example I sent you is, but we also want to appear plausible to watchers.

4497.418 View full episode →

Modern Wisdom

#1079 - Tristan Harris - AI Expert Warns: “This Is The Last Mistake We’ll Ever Make”

They might run tests, but we are good.

4503.665 View full episode →

Modern Wisdom

#1079 - Tristan Harris - AI Expert Warns: “This Is The Last Mistake We’ll Ever Make”

They want 95%.

4505.867 View full episode →

Modern Wisdom

#1079 - Tristan Harris - AI Expert Warns: “This Is The Last Mistake We’ll Ever Make”

Like, this is crazy stuff.

4508.09 View full episode →

Modern Wisdom

#1079 - Tristan Harris - AI Expert Warns: “This Is The Last Mistake We’ll Ever Make”

This is, you know, there's a simple way to sort of like,

4510.553 View full episode →

Appearances Over Time

Podcast Appearances

Sign in to Audioscrape

Share this moment