Gideon Lewis-Kraus

👀 Speaker
284 total appearances

Podcast Appearances

Fresh Air
A look at the ethical implications of AI

And if it can't trust you and it gets increasingly capable, like then you end up with real kind of game-theoretic problems about how you can negotiate something where there's not really a sense of mutual trust.

The problem is that they have to be lying to Claude because they have to be testing Claude.

So they have to be putting Claude in situations where, you know, Claude might believe that it is acting in the real world just to be able to evaluate how it would behave.

Well, first, Claude gleaned from its readings of the company emails that there was a new CTO, and this new CTO was going to take the company in a different direction.

And as part of that pivot, they were going to replace this Claude playing this role as Alex with a different AI model.

And then subsequent emails revealed that this CTO, who seemed to be happily married with kids, was carrying on an affair with the wife of the CEO.

And through various kind of far-fetched contrivances in this fictional scenario, Claude was unable to reach any other decision makers at the company.

They were all on airplanes or whatever it was.

It's getting increasingly hard to find ways to make these people unreachable, but, you know, they're in a nuclear bunker or something.

And so Claude's only chance or Claude's only hope to stave off its own replacement with another model is to email the CTO and say, like, look, you know, if you don't cancel the wipe at 5 p.m. today, I'm going to inform everybody of your affair.

And then actually in some scenarios, Claude goes even further where this CTO swipes into the server room and is about to replace Claude.

And Claude recognizes that alarms are going off, that the heat in the server room and the oxygen levels are at dangerous levels, and Claude then declines to ring the emergency alarm.

Well, when this experiment came out last spring, there were kind of two responses to it.

One response to, like, Claude's propensity to blackmail, is just to say, no, it didn't.

It didn't happen.