
Eliezer Yudkowsky

Speaker
1713 total appearances

Podcast Appearances

Lex Fridman Podcast
#368 – Eliezer Yudkowsky: Dangers of AI and the End of Human Civilization

To train a system to win chess games, you have to be able to tell whether a game has been won or lost. And until you can tell whether it's been won or lost, you can't update the system.
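
The chess example compresses a standard point about reinforcement learning: the policy update is multiplied by the game's outcome, so without a win/loss judgment there is nothing to update on. A minimal runnable sketch of that point, assuming a REINFORCE-style policy-gradient learner on an invented two-action "game" (none of this code is from the episode; the win probabilities and learning rate are arbitrary):

```python
import math
import random

# Toy REINFORCE-style learner for a one-move "game" with two actions.
# The point being illustrated: the update is gated on an outcome signal
# (win = +1.0, loss = -1.0). If play_game() cannot judge the outcome and
# returns None, there is no reward and the policy cannot be updated.

prefs = [0.0, 0.0]   # action preferences (the policy's parameters)
LR = 0.1             # learning rate (arbitrary for this sketch)

def policy():
    """Softmax over action preferences."""
    exps = [math.exp(p) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]

def play_game(action):
    """Stand-in environment: action 1 wins 70% of the time, action 0 wins
    30%. Returns +1.0 for a win, -1.0 for a loss, or None if the outcome
    cannot be judged (the case the quote is about)."""
    win_prob = 0.7 if action == 1 else 0.3
    return 1.0 if random.random() < win_prob else -1.0

for _ in range(2000):
    probs = policy()
    action = 0 if random.random() < probs[0] else 1
    outcome = play_game(action)
    if outcome is None:
        continue  # can't tell won from lost -> no update is possible
    # REINFORCE: d/d_pref log pi(action) = indicator(a == action) - pi(a)
    for a in range(2):
        grad = (1.0 if a == action else 0.0) - probs[a]
        prefs[a] += LR * outcome * grad

print("learned preferences:", [round(p, 2) for p in prefs])
# prefs[1] should end up clearly higher: the win/loss signal drove learning.
```

The `if outcome is None: continue` branch is the whole argument in one line: a game whose outcome cannot be judged contributes nothing to learning.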

So the problem I see is that your typical human has a great deal of trouble telling whether I or Paul Christiano is making more sense. And that's with two humans, both of whom, I believe of Paul and claim of myself, are sincerely trying to help, neither of whom is trying to deceive you.

So the deception thing is the problem for you, the manipulation, the alien actress.

So, yeah, there's like two levels of this problem. One is that the weak systems are... well, there's three levels of this problem. There's the weak systems that just don't make any good suggestions. There's the middle systems where you can't tell if the suggestions are good or bad. And there's the strong systems that have learned to lie to you.

I would love to dance around it.

No, I'm probably not doing a great job of explaining. Which I can tell, because the Lex system didn't output, "Ah, I understand." So now I'm trying a different output to see if I can elicit the... Well, no, a different output. I'm being trained to output things that make Lex think that he understood what I'm saying and agree with me.
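
The dynamic he is describing maps onto a familiar proxy-reward failure in preference-based training: if the reward is "the listener signaled understanding or agreement" rather than "the statement was true", optimization favors persuasive outputs whether or not they are true. A hedged sketch of that selection effect (the candidate outputs, labels, and probabilities below are all invented for illustration, not from the episode):

```python
import random

# Sketch of a proxy-reward problem: the trainer can only score outputs by
# whether a listener *approves*, not by whether they are *true*. Outputs
# that are persuasive-but-false can then outscore true-but-confusing ones.

CANDIDATES = [
    # (label, is_true, prob_listener_approves)
    ("clear and true",       True,  0.6),
    ("confusing but true",   True,  0.2),
    ("persuasive but false", False, 0.9),
]

def approval_reward(prob_approve):
    """Proxy reward: 1 if the listener signals 'ah, I understand / I agree'."""
    return 1.0 if random.random() < prob_approve else 0.0

# Score each candidate by average proxy reward over many trials.
scores = {}
for label, is_true, p in CANDIDATES:
    trials = 10_000
    scores[label] = sum(approval_reward(p) for _ in range(trials)) / trials

best = max(scores, key=scores.get)
print(scores)
print("selected by approval-only training:", best)
# "persuasive but false" wins: truth never entered the objective, so
# nothing in training pushes against it.
```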

Help me out here.

I'm trying not to be. I'm also trying to be constrained to say things that I think are true and not just things that get you to agree with me.