Eliezer Yudkowsky

👤 Speaker
1713 total appearances

[Chart: Appearances Over Time]

Podcast Appearances

Lex Fridman Podcast
#368 – Eliezer Yudkowsky: Dangers of AI and the End of Human Civilization

Once you find those, you can tell that you found them.

You can verify that the discovery is real.

But it's a tiny, tiny bit of progress compared to how fast capabilities are going.

Because that is where you can tell that the answers are real.

And then outside of that, you have cases where it is hard for the funding agencies to tell who is talking nonsense and who is talking sense.

And so the entire field fails to thrive.

And if you give thumbs up to the AI whenever it can talk a human into agreeing with what it just said about alignment... I am not sure you are training it to output sense, because I have seen the nonsense that has gotten thumbs up over the years.
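
Read as a training setup, this is a reward-signal problem. Here is a minimal sketch of the worry in Python, with everything invented for illustration (the features, the rater model, the numbers): if thumbs-up labels come from a rater who is swayed by how persuasive an answer sounds, a reward model fit to those labels puts its weight on persuasiveness rather than correctness, so optimizing against it trains the system to talk the human into agreement.

```python
import math
import random

random.seed(0)

# Hypothetical rater, invented for illustration: each answer has two
# features, how persuasive it sounds and whether it is actually correct.
# The thumbs up is driven mostly by persuasiveness.
def human_thumbs_up(persuasive, correct):
    signal = 0.9 * persuasive + 0.1 * correct + random.gauss(0, 0.1)
    return 1 if signal > 0.5 else 0

# Collect thumbs-up / thumbs-down labels from the rater.
data = [(p, c, human_thumbs_up(p, c))
        for p, c in ((random.random(), random.random()) for _ in range(2000))]

# Fit a logistic model of "what earns a thumbs up" by gradient descent.
w_p = w_c = b = 0.0
lr, n = 0.5, len(data)
for _ in range(500):
    gp = gc = gb = 0.0
    for p, c, y in data:
        pred = 1 / (1 + math.exp(-(w_p * p + w_c * c + b)))
        err = pred - y
        gp += err * p
        gc += err * c
        gb += err
    w_p, w_c, b = w_p - lr * gp / n, w_c - lr * gc / n, b - lr * gb / n

print(f"learned weight on persuasiveness: {w_p:.2f}")
print(f"learned weight on correctness:    {w_c:.2f}")
# The fitted reward leans almost entirely on persuasiveness, so a policy
# maximizing it is trained to persuade, not to be right.
```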

And so, just like, maybe you can just put me in charge, but I can generalize. I can extrapolate. I can be like, oh, maybe I'm not infallible either.

Maybe if you get something that is smart enough to get me to press thumbs up, it has learned to do that by fooling me and exploiting whatever flaws in myself I am not aware of.

And that ultimately could be summarized as the verifier is broken.

When the verifier is broken, the more powerful suggester just learns to exploit the flaws in the verifier.
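
Taken as an optimization claim, this is the Goodhart's-law / reward-hacking dynamic, and it can be shown in a few lines. The sketch below is entirely hypothetical: a one-dimensional answer space, a true-quality function the suggester never sees, and a verifier whose scoring has a blind spot. The suggester climbs on the verifier's score and ends up in the blind spot, where the score is high and the true quality is low.

```python
import random

random.seed(0)

def true_quality(x):
    # What we actually care about; best answer is at x = 3.
    return -abs(x - 3.0)

def broken_verifier(x):
    # A flawed proxy: tracks true quality, except for an exploitable
    # blind spot that awards a huge bonus far from the real optimum.
    score = -abs(x - 3.0)
    if x > 8.0:          # the "flaw in the verifier"
        score += 10.0
    return score

def optimize(score_fn, steps=5000):
    # A crude suggester: keep any proposal the verifier scores higher.
    best = 0.0
    for _ in range(steps):
        if random.random() < 0.1:
            candidate = random.uniform(-10.0, 10.0)     # occasional bold proposal
        else:
            candidate = best + random.uniform(-0.2, 0.2)  # small local tweak
        if score_fn(candidate) > score_fn(best):
            best = candidate
    return best

x = optimize(broken_verifier)
print(f"suggester settled at x = {x:.2f}")
print(f"verifier score  = {broken_verifier(x):.2f}")   # high
print(f"true quality    = {true_quality(x):.2f}")      # low
```

Note that a suggester restricted to the small local tweaks stalls near the honest optimum at x = 3 and never finds the flaw; it takes the occasional bold proposal, the more powerful search, to reach the exploit.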

I think that you will find great difficulty getting AIs to help you with anything where you cannot tell for sure that the AI is right, once the AI tells you what the AI says is the answer.

For sure, yes, but probabilistically.

Yeah, the probabilistic stuff is a giant wasteland of, you know, Eliezer and Paul Christiano arguing with each other, and EA going like... And that's with, like, two actually trustworthy systems that are not trying to deceive you.

You're talking about the two humans?

Myself and Paul Christiano, yeah.