
Eliezer Yudkowsky

👤 Speaker
1713 total appearances

Podcast Appearances

Lex Fridman Podcast
#368 – Eliezer Yudkowsky: Dangers of AI and the End of Human Civilization

And then they could just exploit any hole.

Yep.

So it could be that the critical moment is not when it is smart enough that everybody's about to fall over dead, but rather when it is smart enough that it can get onto a less controlled GPU cluster, faking the books on what's actually running on that GPU cluster, and start improving itself without humans watching it.

And then it gets smart enough to kill everyone from there, but it wasn't smart enough to kill everyone at the critical moment, when you screwed up, when you needed to have done better by that point or everybody dies.

So the problem is that what you can learn on the weak systems may not generalize to the very strong systems because the strong systems are going to be different in important ways.

Chris Olah's team has been working on mechanistic interpretability: understanding what is going on inside the giant inscrutable matrices of floating-point numbers by taking a telescope to them and figuring out what is going on in there.

Have they made progress?

Yes.

Have they made enough progress?

Well, you can try to quantify this in different ways.

One of the ways I've tried to quantify it is by putting up a prediction market on whether, in 2026, we will have understood anything that goes on inside a giant transformer net that was not known to us in 2006.

Like, we have now understood induction heads in these systems, by dint of much research and great sweat and triumph, which is a thing where if you go AB, AB, AB, it'll be like, oh, I bet that continues AB.

And a bit more complicated than that.

But the point is, we knew about regular expressions in 2006, and these are pretty simple as regular expressions go.

So this is a case where, by dint of great sweat, we understood what is going on inside a transformer, but it's not the thing that makes transformers smart.

It's a kind of thing that we could have built by hand decades earlier.
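To make the induction-head behavior described above concrete, here is a minimal, purely illustrative Python sketch (not from the episode, and not how a transformer actually computes this): it predicts the next token by copying whatever followed the most recent earlier occurrence of the current token, and then shows the same rule as a regular expression, which is indeed simple as regular expressions go.

    import re

    # Illustrative sketch only: the prediction *rule* an induction head
    # is understood to implement, not the mechanism inside the network.
    def induction_predict(tokens):
        """If the current token appeared earlier in the sequence,
        predict the token that followed it."""
        current = tokens[-1]
        # Scan backwards for the most recent earlier occurrence.
        for i in range(len(tokens) - 2, -1, -1):
            if tokens[i] == current:
                return tokens[i + 1]  # copy its successor
        return None  # no precedent: this rule makes no prediction

    print(induction_predict(list("ABABA")))  # -> 'B' (AB, AB, A... continues AB)

    # The same pattern over a character string as a regular expression:
    # some pair "xy" occurs earlier, and the string ends with "x" again,
    # so predict "y".
    m = re.search(r"(.)(.).*\1$", "ABABA")
    print(m.group(2) if m else None)  # -> 'B'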