
Eliezer Yudkowsky

Speaker · 1716 total appearances


Podcast Appearances

Lex Fridman Podcast
#368 – Eliezer Yudkowsky: Dangers of AI and the End of Human Civilization

There's not an air gap on the present methodology.

If they can manipulate the operators or (disjunction) find security holes in the system running them.

I agree that the macro security system has human holes and machine holes.

And then they could just exploit any hole.

Yep.

So it could be that the critical moment is not "when is it smart enough that everybody's about to fall over dead?" but rather "when is it smart enough that it can get onto a less controlled GPU cluster, with it faking the books on what's actually running on that GPU cluster, and start improving itself without humans watching it?"

And then it gets smart enough to kill everyone from there, but it wasn't smart enough to kill everyone at the critical moment, when you screwed up, when you needed to have done better by that point or everybody dies.

So the problem is that what you can learn on the weak systems may not generalize to the very strong systems because the strong systems are going to be different in important ways.

Chris Olah's team has been working on mechanistic interpretability: understanding what is going on inside the giant inscrutable matrices of floating-point numbers by taking a telescope to them and figuring out what is going on in there.

Have they made progress?

Yes.

Have they made enough progress?

Well, you can try to quantify this in different ways.

One of the ways I've tried to quantify it is by putting up a prediction market on whether, in 2026, we will have understood anything that goes on inside a giant transformer net that was not known to us in 2006.

Like, we have now understood induction heads in these systems, by dint of much research and great sweat and triumph. It's a thing where, if you go "AB, AB, AB," it'll be like, "oh, I bet that continues AB."

And it's a bit more complicated than that.
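
The "AB, AB, AB" pattern described here is the canonical induction-head behavior: attend back to an earlier occurrence of the current token and copy whatever followed it. Below is a minimal sketch of that input-output behavior, not of anything from the episode; it deliberately ignores the query-key attention machinery the interpretability research actually reverse-engineered, and the function name is invented for illustration.

```python
# Sketch of the behavior an induction head learns, under the simplified
# description above: find the most recent earlier occurrence of the current
# token and predict the token that followed it. Real induction heads
# implement this with attention over learned representations.

def induction_head_prediction(tokens):
    """Predict the next token by matching on the last token seen."""
    if len(tokens) < 2:
        return None
    last = tokens[-1]
    # Scan backwards through the earlier context for a prior occurrence.
    for i in range(len(tokens) - 2, -1, -1):
        if tokens[i] == last:
            return tokens[i + 1]  # copy the token that followed it before
    return None  # no repeated pattern to latch onto

# "AB, AB, A..." -> the head bets the sequence continues with "B".
print(induction_head_prediction(["A", "B", "A", "B", "A"]))  # -> "B"
```

The interesting result is not the behavior itself, which is trivial to write down as above, but that researchers located the specific attention heads inside trained networks that implement it.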

But the point is, like,