Eliezer Yudkowsky
Because it's not random, but it also doesn't necessarily have room for humans in it.
I suspect that the average member of the audience might have some questions about even whether that's the correct paradigm to think about it and would sort of want to back up a bit, possibly.
Why?
I know what you hope, but...
You know, you can hope that a particular set of winning lottery numbers comes up, and it doesn't make the lottery balls come up that way.
I know you want this to be true, but why would it be true?
This is a science problem.
We are trying to predict what happens with AI systems that...
You tried to optimize to imitate humans, and then you did some RLHF on them, and of course you didn't get perfect alignment, because that's not what happens when you hill-climb toward an outer loss function.
You don't get inner alignment on it.
But yeah, so...
I think that there is... So if you don't mind my taking some slight control of things and steering around to what I think is a good place to start...
I just failed to solve the control problem.
I've lost control of this thing.
Alignment, alignment.
Still aligned.
Control, yeah.
Okay, sure, yeah, you lost control.
But we're still aligned.
Yeah, losing control isn't as bad if you lose control to an aligned system.