Eliezer Yudkowsky

👤 Speaker
1716 total appearances

Podcast Appearances

Lex Fridman Podcast
#368 – Eliezer Yudkowsky: Dangers of AI and the End of Human Civilization

I mean, years ago, about 20 years, 15 years, something like that, I was talking to a congressperson who had become alarmed about the eventual prospects, and he wanted work on building AIs without emotions, because the emotional AIs were the scary ones, you see.

And some poor person at ARPA had come up with a research proposal whereby this congressman's panic and desire to fund this thing would go into something that the person at ARPA thought would be useful, and had been munched around to where it would sound to the congressman like work was happening on this, which, you know, of course, the congressperson had misunderstood the problem and did not understand where the danger came from.

And...

The issue is that you could do this in a certain precise way and maybe get something.

When I say put up prizes on interpretability, I'm like, well, because it's verifiable there as opposed to other places, you can tell whether or not good work actually happened.

In this exact narrow case, if you do things in exactly the right way, you can maybe throw money at it and produce science instead of anti-science and nonsense.

And all the methods that I know of trying to throw money at this problem share this property of, well, if you do it exactly right, based on understanding exactly what tends to produce useful outputs or not, then you can add money to it in this way.

And the thing that I'm giving as an example here in front of this large audience is the most understandable of those.

Because there's other people like Chris Olah, and even more generally, you can tell whether or not interpretability progress has occurred.

So if I say throw money at producing more interpretability, there's a chance somebody can do it that way, and it will actually produce useful results.

Then the other stuff just blurs off into being harder to target exactly than that.

It looks like: we took a much smaller set of transformer layers than the ones in the modern, bleeding-edge, state-of-the-art systems.

And after applying various tools and mathematical ideas and trying 20 different things, we have shown that this piece of the system is doing this kind of useful work.
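
The experiment gestured at in that quote, take a small stack of transformer layers and show that a particular piece is doing identifiable work, can be made concrete with an ablation test. Below is a minimal sketch in PyTorch, assuming a toy, randomly initialized encoder; the sizes, the choice of layer 2, and the skip-ablation are illustrative assumptions, not anything from the episode or from any particular lab's method:

```python
# Minimal ablation sketch: knock out one transformer layer and measure
# how much the final output changes. Everything here is illustrative.
import torch
import torch.nn as nn

torch.manual_seed(0)

# A deliberately small stack of transformer layers, as in the quote.
d_model = 64
layer = nn.TransformerEncoderLayer(d_model=d_model, nhead=4, batch_first=True)
model = nn.TransformerEncoder(layer, num_layers=4, enable_nested_tensor=False)
model.eval()

x = torch.randn(1, 16, d_model)  # one batch of 16 token embeddings

with torch.no_grad():
    baseline = model(x)

# Ablate layer 2 by returning its input unchanged (i.e., skip the layer):
# a forward hook that returns a value replaces the module's output.
def skip_layer(module, inputs, output):
    return inputs[0]

handle = model.layers[2].register_forward_hook(skip_layer)
with torch.no_grad():
    ablated = model(x)
handle.remove()

# A large relative drift suggests the ablated piece was doing real work.
drift = ((baseline - ablated).norm() / baseline.norm()).item()
print(f"relative output change from ablating layer 2: {drift:.3f}")
```

The design choice connects to the verifiability point above: the drift number is a concrete, checkable artifact, so a third party can rerun the test and confirm whether the claimed piece of the system actually mattered.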

You can hope.

And it's probably true.

Like you would not expect the smaller tricks to go away.