Eliezer Yudkowsky

👤 Person
1713 total appearances

Appearances Over Time

Podcast Appearances

Lex Fridman Podcast
#368 – Eliezer Yudkowsky: Dangers of AI and the End of Human Civilization

I think that if, you know, like half of today's physicists stop wasting their lives on string theory or whatever and go off and study what goes on inside transformer networks, then in, you know, like 30, 40 years, we'd probably have a pretty good idea.

Do you think these large language models can reason?

They can play chess.

How are they doing that without reasoning?

I mean, in my writings on rationality, I have not gone around making a big deal out of something called reason.

I have made more of a big deal out of something called probability theory.

And that's like, well, your reasoning...

But you're not doing it quite right, and you should reason this way instead.

And interestingly, people have started to get preliminary results showing that reinforcement learning from human feedback has made the GPT series worse in some ways.

In particular, it used to be well-calibrated.

If you trained it to put probabilities on things, it would say 80% probability and be right 8 times out of 10.
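
An aside, not from the episode: "well-calibrated" has a concrete, checkable meaning. Below is a minimal Python sketch of the standard check, on invented data: bucket the model's stated probabilities and compare each bucket's average stated confidence to the empirical frequency of correct answers.

```python
import random

random.seed(0)

# Hypothetical data: (stated probability, was the answer correct) pairs.
# A real evaluation would grade the model against ground truth; here
# correctness is simulated to match the stated probability exactly.
predictions = []
for _ in range(10_000):
    p = random.random()                      # probability the model states
    predictions.append((p, random.random() < p))

# Bucket by stated probability, then compare stated vs. empirical accuracy.
bins = {}  # bin index -> (count, number correct, sum of stated probabilities)
for stated, correct in predictions:
    b = min(int(stated * 10), 9)             # 10 equal-width bins
    count, right, total = bins.get(b, (0, 0, 0.0))
    bins[b] = (count + 1, right + correct, total + stated)

for b in sorted(bins):
    count, right, total = bins[b]
    print(f"stated ~{total / count:.2f} -> correct {right / count:.2f} (n={count})")

# A well-calibrated model prints matching columns: when it says 0.80,
# it is right about 8 times out of 10.
```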

And if you apply reinforcement learning from human feedback, the nice graph of 70%, 7 out of 10, sort of flattens out into the graph that humans use, where there's some very improbable stuff, and likely, probable, maybe, which all means like around 40%, and then certain.

So it's like it used to be able to use probabilities, but if you try to teach it to talk in a way that satisfies humans, it gets worse at probability in the same way that humans are.

And that's a bug, not a feature.

I would call it a bug.

Although such a fascinating bug.
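
An aside on that flattening, again a sketch on invented data rather than anything measured on GPT: push the same calibrated probabilities through coarse human-style confidence words ("very improbable", "likely, probable, maybe" at around 40%, "certain") and a simple calibration score gets worse. The bucket boundaries here are made up for illustration.

```python
import random

random.seed(0)

# The same invented, perfectly calibrated predictions as in the sketch above.
predictions = []
for _ in range(10_000):
    p = random.random()
    predictions.append((p, random.random() < p))

def humanize(p: float) -> float:
    """Collapse a probability into the coarse verbal buckets from the quote."""
    if p < 0.15:
        return 0.05   # "very improbable"
    if p < 0.85:
        return 0.40   # "likely, probable, maybe" -- all around 40%
    return 0.99       # "certain"

def expected_calibration_error(preds, n_bins=10):
    """Average gap between stated confidence and empirical accuracy."""
    bins = [(0, 0, 0.0)] * n_bins
    for stated, correct in preds:
        b = min(int(stated * n_bins), n_bins - 1)
        count, right, total = bins[b]
        bins[b] = (count + 1, right + correct, total + stated)
    n = len(preds)
    return sum(count / n * abs(right / count - total / count)
               for count, right, total in bins if count)

flattened = [(humanize(p), correct) for p, correct in predictions]
print("calibration error, raw probabilities:", round(expected_calibration_error(predictions), 3))
print("calibration error, verbal buckets:  ", round(expected_calibration_error(flattened), 3))
```

In this toy setup the flattened version scores several times worse: the "maybe" bucket is right about half the time while claiming 40%, and "certain" is wrong far more often than 1%.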

But yeah, so, reasoning: it's doing pretty well on various tests that people used to say would require reasoning.

But, you know, rationality is about when you say 80%, does it happen eight times out of 10?