Dwarkesh Patel

Speaker
15787 total appearances
Voice ID

Voice Profile Active

This person's voice can be automatically recognized across podcast episodes using AI voice matching.

Voice samples: 4
Confidence: High

Podcast Appearances

Dwarkesh Podcast
Richard Sutton – Father of RL thinks LLMs are a dead-end

It's not a goal about the external world.

I guess maybe the bigger question I want to understand is why you don't think doing RL on top of LLMs is a productive direction.

Because we seem to be able to give these models the goal of solving difficult math problems.

And they're in many ways at the very peaks of human level in the capacity to solve Math Olympiad-type problems, right?

They got gold at IMO.

So it seems like the model which got gold at the International Math Olympiad does have the goal of getting math problems right.

So why can't we extend this to different domains?

Right.

So, I mean, it's interesting because you wrote this essay in 2019 titled The Bitter Lesson, and this is the most influential essay perhaps in the history of AI, but people have used that as a justification for scaling up LLMs, because in their view, this is the one scalable way we have found to pour ungodly amounts of compute into learning about the world.

And so it's interesting that your perspective is that the LLMs are actually not Bitter Lesson-pilled.

I guess that doesn't seem like the crux to me because I think those people would also agree that the overwhelming amount of compute in the future will come from learning from experience.

They just think that the scaffold or the basis of that, the thing you'll start with in order to pour in the compute to do this future experiential learning or on-the-job learning, will be LLMs.

And so I guess I still don't understand why this is the wrong starting point altogether.

Why we need a whole new architecture to begin doing experiential continual learning and why we can't start with LLMs to do that.

Maybe it's interesting to compare this to humans.

So in both the case of learning from imitation versus experience and on the question of goals, I think there's some interesting analogies.

So, you know, kids will initially learn from imitation.

You don't think so?