Dwarkesh Patel

Speaker
15656 total appearances

Podcast Appearances

Dwarkesh Podcast
Richard Sutton – Father of RL thinks LLMs are a dead-end

Conceptually, what are we missing in terms of thinking about AI from the RL perspective?

Huh.

I guess you would think that to emulate the trillions of tokens in the corpus of internet text, you would have to build a world model.

In fact, these models do seem to have very robust world models, and they're the best world models we've made to date in AI, right?

So what do you think that's missing?

Great.

Yeah.

Right.

I guess maybe the crux, and I'm curious if you disagree with this, is some people will say, okay, so...

This imitation learning has given us a good prior, given these models a good prior, of reasonable ways to approach problems.

And as we move towards the era of experience, as you call it, this prior is going to be the basis on which we teach these models from experience because this gives them the opportunity to get answers right some of the time.

And then on this, you can build, you can train them on experience.

Do you agree with that perspective?

I mean, I think they do.

You can literally ask them, what would you anticipate a user might say in response?

And they have a prediction.

Yeah.

Yeah.

So I think a capability like this does exist in context.

So it's interesting to watch a model do chain of thought, and then suppose it's trying to solve a math problem.