Dwarkesh Patel

Speaker
15656 total appearances
Voice ID

Voice Profile Active

This person's voice can be automatically recognized across podcast episodes using AI voice matching.

Voice samples: 1
Confidence: Medium

Podcast Appearances

Dwarkesh Podcast
Some thoughts on the Sutton interview

With LLMs, we're going the opposite way.

We have first made this base model that does pure imitation learning, and then we're hoping that we do enough RL on it to make a coherent agent with goals and self-awareness.

Maybe this won't work.

But I don't think these super first-principles arguments about, for example, how these LLMs don't have a true world model are actually proving much.

And I also don't think they're strictly accurate for the models we have today, which are actually undergoing a lot of RL on ground truth.

Even if Sutton's platonic ideal doesn't end up being the path to the first AGI,

his first-principles critique is identifying some genuine basic gaps that these models have.

And we don't even notice them because they're so pervasive in the current paradigm, but because he has this decades-long perspective, they're obvious to him.

It's the lack of continual learning.

It's the abysmal sample efficiency of these models.

It's their dependence on exhaustible human data.

If the LLMs do get to AGI first, which is what I expect to happen, the successor systems that they build will almost certainly be based on Richard's vision.

Dwarkesh Podcast
Richard Sutton – Father of RL thinks LLMs are a dead-end

Today, I'm chatting with Richard Sutton, who is one of the founding fathers of reinforcement learning and inventor of many of the main techniques used there, like TD learning and policy gradient methods.

And for that, he received this year's Turing Award, which, if you don't know, is basically the Nobel Prize for Computer Science.

Richard, congratulations.

Thank you, Dwarkesh.

And thanks for coming on the podcast.

It's my pleasure.

Okay, so first question is,

My audience and I are familiar with the LLM way of thinking about AI.