Dwarkesh Patel

👤 Speaker

15656 total appearances

Podcast Appearances

Dwarkesh Podcast

Some thoughts on the Sutton interview

With LLMs, we're going the opposite way.

639.703 View full episode →

Dwarkesh Podcast

Some thoughts on the Sutton interview

We have first made this base model that does pure imitation learning, and then we're hoping that we do enough RL on it to make a coherent agent with goals and self-awareness.

641.567 View full episode →

Dwarkesh Podcast

Some thoughts on the Sutton interview

Maybe this won't work.

650.967 View full episode →

Dwarkesh Podcast

Some thoughts on the Sutton interview

But I don't think these super first-principles arguments about, for example, how these LLMs don't have a true world model are actually proving much.

652.409 View full episode →

Dwarkesh Podcast

Some thoughts on the Sutton interview

And I also don't think they're strictly accurate for the models we have today, which are actually undergoing a lot of RL on ground truth.

659.94 View full episode →

Dwarkesh Podcast

Some thoughts on the Sutton interview

Even if Sutton's platonic ideal doesn't end up being the path to the first AGI,

666.529 View full episode →

Dwarkesh Podcast

Some thoughts on the Sutton interview

his first-principles critique is identifying some genuine basic gaps that these models have.

670.795 View full episode →

Dwarkesh Podcast

Some thoughts on the Sutton interview

And we don't even notice them because they're so pervasive in the current paradigm, but because he has this decades-long perspective, they're obvious to him.

676.143 View full episode →

Dwarkesh Podcast

Some thoughts on the Sutton interview

It's the lack of continual learning.

683.193 View full episode →

Dwarkesh Podcast

Some thoughts on the Sutton interview

It's the abysmal sample efficiency of these models.

684.615 View full episode →

Dwarkesh Podcast

Some thoughts on the Sutton interview

It's their dependence on exhaustible human data.

687.079 View full episode →

Dwarkesh Podcast

Some thoughts on the Sutton interview

If the LLMs do get to AGI first, which is what I expect to happen, the successor systems that they build will almost certainly be based on Richard's vision.

690.264 View full episode →

Dwarkesh Podcast

Richard Sutton – Father of RL thinks LLMs are a dead-end

Today, I'm chatting with Richard Sutton, who is one of the founding fathers of reinforcement learning and inventor of many of the main techniques used there, like TD learning and policy gradient methods.

0.031 View full episode →

Dwarkesh Podcast

Richard Sutton – Father of RL thinks LLMs are a dead-end

And for that, he received this year's Turing Award, which, if you don't know, is basically the Nobel Prize for Computer Science.

10.59 View full episode →

Dwarkesh Podcast

Richard Sutton – Father of RL thinks LLMs are a dead-end

Richard, congratulations.

16.681 View full episode →

Dwarkesh Podcast

Richard Sutton – Father of RL thinks LLMs are a dead-end

Thank you, Dwarkesh.