Dwarkesh Patel

Speaker
15787 total appearances
Voice ID

Voice Profile Active

This person's voice can be automatically recognized across podcast episodes using AI voice matching.

Voice samples: 4
Confidence: High

Podcast Appearances

Dwarkesh Podcast
Richard Sutton – Father of RL thinks LLMs are a dead-end

It sounds like you're saying that when we do have generalization in these models, that is a result of some sculpted... Humans did it.

Yeah.

I'm not trying to kickstart this initial crux again, but I'm just genuinely curious because I think I might be using the term differently.

I mean, one way to think about it is these LLMs are increasing the scope of generalization from like earlier systems, which could not really even do a basic math problem, to now they can do anything in this class of math Olympiad-type problems, right?

So you initially start with like they can generalize among addition problems, at least.

Then you generalize to like they can generalize among like problems that require use of different kinds of mathematical techniques and theorems and conceptual categories, which is like what the math Olympiad requires.

And so it sounds like you don't think of being able to solve any problem within that category necessarily as an example of generalization?

Or let me know if I'm misunderstanding that.

My understanding is that this is working better and better with coding agents.

So engineers, obviously, if you're trying to program a library, there's many different ways you could achieve the end spec.

And an initial frustration with these models has been that they'll do it in a way that's sloppy.

And then over time, they're getting better and better at coming up with the design architecture and the abstractions that developers find more satisfying.

And it seems an example of what you're talking about.

So to prep for this interview, I wanted to understand the full history of RL, starting with REINFORCE up to current techniques like GRPO.

And I didn't just want a list of equations and algorithms.

I wanted to really understand each change in this progression and the underlying motivation.

You know, what was the main problem that each successive method was actually trying to solve?

So I had Gemini Deep Research walk me through this entire timeline step by step.