Dwarkesh Patel

Speaker

15787 total appearances

Voice ID

Voice Profile Active

This person's voice can be automatically recognized across podcast episodes using AI voice matching.

Voice samples: 4

Confidence: High

Podcast Appearances

Dwarkesh Podcast

Richard Sutton – Father of RL thinks LLMs are a dead-end

It sounds like you're saying that when we do have generalization in these models, that is a result of some sculpted... Humans did it.

2188.242

Dwarkesh Podcast

Richard Sutton – Father of RL thinks LLMs are a dead-end

Yeah.

2201.119

Dwarkesh Podcast

Richard Sutton – Father of RL thinks LLMs are a dead-end

I'm not trying to kickstart this initial crux again, but I'm just genuinely curious because I think I might be using the term differently.

2258.902

Dwarkesh Podcast

Richard Sutton – Father of RL thinks LLMs are a dead-end

I mean, one way to think about it is these LLMs are increasing the scope of generalization from, like, earlier systems, which could not really even do a basic math problem, to now, where they can do anything in this class of Math Olympiad-type problems, right?

2267.436

Dwarkesh Podcast

Richard Sutton – Father of RL thinks LLMs are a dead-end

So you initially start with, like, they can generalize among addition problems, at least.

2282.56

Dwarkesh Podcast

Richard Sutton – Father of RL thinks LLMs are a dead-end

Then you generalize to, like, they can generalize among problems that require use of different kinds of mathematical techniques and theorems and conceptual categories, which is what the Math Olympiad requires.

2286.848

Dwarkesh Podcast

Richard Sutton – Father of RL thinks LLMs are a dead-end

And so it sounds like you don't think of being able to solve any problem within that category necessarily

2300.713

Dwarkesh Podcast

Richard Sutton – Father of RL thinks LLMs are a dead-end

as an example of generalization?

2306.523

Dwarkesh Podcast

Richard Sutton – Father of RL thinks LLMs are a dead-end

Or let me know if I'm misunderstanding that.

2308.205

Dwarkesh Podcast

Richard Sutton – Father of RL thinks LLMs are a dead-end

My understanding is that this is working better and better with coding agents.

2361.571

Dwarkesh Podcast

Richard Sutton – Father of RL thinks LLMs are a dead-end

So engineers, obviously, if you're trying to program a library,

2368.099

Dwarkesh Podcast

Richard Sutton – Father of RL thinks LLMs are a dead-end

There's many different ways you could achieve the end spec.

2372.926

Dwarkesh Podcast

Richard Sutton – Father of RL thinks LLMs are a dead-end

And an initial frustration with these models has been that they'll do it in a way that's sloppy.

2375.851

Dwarkesh Podcast

Richard Sutton – Father of RL thinks LLMs are a dead-end

And then over time, they're getting better and better at coming up with the design architecture and the abstractions that developers find more satisfying.

2380.94

Dwarkesh Podcast

Richard Sutton – Father of RL thinks LLMs are a dead-end

And it seems an example of what you're talking about.

2389.695

Dwarkesh Podcast

Richard Sutton – Father of RL thinks LLMs are a dead-end

So to prep for this interview, I wanted to understand the full history of RL, starting with REINFORCE up to current techniques like GRPO.

2427.418

Dwarkesh Podcast

Richard Sutton – Father of RL thinks LLMs are a dead-end

And I didn't just want a list of equations and algorithms.