Dwarkesh Patel

Speaker
15787 total appearances

Podcast Appearances

Dwarkesh Podcast
Richard Sutton – Father of RL thinks LLMs are a dead-end

So it's performing the task people want, but at the same time, it's learning about the world from doing that task.

And do you imagine, okay, so we get rid of this paradigm where there's training periods and then there's deployment periods.

But then do we also get rid of this paradigm when there's the model and then instances of the model or copies of the model that are doing certain things?

How do you think about the fact that we'd want this thing to be doing different things?

We'd want to aggregate the knowledge that it's gaining from doing those different things.

I agree that the kind of thing you're talking about is necessary regardless of whether you start from LLMs or not, right?

If you want human or animal level intelligence, you're going to need this capability.

Suppose a human is trying to make a startup, right?

And this is a thing which has a reward on the order of 10 years.

Once in 10 years, you might have an exit where you get paid out a billion dollars.

But humans have this ability to make intermediate auxiliary rewards, or have some way of, even when they have extremely sparse rewards, still making intermediate steps, having an understanding of how the next thing they're doing leads to this grander goal they have.

And so how do you imagine such a process might play out with AIs?

Right. And then you also want some ability for information that you're learning. I mean, one of the things that makes humans quite different from these LLMs is that if you're onboarding on a job, you're picking up so much context and information, and that's what makes you useful at the job, right? You're, uh, everything from how your client has preferences to how the company works to everything.

And is the bandwidth of information that you get from a procedure like TD learning high enough to have this huge pipe of context and tacit knowledge that you'd need to be picking up in the way humans do when they're just deployed?

Yeah.

So it seems to me you need two things.

One is some way of converting this long-run goal reward into smaller auxiliary rewards, or, you know, these predictive rewards of the future reward, or at least the final reward.
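
As a rough illustration of turning a single distant reward into step-by-step predictions, here is a minimal TD(0) sketch. Everything in it is an assumption made for illustration and is not taken from the conversation: the toy ten-state chain, the single reward of 1.0 at the end, the function name `td0_chain`, and the values of `alpha` and `gamma`.

```python
def td0_chain(n_states=10, episodes=500, alpha=0.1, gamma=0.9):
    """Estimate state values with TD(0) on a toy chain (hypothetical setup).

    The agent walks from state 0 to state n_states - 1 and receives a
    reward of 1.0 only on the final step; every other step pays 0.
    """
    # One extra slot for the terminal state, whose value stays at 0.
    values = [0.0] * (n_states + 1)
    for _ in range(episodes):
        for s in range(n_states):
            s_next = s + 1
            reward = 1.0 if s_next == n_states else 0.0
            # TD error: how much better or worse the next step looks
            # than the current prediction said it would.
            td_error = reward + gamma * values[s_next] - values[s]
            values[s] += alpha * td_error
    return values[:n_states]


values = td0_chain()
for s, v in enumerate(values):
    print(f"state {s}: predicted return {v:.3f}")
```

In this sketch the learned value of each early state ends up predicting the discounted final reward, so each step toward the goal shows up as a rise in predicted return long before the reward itself arrives.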

Then you need some other way.

Initially, it seems to me you need some way of then, OK, I need to hold on to all this context that I'm gaining as I'm working in the world, right?