Richard Sutton

👤 Person
505 total appearances

Podcast Appearances

Dwarkesh Podcast
Richard Sutton – Father of RL thinks LLMs are a dead-end

the instances.

And there would be lots of possibilities for doing that.

Like there is not today.

You can't have one child grow up and learn about the world and then every new child has to repeat that process.

Whereas with AIs, with the digital intelligence, you could hope to do it once and then copy it into the next one as a starting place.

So this would be a huge savings and I think actually it would be much more important than trying to learn from people.

So this is something we know very well, and the basis of it is temporal difference learning, where the same thing happens on a less grandiose scale, like when you learn to play chess.

The long-term goal is winning the game, and yet you want to be able to learn from shorter-term things, like taking your opponent's pieces.

And so you do that by having a value function, which predicts the long-term outcome.

And then if you take the guy's pieces, well, your prediction about the long-term outcome is changed.

It goes up.

You think you're going to win.

And then that increase in your belief immediately, quote, reinforces the move that led to taking the piece. Okay, so we have this long-term, 10-year goal of making a startup and making a lot of money, and so when we make progress we say, "Oh, I'm more likely to achieve the long-term goal," and that rewards the steps along the way.
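
Editor's note: the chess description above can be illustrated with a minimal tabular TD(0) sketch. This is not code from the episode. The state names, the reward of 1.0 for winning, and the step size `alpha=0.5` are all assumptions chosen for illustration. It shows how, once the value of a "captured a piece" position has risen, the move leading to it receives a positive learning signal even though that move itself earns no reward.

```python
def td0_update(values, state, reward, next_state, alpha=0.5, gamma=1.0):
    """Apply one TD(0) update to values[state] and return the TD error."""
    # TD error: how much the prediction of the long-term outcome just changed.
    td_error = reward + gamma * values[next_state] - values[state]
    values[state] += alpha * td_error
    return td_error


def run_episode(values, trajectory, alpha=0.5, gamma=1.0):
    """Replay (state, reward, next_state) transitions, updating values in place."""
    return [td0_update(values, s, r, s2, alpha, gamma) for s, r, s2 in trajectory]


# Hypothetical game: the only reward is 1.0, received on the winning transition.
trajectory = [
    ("opening", 0.0, "captured_piece"),
    ("captured_piece", 1.0, "won"),
]
# "won" is terminal, so its value stays at 0.0.
values = {"opening": 0.0, "captured_piece": 0.0, "won": 0.0}

first = run_episode(values, trajectory)   # only the final step learns anything
second = run_episode(values, trajectory)  # now the capture itself looks good
print(first, second, values)
```

In the second episode the transition from `opening` to `captured_piece` has a TD error of 0.5 despite a reward of 0.0: the rise in predicted outcome is what reinforces the earlier move.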

I think the crux of this, and I'm not sure, but the big world hypothesis seems very relevant. The reason why humans become useful on their job is because they are encountering the particular part of the world, and it can't have been anticipated, and it can't all have been put in in advance.

The world is so huge that you can't... The dream of large language models, as I see it, is that you can teach the agent everything, and it will know everything and won't have to learn anything online, during its life. Your examples are all, "Well, really you have to," because there's a lot you can teach it, but there are all the little idiosyncrasies of the particular life they're leading and the particular people they're working with and what they like, as opposed to what average people like. And so that's just saying the world is really big, and so you're going to have to learn it along the way.

So I would say you're just doing regular learning.

Maybe using context, because in large language models, all that information has to go into the context window.