Richard Sutton
Okay, so those two things.
And then there's also the perception component, which is the construction of your state representation, your sense of where you are now.
And the fourth one is what we're really getting at, most transparently anyway.
The fourth one is the transition model of the world.
That's why I am uncomfortable just calling everything models, because I want to talk about the model of the world.
The transition model of the world: your belief that if you do this, what will happen, what will be the consequences of what you do. So, your physics of the world. But it's not just physics, it's also abstract models, like your model of how you traveled from California up to Edmonton for this podcast. That was a model, and that's a transition model, and that would be learned. And it's not learned from reward; it's learned from: you did things, you saw what happened. Yeah, you made that model of the world.
That will be learned very richly from all the sensation that you receive, not just from the reward.
It has to include the reward as well, but that's a small part of the whole model, small crucial part of the whole model.
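The point that a transition model is learned from observed consequences rather than from reward can be sketched in code. This is a hypothetical minimal illustration, not Sutton's own formulation: a tabular model that counts observed (state, action, next-state) transitions and predicts outcome probabilities. The class name, state labels, and methods here are all illustrative assumptions.

```python
from collections import defaultdict

class TransitionModel:
    """Hypothetical sketch: a tabular transition model learned
    purely from experienced transitions -- no reward signal."""

    def __init__(self):
        # counts[(s, a)][s_next] = times s_next followed doing a in s
        self.counts = defaultdict(lambda: defaultdict(int))

    def observe(self, s, a, s_next):
        """Update the model from one experienced transition:
        you did something, you saw what happened."""
        self.counts[(s, a)][s_next] += 1

    def predict(self, s, a):
        """Return the estimated distribution P(s_next | s, a)."""
        outcomes = self.counts[(s, a)]
        total = sum(outcomes.values())
        if total == 0:
            return {}
        return {s_next: n / total for s_next, n in outcomes.items()}

model = TransitionModel()
model.observe("California", "travel_north", "Edmonton")
model.observe("California", "travel_north", "Edmonton")
print(model.predict("California", "travel_north"))
```

Nothing in the update rule touches a reward: the model is shaped entirely by which sensations followed which actions, which is the contrast being drawn with reward-driven learning of policies and value functions.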
The idea is totally general.
I use, all the time, as my canonical example, the idea that an AI agent is like a person.
And people, in some sense, they have just one world they live in.
And that world may involve chess and it may involve Atari games.
But those are not different tasks or different worlds.
Those are just different states that they encounter.
Right.
And so the general idea is not limited at all.
That's just how they set it up.
It was not their ambition to have one agent across those games.
If we want to talk about transfer, we should talk about transfer, not across games or across tasks, but transfer between states.
We're not seeing transfer anywhere.