Richard Sutton

👤 Person
505 total appearances

Podcast Appearances

Dwarkesh Podcast
Richard Sutton – Father of RL thinks LLMs are a dead-end

The content of the knowledge is statements about the stream.

And so because it's a statement about the stream, you can test it by comparing it to the stream and you can learn it continually.
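One way to picture "test it by comparing it to the stream" is a prediction that is checked against each new observation and nudged by the error. The following is only a minimal sketch under stated assumptions: the function name `learn_from_stream`, the scalar observations, and the fixed `step_size` are all hypothetical illustrations, not anything described in the episode.

```python
def learn_from_stream(stream, step_size=0.1):
    """Maintain one prediction about the stream and update it online.

    Hypothetical illustration: the 'statement about the stream' is a single
    number predicting the next observation.
    """
    prediction = 0.0
    errors = []
    for observation in stream:
        # Test the statement: compare the prediction with what actually arrived.
        error = observation - prediction
        errors.append(abs(error))
        # Learn continually: move the prediction a small step toward the observation.
        prediction += step_size * error
    return prediction, errors


# A constant stream: the prediction error shrinks as experience accumulates.
prediction, errors = learn_from_stream([1.0] * 50)
```

Nothing here is stored as a dataset or labeled by anyone; the stream itself supplies the check, which is why the update can run for as long as the stream does.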

So when you're imagining this future continual learning agent.

They're not future.

Of course, they exist all the time.

This is what the reinforcement learning paradigm is: learning from experience.

The reward function is arbitrary.

And so if you're playing chess, it's to win the game of chess.

If you're a squirrel, maybe the reward has to do with getting nuts.

Right.

In general, for an animal, you would say the reward is to avoid pain and to acquire pleasure.

Right.

And I think there should also be a component having to do with your increasing understanding of your environment.

That would be sort of an intrinsic motivation.
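The idea of a reward with an added "understanding" component can be sketched as a task reward plus a bonus for reduced prediction error. This is an assumption-laden illustration, not a formulation given in the episode: `total_reward`, `curiosity_weight`, and the use of a prediction-error drop as a proxy for understanding are all hypothetical choices.

```python
def total_reward(task_reward, error_before, error_after, curiosity_weight=0.5):
    """Combine an arbitrary task reward with an intrinsic bonus.

    Hypothetical proxy: 'increasing understanding of your environment' is
    measured as the drop in prediction error caused by the latest experience.
    """
    # Only improvements count; getting worse at predicting earns no bonus.
    understanding_gain = max(0.0, error_before - error_after)
    return task_reward + curiosity_weight * understanding_gain


# Task reward of 1.0 (say, a nut found), with prediction error falling from 0.5 to 0.25.
reward = total_reward(1.0, error_before=0.5, error_after=0.25)
```

The task term stays arbitrary (winning at chess, finding nuts), while the intrinsic term is the same regardless of task.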

I don't like the word model when used the way you just did.

I think a better word would be the network.

So I think you mean the network.

Maybe there are many networks.

So anyway, things would be learned and then you'd have copies and many instances.

And sure, you'd want to share knowledge across all.