Dwarkesh Patel

Speaker
15787 total appearances
Voice ID

Voice Profile Active

This person's voice can be automatically recognized across podcast episodes using AI voice matching.

Voice samples: 4
Confidence: High

Podcast Appearances

Dwarkesh Podcast
Richard Sutton – Father of RL thinks LLMs are a dead-end

I'm like learning about my clients, my company, all this information.

Yeah.

Yeah.

So...

The question I'm trying to ask is, you need some way of getting, like, how many bits per second are you picking up?

Like, is a human picking up when they're, you know, out in the world, right?

If you're just, like, interacting over Slack with your clients and everything.

Yeah.

So what is the learning process which helps you capture that information?

Yeah.

One of my friends, Toby Ord, pointed out that if you look at the MuZero models that Google DeepMind deployed to learn Atari games, these models were initially not a general intelligence itself, but a general framework for training specialized intelligences to play specific games.

That is to say that you couldn't, using that framework, train a policy to play both chess and Go and some other game.

You had to train each one in a specialized way.

And he was wondering whether that implies that with reinforcement learning generally, because of this information constraint, you can only learn one thing at a time (the density of information isn't that high), or whether it was just specific to the way that MuZero was done.

And if it's specific to AlphaZero, what needed to be changed about that approach so that it could be a general learning agent?

So maybe it would be useful to explain what was missing in that architecture or that approach, which this continual learning AGI would have.

Yeah, I guess I'm curious about...

Historically, have we seen the level of transfer using RL techniques that would be needed to build this kind of... Okay, good, good.

Let me paraphrase to make sure that I understood that correctly.