Menu
Sign In Search Podcasts Charts People & Topics Add Podcast API Pricing

Sholto Douglas

👤 Person
1567 total appearances

Appearances Over Time

Podcast Appearances

Dwarkesh Podcast
Is RL + LLMs enough for AGI? — Sholto Douglas & Trenton Bricken

And so maybe the way I would define it now is the thing that's holding them back is if you can give it a good feedback loop for the thing that you want it to do, then it's pretty good at it.

Dwarkesh Podcast
Is RL + LLMs enough for AGI? — Sholto Douglas & Trenton Bricken

If you can't, then they struggle a bit.

Dwarkesh Podcast
Is RL + LLMs enough for AGI? — Sholto Douglas & Trenton Bricken

Yes.

Dwarkesh Podcast
Is RL + LLMs enough for AGI? — Sholto Douglas & Trenton Bricken

So the big thing that really worked over the last year is –

Dwarkesh Podcast
Is RL + LLMs enough for AGI? — Sholto Douglas & Trenton Bricken

Maybe broadly, the domain is called RL from verifiable rewards or something like this, where a clean reward signal.

Dwarkesh Podcast
Is RL + LLMs enough for AGI? — Sholto Douglas & Trenton Bricken

So the initial unhoppling of language models was RL from human feedback, where typically it was something like pairwise feedback or something like this, and the outputs of the models became closer and closer to things that humans wanted.

Dwarkesh Podcast
Is RL + LLMs enough for AGI? — Sholto Douglas & Trenton Bricken

But this doesn't necessarily improve their performance at any level.

Dwarkesh Podcast
Is RL + LLMs enough for AGI? — Sholto Douglas & Trenton Bricken

like difficulty of problem domain, right?

Dwarkesh Podcast
Is RL + LLMs enough for AGI? — Sholto Douglas & Trenton Bricken

Particularly as humans are actually quite bad judges of what a better answer is.

Dwarkesh Podcast
Is RL + LLMs enough for AGI? — Sholto Douglas & Trenton Bricken

Humans have things like length biases and so forth.

Dwarkesh Podcast
Is RL + LLMs enough for AGI? — Sholto Douglas & Trenton Bricken

So you need a signal of whether the model was correct in its output that is...

Dwarkesh Podcast
Is RL + LLMs enough for AGI? — Sholto Douglas & Trenton Bricken

that is like quite true, let's say.

Dwarkesh Podcast
Is RL + LLMs enough for AGI? — Sholto Douglas & Trenton Bricken

And so things like the correct answer to a math problem or unit tests, parsing, this kind of stuff.

Dwarkesh Podcast
Is RL + LLMs enough for AGI? — Sholto Douglas & Trenton Bricken

These are the examples of a reward signal that's very clean, but even these can be hacked, by the way.

Dwarkesh Podcast
Is RL + LLMs enough for AGI? — Sholto Douglas & Trenton Bricken

Like even unit tests, the models find ways around it to like,

Dwarkesh Podcast
Is RL + LLMs enough for AGI? — Sholto Douglas & Trenton Bricken

hack in particular values and hard code values of unit tests if they can figure out what the actual test is doing.

Dwarkesh Podcast
Is RL + LLMs enough for AGI? — Sholto Douglas & Trenton Bricken

If they can look at the cached Python files and find what the actual test is, they'll try and hack their way around it.

Dwarkesh Podcast
Is RL + LLMs enough for AGI? — Sholto Douglas & Trenton Bricken

So, these aren't perfect, but they're much closer.

Dwarkesh Podcast
Is RL + LLMs enough for AGI? — Sholto Douglas & Trenton Bricken

In part because software engineering is...

Dwarkesh Podcast
Is RL + LLMs enough for AGI? — Sholto Douglas & Trenton Bricken

It's very verifiable.