Menu
Sign In Search Podcasts Charts People & Topics Add Podcast API Pricing

Sholto Douglas

👤 Person
1567 total appearances

Appearances Over Time

Podcast Appearances

Dwarkesh Podcast
Is RL + LLMs enough for AGI? — Sholto Douglas & Trenton Bricken

So it's...

Dwarkesh Podcast
Is RL + LLMs enough for AGI? — Sholto Douglas & Trenton Bricken

it's an efficiency question there.

Dwarkesh Podcast
Is RL + LLMs enough for AGI? — Sholto Douglas & Trenton Bricken

Obviously, if you could give a dense reward for every token, if you had a supervised example, then that's one of the best things you could have.

Dwarkesh Podcast
Is RL + LLMs enough for AGI? — Sholto Douglas & Trenton Bricken

But in many cases, it's very expensive to produce all of those scaffolded curriculum of everything to do.

Dwarkesh Podcast
Is RL + LLMs enough for AGI? — Sholto Douglas & Trenton Bricken

Having PhD math students grade students is something which you can only afford for the select category of students that you've chosen to focus in on developing.

Dwarkesh Podcast
Is RL + LLMs enough for AGI? — Sholto Douglas & Trenton Bricken

And you couldn't do that for all the language models in the world.

Dwarkesh Podcast
Is RL + LLMs enough for AGI? — Sholto Douglas & Trenton Bricken

So like first step is obviously that would be better, but

Dwarkesh Podcast
Is RL + LLMs enough for AGI? — Sholto Douglas & Trenton Bricken

you're going to be sort of optimizing this pre-order frontier of how much am I willing to spend on the scaffolding versus how much am I willing to spend on pure compute.

Dwarkesh Podcast
Is RL + LLMs enough for AGI? — Sholto Douglas & Trenton Bricken

Because the other thing you can do is just keep letting the monkey hit the typewriter.

Dwarkesh Podcast
Is RL + LLMs enough for AGI? — Sholto Douglas & Trenton Bricken

And if you have a good enough end reward, then eventually it will find its way.

Dwarkesh Podcast
Is RL + LLMs enough for AGI? — Sholto Douglas & Trenton Bricken

And so I can't really talk about where exactly

Dwarkesh Podcast
Is RL + LLMs enough for AGI? — Sholto Douglas & Trenton Bricken

people sit on that scaffold.

Dwarkesh Podcast
Is RL + LLMs enough for AGI? — Sholto Douglas & Trenton Bricken

I think different people, different tasks are on different points there.

Dwarkesh Podcast
Is RL + LLMs enough for AGI? — Sholto Douglas & Trenton Bricken

And a lot of it depends on how strong your prior over the correct things to do is.

Dwarkesh Podcast
Is RL + LLMs enough for AGI? — Sholto Douglas & Trenton Bricken

But that's the equation you're optimizing.

Dwarkesh Podcast
Is RL + LLMs enough for AGI? — Sholto Douglas & Trenton Bricken

It's like, how much am I willing to burn compute versus how much am I willing to burn dollars on people's time to give scaffolding or give awards.

Dwarkesh Podcast
Is RL + LLMs enough for AGI? — Sholto Douglas & Trenton Bricken

MARK MANDELMANN- Interesting.

Dwarkesh Podcast
Is RL + LLMs enough for AGI? — Sholto Douglas & Trenton Bricken

We are willing to do this for LLMs to some degree.

Dwarkesh Podcast
Is RL + LLMs enough for AGI? — Sholto Douglas & Trenton Bricken

But there's a...

Dwarkesh Podcast
Is RL + LLMs enough for AGI? — Sholto Douglas & Trenton Bricken

There's an equation you're maximizing here.