Sholto Douglas
Because you use more compute as you train on more and more difficult tasks.
I mean, I don't know, your rate of improvement in biology is going to be somewhat bounded by the time it takes a cell to grow in a way that your rate of improvement in math isn't, for example.
So, yes.
But I think for many things we'll be able to parallelize widely enough and get enough iteration loops.
That depends on whether or not you think there's a virtue in pre-training a new architecture.
Basically, if you make some architectural change, then you probably need to do some form of at least retraining a new model.
But there's a trade-off equation here, right?
There is science to do, which everyone is doing, on what the optimal point is at which to do RL.
Because you need something which can both learn and discover the sparse reward itself.
So you don't want a one-parameter model. It's useless, even though you can run it really fast.
You also don't want a 100T model because it's super slow.
MARK MANDELMANN- Yeah, password RL.
And the marginal benefit of its learning efficiency is not worth it.
So there's a pretty big frontier here.
What's the optimal model size given your current class of capabilities, your current set of RL environments, and this kind of stuff?
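As a rough illustration of that frontier, here is a minimal toy sketch; the functional forms and constants are assumptions, not real scaling laws. It assumes a model must clear some capability threshold before it can discover the sparse reward at all, and that each RL iteration costs roughly linearly more as the model grows, then numerically picks the size that maximizes progress per unit of compute.

```python
# Toy sketch (assumptions, not a real scaling law): a model too small can never
# find the sparse reward; a model too big makes every iteration loop expensive.
import numpy as np

def toy_progress_per_budget(n_params, flops_budget=1e24,
                            solvable_at=1e9, sharpness=4.0):
    """Assumed RL progress = P(model can discover the sparse reward)
    x (number of iteration loops the compute budget buys)."""
    x = np.log10(n_params)
    p_discover = 1.0 / (1.0 + np.exp(-sharpness * (x - np.log10(solvable_at))))
    iterations = flops_budget / n_params   # assumed: cost per loop ~ linear in params
    return p_discover * iterations

sizes = np.logspace(6, 14, 400)            # 1M .. 100T parameters
best = sizes[np.argmax(toy_progress_per_budget(sizes))]
print(f"toy optimum under these assumptions: ~{best:.1e} parameters")
```

Under these made-up numbers the optimum lands at an interior point: big enough to reliably find the reward, small enough to keep the iteration loops cheap.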
Yeah, with my total pool of compute, how do I allocate that across training compute, data-generation compute, and inference compute for the RL training?
Yeah.
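As a rough sketch of that allocation question, with made-up costs rather than anything from the conversation: split a fixed pool of FLOPs between generating rollouts (the data/inference side) and taking gradient steps (the training side), assume each step consumes a fixed batch of fresh rollouts, and look for the split where neither side is the bottleneck.

```python
# Toy allocation sketch (all costs are assumptions): progress is limited by
# whichever is scarcer, fresh rollouts or affordable gradient steps.
def toy_best_split(total_flops=1e24, flops_per_rollout=1e12,
                   flops_per_step=1e15, rollouts_per_step=256):
    best = (0.0, 0.0)
    for i in range(1, 100):
        f = i / 100.0                                          # fraction spent on rollouts
        rollouts = f * total_flops / flops_per_rollout
        steps_data_allows = rollouts / rollouts_per_step
        steps_compute_allows = (1 - f) * total_flops / flops_per_step
        steps = min(steps_data_allows, steps_compute_allows)   # the bottleneck
        if steps > best[1]:
            best = (f, steps)
    return best

frac, steps = toy_best_split()
print(f"toy optimum: ~{frac:.0%} of compute on rollouts, ~{steps:.2e} steps")
```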
So I think, once again, it's worth considering the spectrum of possible worlds and preparing yourself for that.
And the sort of action that I think is highest-EV in that case is that, at a minimum, you are about to get dramatically more leverage.