Sholto Douglas

Dwarkesh Podcast

Is RL + LLMs enough for AGI? — Sholto Douglas & Trenton Bricken

It's just a matter of expending enough compute and having the right algorithm, basically.

648.291

You know the parable about when you choose to launch a space mission?

673.778

How you should go further up the tech tree first, because if you launch later on, your ship will go faster, and this kind of stuff?

677.582

I think it's quite similar to that.

684.571

You want to be sure that you've algorithmically got the right thing.

685.492

And then when you bet and you do the large compute spend on the run, then it'll actually pay off.

688.776

You'll have the right compute efficiencies and this kind of stuff.

693.582

And I think RL is slightly different to pre-training in this regard, where RL can be a more iterative thing.

695.585

You're progressively adding capabilities to the base model.

701.851

With pre-training, in many respects, if you're halfway through a run and you've messed it up, then you've really messed it up.

703.772

But I think that's the main reason why: people are still figuring out exactly what they want to do.

712.18

I mean, o1 to o3, OpenAI put in their blog post that it was a 10x compute multiplier over o1.

717.805

So clearly they bet on one level of compute, and they were like, OK, this seems good.

723.85

Let's actually release it.

730.364

Let's get it out there.

731.125

And then they spent the next few months increasing the amount of compute that they spent on that.

732.488

And I expect, as everyone does, that everyone else is scaling up RL right now.

735.815

So I basically don't expect that to be true for very long.

740.685

You literally do have a monkey, and it's making Shakespeare.

828.357

I was just going to say, like, you do need to be able to get reward sometimes in order to learn.

837.751