Sholto Douglas

Speaker
1567 total appearances

Podcast Appearances

Dwarkesh Podcast
Is RL + LLMs enough for AGI? - Sholto Douglas & Trenton Bricken

If you think the tokens are equivalent.

Yeah.

Which, you still get pretty substantial numbers, like, even with your 100 million H100s, and you multiply that by 100, you're starting to, like, get to pretty substantial numbers.

This does mean that those models themselves will be, like, somewhat compute-bound in many respects.

But these are all, like,

These are relatively short-term changes in timelines of progress, basically.

I think, yes, it's highly likely we get dramatically inference-bottlenecked in '27, '28.

The impulse to that will then be, OK, let's just try and churn out as many semiconductors as we possibly can.

There'll be some lag there.

A big part of how fast we can do that will depend on how much people are feeling the AGI in the next two years as they're building out fab capacity.

A lot will depend on how the Taiwan situation is.

Is Taiwan still producing all the fabs, the chips?

Yeah, this is like a bimodal distribution.

Yeah.

A conversation I had with Leopold turned into a section in Situational Awareness called "This Decade or Bust," which is on exactly this topic, which is basically that for the next couple of years, we can dramatically increase our training compute.

And RL is going to be so exciting this year because we can dramatically increase the amount of compute that we apply to it.

And this is also one of the reasons why the gap between, like, say, DeepSeek and o1 was so close at the beginning of the year, because they were able to apply the same amount of compute to the RL process.

And so that compute differential actually will be magnified over the course of this year.

Yeah, they're exactly on the sort of cost curve that you'd expect.

Which is not going to take away from the fact that they're like brilliant engineers and like brilliant researchers who are like, I look at their work and I'm like, ah, like the kindred soul there in the work they're doing.