
Sholto Douglas

👤 Speaker
1567 total appearances

Podcast Appearances

Dwarkesh Podcast
Is RL + LLMs enough for AGI? — Sholto Douglas & Trenton Bricken

MARK MANDELMANN- And constantly move the goalposts.

Now, that being said, one caveat on that is if software engineering is just dramatically better than computer use, I mean, computer use still sucks, then I'd be still like, oh, maybe everyone just kept focusing on software engineering.

It was just by far the most valuable thing.

Every marginal person and dollar went towards software engineering.

I don't think that's the case.

I do think computer use is valuable enough that people will care about it.

But that's my one escape hatch that I'm putting in place for next year.

Oh, like as in if the models didn't get any better?

So one intuition pump is this conversation was had a lot when models were like GPT-2 sized and fine-tuned for various things.

And people would find that the models were dramatically better at things that they were fine-tuned for.

But by the time you get to GPT-4, when it's trained on a wide enough variety of things, with that sort of total compute it generalized very well across all of the individual subtasks, and actually generalized better than smaller fine-tuned models in a way that was extremely useful.

I think right now what we're seeing with RL is pretty much the same story playing out, where there's this jaggedness of things that they're particularly trained at. But as we expand the total amount of compute that we do RL with, you'll start to see the same transition from GPT-2 fine-tunes to GPT-3 and GPT-4: unsupervised meta-learning and generalization across things.

And I think we're already seeing early evidence of this in its ability to generalize reasoning to things.

I think this will be like extremely obvious.

MARK MANDELMANN- Well, I mean, have we ever RL'd the model to be an interp agent?

No.

FRANCESC CAMPOY- I mean, no.

Yeah, exactly.