
Sholto Douglas

👤 Speaker
1567 total appearances

Podcast Appearances

Dwarkesh Podcast
Is RL + LLMs enough for AGI? — Sholto Douglas & Trenton Bricken

MARK MANDELMANN- And constantly move the goalposts.

Now, that being said, one caveat on that is if software engineering is just dramatically better than computer use, I mean, computer use still sucks, then I'd be still like, oh, maybe everyone just kept focusing on software engineering.

It was just by far the most valuable thing.

Every marginal person and dollar went towards software engineering.

I don't think that's the case.

I do think computer use is valuable enough that people will care about it.

But that's my one escape hatch that I'm putting in place for next year.

Oh, like as in if the models didn't get any better?

So one intuition pump is this conversation was had a lot when models were like GPT-2 sized and fine-tuned for various things.

And people would find that the models were dramatically better at things that they were fine-tuned for.

But by the time you get to GPT-4, when it's trained on a wide enough variety of things, with that sort of total compute it generalized very well across all of the individual subtasks, and actually generalized better than smaller fine-tuned models in a way that was extremely useful.

I think right now what we're seeing with RL is pretty much the same story playing out, where there's this jaggedness of things that they're particularly trained at. But as we expand the total amount of compute that we do RL with, you'll start to see the same transition from GPT-2 fine-tunes to GPT-3 and GPT-4: unsupervised meta-learning and generalization across things.

And I think we're already seeing early evidence of this in its ability to generalize reasoning to things.

I think this will be like extremely obvious.

MARK MANDELMANN- Well, I mean, have we ever RL'd the model to be an interp agent?

No.

FRANCESC CAMPOY- I mean, no.

Yeah, exactly.