Sholto Douglas

Is RL + LLMs enough for AGI? — Sholto Douglas & Trenton Bricken

Maybe this is one of the things that the people who were here working on AI research and I think each of the companies is trying to define this for themselves, but it's actually something that broader society can participate in.

2919.693

Dwarkesh Podcast

If you take as a premise that in a few years we're going to have something that's human-level intelligence and you want to imbue that with a certain set of values.

2930.684

What should those values be is a question that everyone should be participating in and offering a perspective on.

2938.273

Like in the Constitutional AI paper, it's not just flourishing.

2954.158

It's like there's a lot of strictures and there's a lot of like dot points there.

2957.685

Yeah.

2962.315

But it's not an easy question.

2963.517

I think at the beginning, the hill to climb.

3060.818

So the reason why people hill climbed Hendrycks MATH for so long was that there's five levels of problem.

3063.822

And it starts off reasonably easy.

3069.349

And so you can both get some initial signal of, are you improving?

3071.512

And then you have this quite continuous signal, which is important.

3075.918

Something like FrontierMath actually

3081.205

only makes sense to introduce after you've got something like Hendrycks MATH, that you can max out Hendrycks MATH, and they go, OK, now it's time for FrontierMath.

3083.6

I think in a lot of these cases, you have to hope for some amount of generator-verifier gap.

3131.415

You need it to be easier to judge "did you just output a million extraneous files?" than it is to...

3136.689

generate solutions in and of itself.
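
[Editor's illustration, not part of the conversation.] The generator-verifier gap mentioned above can be shown with a toy problem. The sketch below uses subset sum as a stand-in task: checking a proposed answer takes a single pass, while finding one by brute force may mean trying many candidates. The function names `verify` and `generate`, the numbers, and the target are all hypothetical choices for illustration, and nothing here reflects how any lab actually builds its verifiers.

```python
from itertools import combinations


def verify(numbers, target, candidate):
    """Cheap check: candidate draws only from `numbers` and sums to `target`."""
    pool = list(numbers)
    for x in candidate:
        if x not in pool:
            return False
        pool.remove(x)  # each available number may be used at most once
    return sum(candidate) == target


def generate(numbers, target):
    """Brute-force search. Returns (solution, candidates_tried)."""
    tried = 0
    for size in range(1, len(numbers) + 1):
        for combo in combinations(numbers, size):
            tried += 1
            if sum(combo) == target:
                return combo, tried
    return None, tried


numbers = [3, 34, 4, 12, 5, 2]
target = 9

solution, tried = generate(numbers, target)
print("solution:", solution, "found after", tried, "candidates")
print("verified in one pass:", verify(numbers, target, solution))
```

Here the verifier stays cheap as the problem grows, while the search space for the generator grows exponentially with the number of items. That asymmetry is what the speaker hopes for when judging an output is easier than producing it.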