Sholto Douglas

Is RL + LLMs enough for AGI? — Sholto Douglas & Trenton Bricken

And an ongoing challenge would be imbuing taste into the models and setting up the right feedback loops such that you can actually do that.

3160.192 View full episode →

Dwarkesh Podcast

Maybe the best public example is actually a paper that OpenAI put out recently, where they judge the answers to medical questions using grading-criteria feedback.

So doctors have posed various questions, and then there are marking criteria, like for a short-answer question in an exam: did the model mention X, Y, Z?

Did it recommend doing this kind of thing?

And they grade the model according to this.

And in this paper, they found that, one, the models are incredible at this.

And two, that the models are sufficient to grade the answers.

Because maybe one good mental model is roughly: if you can construct grading criteria that an everyday person off the street could apply, then the models are probably capable of interpreting those criteria.

If it requires expertise and taste, that's a tougher question.

For example: is this a wonderful piece of art?

That's difficult.
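
The checklist-style grading described earlier (did the answer mention X, did it recommend Y) can be sketched roughly as below. This is only an illustration under stated assumptions, not the method from the paper mentioned above: the `Criterion` type, the `grade` function, the point values, and the example rubric are all hypothetical, and `keyword_judge` is a toy stand-in for the model grader that would, in practice, read each criterion and answer yes or no.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Criterion:
    """One rubric line (hypothetical structure, for illustration only)."""
    description: str            # what a grader is asked to check
    points: int                 # credit awarded if the check passes
    keywords: tuple[str, ...]   # used only by the toy judge below


# A judge answers one yes/no question: does `answer` satisfy `criterion`?
Judge = Callable[[str, Criterion], bool]


def keyword_judge(answer: str, criterion: Criterion) -> bool:
    """Toy stand-in for a model grader: passes if any keyword appears."""
    text = answer.lower()
    return any(keyword in text for keyword in criterion.keywords)


def grade(answer: str, rubric: list[Criterion], judge: Judge) -> float:
    """Return the fraction of available rubric points the answer earns."""
    total = sum(c.points for c in rubric)
    earned = sum(c.points for c in rubric if judge(answer, c))
    return earned / total if total else 0.0


# Made-up rubric and answer, purely for demonstration.
rubric = [
    Criterion("Mentions staying hydrated", 2, ("hydrat", "fluids")),
    Criterion("Recommends seeing a doctor if symptoms persist", 3,
              ("see a doctor", "seek medical")),
]
answer = "Drink plenty of fluids and rest."
score = grade(answer, rubric, keyword_judge)
print(f"rubric score: {score:.2f}")
```

Swapping `keyword_judge` for a function that asks a model each criterion keeps `grade` unchanged. The harder case raised above, criteria that need expertise or taste, is exactly where a simple yes/no judge like this would be expected to struggle.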