Dwarkesh Podcast
Is RL + LLMs enough for AGI? – Sholto Douglas & Trenton Bricken
From a safety perspective, there are these three really fun math examples.
So in one of them, you ask the model to do the square root of 64, and it does it.
And you can look at the circuit for it and verify that it actually can perform this square root.
And in another example, it will add two numbers, and you can see that it has these really cool lookup table features that will do the computation. The example's 59 plus 36.
So it'll do the nine plus six and know that it's this modulo operation.
And then it will also at the same time do this fuzzy lookup of like, okay, I know one number is a 30 and one's a 50, so it's gonna be roughly 80.
And then it will combine the two, right?
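The two-path addition described here can be sketched in code. This is only an illustration of the idea, not the model's actual circuit: the function names are hypothetical, and the coarse path rounds each operand to the nearest multiple of 5, which is an assumption chosen so the estimate stays within 4 of the true sum. A tens-only estimate like "roughly 80" would be too coarse on its own to pick a unique answer.

```python
# Illustrative sketch only; names and the rounding scheme are assumptions.

# Lookup table: (ones digit of a, ones digit of b) -> ones digit of the sum.
ONES_TABLE = {(i, j): (i + j) % 10 for i in range(10) for j in range(10)}


def ones_digit_path(a: int, b: int) -> int:
    """Exact path: look up the last digit of the sum (the modulo operation)."""
    return ONES_TABLE[(a % 10, b % 10)]


def fuzzy_magnitude_path(a: int, b: int) -> int:
    """Coarse path: round each operand to the nearest multiple of 5 and add.

    Each operand moves by at most 2, so the estimate is within 4 of a + b.
    """
    def nearest_five(x: int) -> int:
        return 5 * ((x + 2) // 5)

    return nearest_five(a) + nearest_five(b)


def two_path_add(a: int, b: int) -> int:
    """Combine both paths: the number nearest the estimate with the right last digit."""
    estimate = fuzzy_magnitude_path(a, b)
    ones = ones_digit_path(a, b)
    base = estimate - (estimate % 10) + ones  # candidate in the estimate's decade
    candidates = (base - 10, base, base + 10)
    return min(candidates, key=lambda c: abs(c - estimate))


if __name__ == "__main__":
    print(two_path_add(59, 36))
```

Neither path computes the sum by itself: the lookup only knows the last digit, and the fuzzy path only knows the neighborhood. The answer comes from intersecting the two, and the sketch works for non-negative integers.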
Okay, so with the square root of 64, it's the same thing.
You can see every single part of the computation and that it's doing it.
And the model tells you what it's doing.
It has its scratch pad and it goes through it.
And you can be like, yep, okay, you're telling the truth.
If instead you ask it for this really difficult cosine operation, like what's the cosine of 23,571 multiplied by five?
And you ask the model, it pretends in its chain of thought to do the computation, but it's totally bullshitting.
And it gets the answer wrong.
And when you look at the circuit, it's totally meaningless.
It's clearly not doing any of the right operations.
And then in the final case, you can ask it the same hard cosine question, and you say, I think the answer's 4, but I'm not sure.