Menu
Sign In Search Podcasts Charts People & Topics Add Podcast API Pricing

Trenton Bricken

๐Ÿ‘ค Person
1589 total appearances

Appearances Over Time

Podcast Appearances

Dwarkesh Podcast
Is RL + LLMs enough for AGI? โ€” Sholto Douglas & Trenton Bricken

And this time, the model will go through the same reasoning claiming to do the calculations, and at the end say, you're right, the answer's 4.

Dwarkesh Podcast
Is RL + LLMs enough for AGI? โ€” Sholto Douglas & Trenton Bricken

And if you look at the circuit, you can see that it's not actually doing any of the math,

Dwarkesh Podcast
Is RL + LLMs enough for AGI? โ€” Sholto Douglas & Trenton Bricken

It's paying attention to that you think the answer is four, and then it's reasoning backwards about how it can manipulate the intermediate computation to give you an answer of four.

Dwarkesh Podcast
Is RL + LLMs enough for AGI? โ€” Sholto Douglas & Trenton Bricken

I've done that.

Dwarkesh Podcast
Is RL + LLMs enough for AGI? โ€” Sholto Douglas & Trenton Bricken

Who hasn't?

Dwarkesh Podcast
Is RL + LLMs enough for AGI? โ€” Sholto Douglas & Trenton Bricken

Who hasn't?

Dwarkesh Podcast
Is RL + LLMs enough for AGI? โ€” Sholto Douglas & Trenton Bricken

Totally.

Dwarkesh Podcast
Is RL + LLMs enough for AGI? โ€” Sholto Douglas & Trenton Bricken

So I guess there are a few crazy things here.

Dwarkesh Podcast
Is RL + LLMs enough for AGI? โ€” Sholto Douglas & Trenton Bricken

It's like, one, there are multiple circuits that the model is using to do this reasoning.

Dwarkesh Podcast
Is RL + LLMs enough for AGI? โ€” Sholto Douglas & Trenton Bricken

Two, is that you can actually see if it's doing the reasoning or not.

Dwarkesh Podcast
Is RL + LLMs enough for AGI? โ€” Sholto Douglas & Trenton Bricken

And three, the scratch pad isn't giving you this information.

Dwarkesh Podcast
Is RL + LLMs enough for AGI? โ€” Sholto Douglas & Trenton Bricken

Two fun analogies for you.

Dwarkesh Podcast
Is RL + LLMs enough for AGI? โ€” Sholto Douglas & Trenton Bricken

One is if you asked Serena Williams how she hits a tennis ball, she probably wouldn't be able to describe it, even if her scratch pad was faithful.

Dwarkesh Podcast
Is RL + LLMs enough for AGI? โ€” Sholto Douglas & Trenton Bricken

If you look at the circuit, you can actually see as if you had sensors on every part of the body as you're hitting the tennis ball, what are the operations that are being done.

Dwarkesh Podcast
Is RL + LLMs enough for AGI? โ€” Sholto Douglas & Trenton Bricken

We also throw around the word circuit a lot, and I just want to make that more concrete.

Dwarkesh Podcast
Is RL + LLMs enough for AGI? โ€” Sholto Douglas & Trenton Bricken

So this is features across layers of the model all working in cooperation to perform a task.

Dwarkesh Podcast
Is RL + LLMs enough for AGI? โ€” Sholto Douglas & Trenton Bricken

And so a fun analogy here is you've got the Ocean's Eleven bank heist team in a big crowd of people.

Dwarkesh Podcast
Is RL + LLMs enough for AGI? โ€” Sholto Douglas & Trenton Bricken

The crowd of people is all the different possible features.

Dwarkesh Podcast
Is RL + LLMs enough for AGI? โ€” Sholto Douglas & Trenton Bricken

And you could โ€“ we're trying to pick out in this crowd of people who is on the heist team.

Dwarkesh Podcast
Is RL + LLMs enough for AGI? โ€” Sholto Douglas & Trenton Bricken

and all their different functions that need to come together in order to successfully break into the bank, right?