
Trenton Bricken

👤 Person
1589 total appearances

Podcast Appearances

Dwarkesh Podcast
Is RL + LLMs enough for AGI? — Sholto Douglas & Trenton Bricken

I wonder if – I should almost test, like, would an LLM have made that mistake?

Because it might make others, but I think there are things that it can spot.

Like, it would have no problem if I asked it to read through the entire tax code and then see what applied to me.

I mean, you can already let models think forever.

People have been calling the residual stream and multiple layers a poor man's adaptive compute.

Where, like, if the model already knows the answer to something, it will compute that in the first few layers and then just pass it through.
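The "compute early, pass it through" idea can be sketched as a toy additive residual stream (illustrative only: the hidden size, the number of layers, and the update scales here are all made up, not taken from any real model):

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8  # toy hidden size

def layer_update(h, active):
    # An "active" layer writes new information into the stream;
    # an idle one contributes essentially nothing.
    return rng.normal(scale=1.0 if active else 1e-3, size=h.shape)

# Each layer ADDS its output to the running hidden state, so a result
# computed in the first couple of layers simply rides through the rest.
h = np.zeros(d)
snapshots = []
for layer in range(8):
    # Pretend the model "knows the answer" after layer 1: only layers
    # 0-1 do real work, layers 2-7 mostly pass the state through.
    h = h + layer_update(h, active=layer < 2)
    snapshots.append(h.copy())

# How much the hidden state moves between consecutive layers.
drift = [float(np.linalg.norm(snapshots[i + 1] - snapshots[i]))
         for i in range(7)]
print([round(x, 4) for x in drift])
```

Running this shows a large drift for the early, active layer and near-zero drift afterward, which is the pass-through behavior being described.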

I mean, that's getting into the weeds.

Right.

Yeah.

I think it's important to delineate between the model planning in latent space in a single forward pass, and the model having an alien language that it's outputting and using as its scratchpad.

Which one are we talking about?

No, but in the most extreme cases, it invents a new language that's super information dense.

I mean, that's what's so fun about the, if you look at the assistant tag, right?

Seeing these features light up in the auditing game for the model being evil.

Yeah.

Transluce has another example of this where you ask a Llama model, "Who is Nicholas Carlini?"

And for background context, Nicholas Carlini is a researcher who was actually at DeepMind and has now come over to Anthropic.

But the model says, oh, I don't know who that is.

I couldn't possibly speculate.

But if you look at the features behind the scenes, you see a bunch light up for AI, computer security, all the things that Nicholas Carlini does.