Trenton Bricken

👤 Speaker
See mentions of this person in podcasts
1589 total appearances

Podcast Appearances

Dwarkesh Podcast
Is RL + LLMs enough for AGI? – Sholto Douglas & Trenton Bricken

And we fit up to, I want to say, 16,000 features, which we thought was a ton at the time.

Fast forward nine months, we go from a two-layer transformer to our Claude 3 Sonnet frontier model at the time and fit up to 30 million features.
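
A minimal sketch of the sparse-autoencoder dictionary learning being described here, assuming a plain reconstruction objective with an L1 sparsity penalty; the model width, feature count, and coefficient below are illustrative placeholders rather than the actual training setup.

import torch
import torch.nn as nn

class SparseAutoencoder(nn.Module):
    """Decompose model activations into many sparse, interpretable 'features'."""

    def __init__(self, d_model: int = 512, n_features: int = 16_000):
        super().__init__()
        self.encoder = nn.Linear(d_model, n_features)  # activations -> feature coefficients
        self.decoder = nn.Linear(n_features, d_model)  # feature coefficients -> reconstruction

    def forward(self, acts: torch.Tensor):
        feats = torch.relu(self.encoder(acts))  # non-negative and mostly zero
        recon = self.decoder(feats)
        return feats, recon

def sae_loss(acts, feats, recon, l1_coeff: float = 1e-3):
    # Reconstruct the activations while penalizing how many features fire at once;
    # scaling n_features from thousands toward millions is what "fitting more features" means here.
    return ((recon - acts) ** 2).mean() + l1_coeff * feats.abs().mean()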

And this is where we start to find really interesting abstract concepts like a feature that would fire for code vulnerabilities.

And it wouldn't just fire for code vulnerabilities.

It would even fire for, you know, that Chrome page you get when it's not an HTTPS URL and it warns you, this site might be dangerous, click to continue. It would fire for that too, for example.

And so it's these much more abstract coding variables or sentiment features amongst the 30 million.
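
One way to probe what an individual feature like this responds to is to rank a pile of prompts by how strongly that feature fires on them; a rough sketch follows, where feature_acts_for is a hypothetical stand-in for however the instrumented model exposes per-token feature activations.

def top_examples_for_feature(feature_acts_for, prompts, feature_idx, k=5):
    # feature_acts_for(prompt) is assumed to return a tensor of per-token feature
    # activations from the sparse autoencoder; it is a placeholder, not a real API.
    scored = []
    for prompt in prompts:
        acts = feature_acts_for(prompt)
        scored.append((acts[..., feature_idx].max().item(), prompt))
    return sorted(scored, key=lambda pair: pair[0], reverse=True)[:k]

# A "code vulnerability" feature might rank an unsafe C snippet and the text of a
# browser "this site may be dangerous" warning page near the top of the same list.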

Fast forward nine months from that and now we have circuits.

And I threw in the analogy earlier of the Ocean's Eleven heist team, where now you're identifying individual features across the layers of the model that are all working together to perform some complicated task.

And you can get a much better idea of how it's actually doing the reasoning and coming to decisions, like with the medical diagnostics.

One example I didn't talk about before is how the model retrieves facts.

And so you ask, what sport did Michael Jordan play?

And not only can you see it hop from Michael Jordan to basketball and answer basketball, but the model also has an awareness of when it doesn't know the answer to a fact.
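
A circuit in this sense can be pictured as a small directed graph whose nodes are features at different layers and whose edges record how strongly one feature drives another; everything below (layer indices, feature labels, weights) is invented purely to illustrate the structure of the Michael Jordan example.

# Hypothetical attribution graph for "What sport did Michael Jordan play?" -> "basketball".
circuit_edges = [
    (("layer 2", "name 'Michael Jordan'"), ("layer 9", "famous basketball player"), 0.7),
    (("layer 9", "famous basketball player"), ("layer 20", "say 'basketball'"), 0.9),
    (("layer 2", "question asks about a sport"), ("layer 20", "say 'basketball'"), 0.4),
]

def downstream_of(node, edges):
    # Follow the graph forward from one feature to the features it feeds into.
    return [(dst, weight) for src, dst, weight in edges if src == node]

print(downstream_of(("layer 2", "name 'Michael Jordan'"), circuit_edges))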

And so by default, it will actually say, I don't know the answer to this question.

But if it sees something that it does know the answer to, it will inhibit the "I don't know" circuit and then reply using the circuit that actually has the answer.

So, for example, if you ask it who is Michael Batkin, which is just a made-up fictional person, it will by default just say I don't know.

It's only with Michael Jordan or someone else that it will then inhibit the I don't know circuit.

But what's really interesting here, and where you can start making downstream predictions or reasoning about the model, is that the "I don't know" circuit is only on the name of the person.
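
A toy caricature of the inhibition pattern described above, assuming it can be reduced to a default "can't answer" pull that a known-entity feature (firing only on the person's name) suppresses; the weights and labels are invented for illustration.

def default_response(known_entity_activation: float) -> str:
    # Default behavior is to refuse unless a known-entity feature, which fires on the
    # name of the person, inhibits that default strongly enough.
    CANT_ANSWER_BIAS = 1.0        # always-on pull toward "I don't know"
    INHIBITION_WEIGHT = -1.5      # known-entity feature suppresses that pull
    score = CANT_ANSWER_BIAS + INHIBITION_WEIGHT * known_entity_activation
    return "I don't know." if score > 0 else "Attempt an answer."

print(default_response(0.0))  # Michael Batkin: no recognition, so the default stands.
print(default_response(1.0))  # Michael Jordan: recognition inhibits the default.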