Trenton Bricken

Sholto Douglas & Trenton Bricken - How to Build & Understand GPT-7's Mind

And so you can kind of imagine this big tree of semantic concepts, where biology splits into, like, cells versus whole-body biology.

9618.055

Dwarkesh Podcast

And then further down, it splits into all these other things.

9626.991

So rather than needing to immediately go from a thousand to a million and then pick out that one feature of interest, you can find the direction that the biology feature is pointing in, which again is very coarse, and then selectively search around that space.

9628.954

So, like, only do dictionary learning if something in the direction of the biology feature fires first.

9642.558

And so the computer science metaphor here would be: instead of doing breadth-first search, you're able to do depth-first search, where you're only recursively expanding and exploring a particular part of this semantic tree of features.

9649.828
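
The coarse-to-fine idea described above can be sketched in a few lines of Python. This is only an illustration of the search pattern, not the speakers' method: the `FeatureNode` class, the hand-written direction vectors, and the 0.7 threshold are all assumptions made up for the example, and in practice the directions would come from dictionary learning rather than being typed in. A subtree is expanded only if its coarse parent feature fires first.

```python
import math
from dataclasses import dataclass, field


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


@dataclass
class FeatureNode:
    """Hypothetical node in a semantic tree: a coarse feature plus finer children."""
    name: str
    direction: list
    children: list = field(default_factory=list)


def active_features(activation, node, threshold, scored):
    """Depth-first search that expands children only when the parent fires."""
    scored.append(node.name)  # record every feature we actually had to score
    if cosine(activation, node.direction) < threshold:
        return []  # parent did not fire, so the whole subtree is skipped
    found = [node.name]
    for child in node.children:
        found.extend(active_features(activation, child, threshold, scored))
    return found


# Toy 4-dimensional directions, chosen by hand for illustration only.
biology = FeatureNode("biology", [1.0, 0.0, 0.0, 0.0], [
    FeatureNode("cells", [1.0, 1.0, 0.0, 0.0]),
    FeatureNode("whole_body", [1.0, 0.0, 1.0, 0.0]),
])
physics = FeatureNode("physics", [0.0, 0.0, 0.0, 1.0], [
    FeatureNode("optics", [0.0, 1.0, 0.0, 1.0]),
])

activation = [1.0, 0.5, 0.0, 0.0]
scored = []
found = []
for root in (biology, physics):
    found.extend(active_features(activation, root, 0.7, scored))

print(found)   # features that fired
print(scored)  # features that were scored at all
```

In this toy run the `physics` feature does not fire, so `optics` is never scored. That skipped work is the saving the depth-first metaphor points at.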

So I haven't read the Mistral paper, but I think that the heads... I mean, this goes back to: if you just look at the neurons in a model, they're polysemantic.

9712.235

And so if all they did was just look at the neurons in a given head, it's very plausible that it's also polysemantic because of superposition.

9721.506

So this is a line of work that we haven't pursued as much as I want to yet.

9739.268

But I think we're planning to.

9744.413

I hope that maybe external groups do as well.

9745.574

What is the geometry of feature space?

9748.437

What's the geometry?

9749.738

Exactly.

9750.399

And how does that change over time?

9750.639

Inject more structure into the geometry.

9768.899

Totally.

9770.262

I mean, it would really surprise me, I guess, especially given how linear the model seems to be.

9770.522

Completely agree.

9774.169

That there isn't some component of the anthrax feature vector that is similar to and looks like the biology vector, and that they're not in a similar part of the space.

9774.851
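
One way to make "some component of the anthrax feature vector that looks like the biology vector" concrete is to project one vector onto the other and measure their cosine similarity. The sketch below uses made-up three-dimensional vectors purely for illustration. Real feature vectors would come from a trained dictionary, and nothing here reflects measured values.

```python
import math


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def cosine_similarity(u, v):
    """Cosine of the angle between u and v (both assumed non-zero)."""
    return dot(u, v) / (math.sqrt(dot(u, u)) * math.sqrt(dot(v, v)))


def component_along(u, v):
    """Projection of u onto the direction of v."""
    scale = dot(u, v) / dot(v, v)
    return [scale * x for x in v]


# Hypothetical toy vectors: a coarse "biology" direction and a finer
# "anthrax" feature that partly points the same way.
biology = [1.0, 0.0, 0.0]
anthrax = [0.8, 0.6, 0.0]

similarity = cosine_similarity(anthrax, biology)
shared = component_along(anthrax, biology)
residual = [a - s for a, s in zip(anthrax, shared)]

print(similarity)  # roughly 0.8 for these toy vectors
print(shared)      # the part of "anthrax" that lies along "biology"
print(residual)    # what is left over, orthogonal to "biology"
```

A high similarity would mean the two features sit in a similar part of the space, which is the claim being made in the conversation. The residual is what a finer-grained search would still need to resolve.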

But yes, I mean, ultimately...

9783.368