Trenton Bricken

👤 Speaker
1589 total appearances

Podcast Appearances

Dwarkesh Podcast
Sholto Douglas & Trenton Bricken - How to Build & Understand GPT-7's Mind

But the paper that we put out last year, Towards Monosemanticity, shows that if you project the activations into a higher-dimensional space and apply a sparsity penalty, you can think of this as undoing the compression, given that you assumed your data was originally high-dimensional and sparse.

If you return it to that high-dimensional and sparse regime, you get out very clean features.

And things all of a sudden start to make a lot more sense.
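
(For concreteness, the technique being described is the sparse autoencoder setup from Towards Monosemanticity: expand the activations into a much wider dictionary, penalize the feature activations for sparsity, and reconstruct. The sketch below is only a minimal illustration; the layer widths, L1 coefficient, and training step are assumed values, not the paper's.)

```python
# Minimal sparse-autoencoder sketch (PyTorch): project activations into a much
# higher-dimensional space, apply an L1 sparsity penalty, reconstruct.
# All sizes and hyperparameters below are illustrative assumptions.
import torch
import torch.nn as nn

class SparseAutoencoder(nn.Module):
    def __init__(self, d_model: int, d_dict: int):
        super().__init__()
        self.encoder = nn.Linear(d_model, d_dict)   # expand: d_dict >> d_model
        self.decoder = nn.Linear(d_dict, d_model)   # project back down

    def forward(self, x: torch.Tensor):
        f = torch.relu(self.encoder(x))   # sparse, non-negative feature activations
        x_hat = self.decoder(f)           # reconstruction of the original activation
        return x_hat, f

sae = SparseAutoencoder(d_model=512, d_dict=8192)
opt = torch.optim.Adam(sae.parameters(), lr=1e-4)
l1_coeff = 1e-3                           # strength of the sparsity penalty (assumed)

acts = torch.randn(64, 512)               # stand-in for model activations
x_hat, f = sae(acts)
loss = ((x_hat - acts) ** 2).mean() + l1_coeff * f.abs().sum(dim=-1).mean()
opt.zero_grad()
loss.backward()
opt.step()
```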

So I was saying the models were under-parameterized.

Oh, I see.

Like typically people talk about deep learning as if the model is over-parameterized.

But actually the claim here is that they're dramatically under-parameterized given the complexity of the task that they're trying to perform.

I mean, I think both models will still be using superposition.
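
(A quick numeric way to see what superposition buys an under-parameterized model: in d dimensions you can pack far more than d nearly orthogonal feature directions, with pairwise interference only on the order of 1/sqrt(d). The sizes below are arbitrary choices for the demo, not anything from the conversation.)

```python
# Superposition in miniature: many more feature directions than dimensions,
# each pair only mildly interfering. Sizes are arbitrary for the demo.
import torch

d, n_features = 64, 1024                       # 16x more features than dimensions
feats = torch.randn(n_features, d)
feats = feats / feats.norm(dim=-1, keepdim=True)

overlaps = feats @ feats.T                     # cosine similarity between features
off_diag = overlaps[~torch.eye(n_features, dtype=torch.bool)]
print(f"mean |interference| between distinct features: {off_diag.abs().mean():.3f}")
# Roughly 1/sqrt(d) ~= 0.1: the directions are not orthogonal, but the overlap is
# small enough that sparse combinations of features remain readable.
```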

But the claim here is that you get a very different model if you distill versus if you train from scratch.

Yeah.

And is it just more efficient, or is it fundamentally different in terms of performance?

It's kind of like watching a Kung Fu master versus being in the Matrix and just downloading the program.

But yeah, that's a good headcanon for why that works.
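
(To make the distillation-versus-training-from-scratch distinction concrete: a common form of distillation trains the student against the teacher's full softened output distribution rather than only the one-hot next token it would see when training from scratch. This is a generic sketch; the temperature, mixing weight, and shapes are assumptions, and nothing here is specific to the models being discussed.)

```python
# Minimal knowledge-distillation sketch (PyTorch): the student matches the
# teacher's softened distribution over next tokens in addition to the one-hot
# labels it would get when training from scratch. Hyperparameters are assumed.
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels,
                      temperature: float = 2.0, alpha: float = 0.5):
    # student_logits, teacher_logits: (batch, vocab); labels: (batch,) token ids.
    # Soft targets: KL divergence to the teacher's temperature-scaled distribution.
    soft = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=-1),
        F.softmax(teacher_logits / temperature, dim=-1),
        reduction="batchmean",
    ) * temperature ** 2
    # Hard targets: ordinary next-token cross-entropy, as in training from scratch.
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1 - alpha) * hard
```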

Yeah.

To be overly pedantic here, the tokens that you actually see in the chain of thought do not necessarily need to correspond at all to the vector representation that the model gets to see when it's deciding to attend back to those tokens.

And so the only information it's getting about the past is the keys and values; it never sees the token it outputted. It's kind of like it's trying to do next-token prediction, and if it messes up, then you just give it the correct answer. Yeah, right.

Right.

Yeah.

Otherwise it can become totally derailed.

Yeah, it'll go like off the train tracks.
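
(A rough sketch of the two mechanisms described above, with shapes and function names invented for illustration: attention over earlier positions only ever reads the cached keys and values, never the sampled tokens themselves, and teacher forcing appends the correct token to the context regardless of what the model predicted.)

```python
# Two sketches: (1) attending back over the past uses only cached keys/values;
# (2) teacher forcing feeds the correct token back in even after a wrong guess.
# Shapes and names here are illustrative assumptions.
import torch
import torch.nn.functional as F

def attend(query, cached_keys, cached_values):
    # query: (..., 1, d); cached_keys/values: (..., seq, d).
    # The past is visible *only* through these key/value tensors.
    scores = query @ cached_keys.transpose(-2, -1) / cached_keys.shape[-1] ** 0.5
    return torch.softmax(scores, dim=-1) @ cached_values

def teacher_forced_step(model, context, correct_next):
    # context: (batch, seq) token ids; correct_next: (batch,) ground-truth ids.
    logits = model(context)                               # (batch, seq, vocab)
    loss = F.cross_entropy(logits[:, -1, :], correct_next)
    # Whatever the model predicted, the correct token is what gets appended:
    # "if it messes up, then you just give it the correct answer."
    context = torch.cat([context, correct_next.unsqueeze(-1)], dim=-1)
    return loss, context
```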