Sholto Douglas
Do you want to explain what feature splitting is?
And maybe give an example of that.
Okay, so let's go back to GPT-7.
First of all, is this a sort of like linear tax on any model to figure out?
Even before that, is this a one-time thing you had to do or is this the kind of thing you have to do on every output?
Or is it just one time, it's not deceptive, we're good to go?
Actually, yeah, let me let you answer that.
For the audience, weights are, I don't know if permanent is the right word, but they are the model itself, whereas activations are the artifacts of any single call.
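The weights-versus-activations distinction can be made concrete with a tiny sketch. This is a hypothetical toy, not any real model: a single fixed linear layer stands in for the weights, and the values computed during one forward call stand in for the activations.

```python
import numpy as np

# Hypothetical tiny "model": one linear layer with fixed weights.
rng = np.random.default_rng(0)
W = rng.normal(size=(4, 3))   # weights: learned once, reused for every call

def forward(x):
    # Activations: intermediate values produced for this one input only,
    # discarded after the call unless we deliberately store them.
    pre = x @ W               # pre-activation for this input
    act = np.maximum(pre, 0)  # ReLU activation
    return act

a1 = forward(np.ones(4))
a2 = forward(np.zeros(4))
# W is unchanged by either call; a1 and a2 exist only per input.
```

Interpretability work of the kind discussed here reads off those per-call activations, not the weights themselves.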
So there's going to be two steps to this for GPT-7 or whatever model we're concerned about.
First, correct me if I'm wrong, you train the sparse autoencoder and do the unsupervised projection into a wider space of features that have a higher fidelity to what is actually happening in the model.
And then secondly, you label those features.
Because let's say the cost of training the model is N. What will those two steps cost relative to N?
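The two steps in the question can be sketched minimally. All names and dimensions below are illustrative assumptions, not any lab's actual setup: a sparse autoencoder projects model activations into a wider dictionary with an L1 sparsity penalty (the training loop itself is omitted), and the labeling step then looks at what most strongly activates each feature.

```python
import numpy as np

rng = np.random.default_rng(0)
d_model, d_dict = 8, 32               # dictionary wider than the activation space
W_enc = rng.normal(scale=0.1, size=(d_model, d_dict))
b_enc = np.zeros(d_dict)
W_dec = rng.normal(scale=0.1, size=(d_dict, d_model))

def sae(x, l1_coef=1e-3):
    # Step 1: project activations into a wider, sparse feature space.
    f = np.maximum(x @ W_enc + b_enc, 0)   # sparse feature activations
    x_hat = f @ W_dec                      # reconstruction of the input
    # Training would minimize reconstruction error plus an L1 sparsity term.
    loss = np.mean((x - x_hat) ** 2) + l1_coef * np.abs(f).mean()
    return f, x_hat, loss

x = rng.normal(size=(16, d_model))         # stand-in for model activations
f, x_hat, loss = sae(x)
# Step 2 (labeling): for each feature column f[:, j], collect the inputs that
# activate it most strongly and ask a human or a model to name the pattern.
```

The relative cost question then splits naturally: step 1 is an extra training run over stored activations, and step 2 is one labeling pass per dictionary feature.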
Although, given the way that these features are not organized into
things that are intuitive for humans, right?
Because we just haven't had to deal with base64 before, so we don't dedicate that much, you know, whatever firmware to deconstructing which kind of base64 it is.
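For context on there being different "kinds" of base64 at all: even the standard library distinguishes variants of the encoding, which differ only in two alphabet characters, so telling them apart is exactly the sort of distinction humans never practice.

```python
import base64

# Bytes chosen (arbitrarily) so every 6-bit group is index 62,
# the one position where the two common base64 alphabets differ.
data = b"\xfb\xef\xbe"

std = base64.b64encode(data)          # standard alphabet uses '+' and '/'
url = base64.urlsafe_b64encode(data)  # URL-safe variant uses '-' and '_'
# std == b"++++", url == b"----": same bytes, two kinds of base64.
```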
How would we know what the subjects are? And this will go back to maybe the MoE discussion we'll have.
I guess we might as well talk about it now, but in mixture of experts, the paper talked about how they couldn't find that the experts were specialized in a way that we could understand.
There's not like a chemistry expert or a physics expert or something.
So why would you think that there will be, like, a biology feature that you then deconstruct, rather than some inscrutable blob that you deconstruct and it's anthrax, and then shoes, and whatever?
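One reason experts need not line up with human subjects is visible in a minimal sketch of MoE routing. This is a generic top-k token router under assumed toy dimensions, not the architecture of any particular model: routing happens per token via a learned gate, so nothing forces an expert to own a whole topic like "chemistry".

```python
import numpy as np

rng = np.random.default_rng(0)
d_model, n_experts, top_k = 16, 8, 2
W_gate = rng.normal(size=(d_model, n_experts))  # learned routing weights

def route(x):
    # Score each token against every expert, softmax over experts.
    logits = x @ W_gate
    probs = np.exp(logits - logits.max(axis=-1, keepdims=True))
    probs /= probs.sum(axis=-1, keepdims=True)
    # Each token goes to its top-k experts, weighted by gate probability.
    top = np.argsort(probs, axis=-1)[:, -top_k:]
    return top, probs

tokens = rng.normal(size=(4, d_model))   # 4 tokens in a batch
experts, probs = route(tokens)
# Routing is decided token by token, not document by document, so an
# "expert" can end up specializing in syntax or punctuation rather than
# anything that resembles a human subject category.
```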