Trenton Bricken

👤 Speaker

1589 total appearances

Podcast Appearances

Dwarkesh Podcast

Sholto Douglas & Trenton Bricken - How to Build & Understand GPT-7's Mind

Even with the feature discussion, defining what a feature is is really hard.

7813.748

And so the question feels almost too slippery.

7818.778

What is a feature?

7823.129

A direction in activation space.

7824.952
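
One way to make the "direction in activation space" reading concrete: treat a feature as a vector, and measure how strongly it is present as the scalar projection of an activation onto that vector. The sketch below is only an illustration under that assumption; the function name `feature_activation` and the example vectors are hypothetical and do not come from the episode.

```python
import math


def feature_activation(activation, direction):
    """Scalar projection of an activation vector onto a feature direction."""
    norm = math.sqrt(sum(d * d for d in direction))
    if norm == 0:
        raise ValueError("direction must be non-zero")
    # Dot product with the unit-length version of the direction.
    return sum(a * d for a, d in zip(activation, direction)) / norm


# Hypothetical 3-dimensional activation and feature direction.
activation = [3.0, 4.0, 0.0]
direction = [1.0, 0.0, 0.0]
print(feature_activation(activation, direction))
```

Under this reading, scaling the direction vector does not change the result, because only its orientation matters.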

A latent variable that is operating behind the scenes that has causal influence over the system you're observing.

7828.338

It's a feature if you call it a feature.

7838.976

It's tautological.

7841.941

These are all explanations that I feel some...

7844.365

If that neuron corresponds to... to something in particular.

7869.907

Right.

7872.531

Yeah, yeah, yeah.

7872.911

And no, I think that's useful as like, what do we want a feature to be, right?

7873.913

Like what is a synthetic problem under which a feature exists?

7877.538

But even with the Towards Monosemanticity work, we talk about what's called feature splitting, which is basically you will find as many features as you give the model the capacity to learn.

7880.543

And by model here, I mean the up projection that we fit after we trained the original model.

7892.08

And so if you don't give it much capacity, it'll learn a feature for bird.

7899.03

But if you give it more capacity, then it will learn like ravens and eagles and sparrows and like specific types of birds.

7903.557
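
The feature-splitting point above can be sketched with a toy up projection followed by a ReLU. Everything in this sketch is an assumption made for illustration: the 4-dimensional activations, the hand-set weights and biases, and the names `encode` and `active_features` are hypothetical. In the actual work the dictionary is fit to a trained model's activations, not written by hand.

```python
def relu(x):
    return x if x > 0 else 0.0


def encode(activation, weights, biases):
    """Up projection followed by ReLU: one output per dictionary feature."""
    return [
        relu(sum(w * a for w, a in zip(row, activation)) + b)
        for row, b in zip(weights, biases)
    ]


def active_features(codes):
    """Indices of the dictionary features that fire."""
    return [i for i, c in enumerate(codes) if c > 0]


# Hypothetical activations: dim 0 is a shared "bird" component,
# dims 1-3 are species-specific components.
ACTIVATIONS = {
    "raven": [1.0, 1.0, 0.0, 0.0],
    "eagle": [1.0, 0.0, 1.0, 0.0],
    "sparrow": [1.0, 0.0, 0.0, 1.0],
}

# Low capacity: a single feature that reads the shared component ("bird").
SMALL_W = [[1.0, 0.0, 0.0, 0.0]]
SMALL_B = [0.0]

# More capacity: one feature per species; the negative bias keeps a
# feature silent unless its species-specific component is present.
LARGE_W = [
    [0.5, 1.0, 0.0, 0.0],
    [0.5, 0.0, 1.0, 0.0],
    [0.5, 0.0, 0.0, 1.0],
]
LARGE_B = [-1.0, -1.0, -1.0]

for name, act in ACTIVATIONS.items():
    small = active_features(encode(act, SMALL_W, SMALL_B))
    large = active_features(encode(act, LARGE_W, LARGE_B))
    print(name, "small dictionary:", small, "large dictionary:", large)
```

With the one-feature dictionary, all three inputs light up the same feature. With the three-feature dictionary, each input lights up its own feature, which is the splitting pattern described in the quote.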

I'm not sure what we would mean by... I mean, all of those things are like discrete units that have connections to other things that then imbues them with meaning.

7948.615

That feels like a specific enough definition that it's useful or not too all-encompassing, but feel free to push back.

7961.931

I mean, if the features we were finding weren't predictive or if they were just representations of the data, right, where it's like, oh, all you're doing is just clustering your data and there's no like higher level associations that are being made or it's some like phenomenological thing of like,

7978.689