Trenton Bricken

Dwarkesh Podcast

Sholto Douglas & Trenton Bricken - How to Build & Understand GPT-7's Mind

I think we can attack it, but we're going to need to be persistent.

10910.668

And the real hope here is, I think, automated interpretability.

10916.458

And even having debate, right?

10920.637

You could have the debate set up where two different models are debating what the feature does, and then they can actually go in and make edits and see if it fires or not.

10922.139

But it is just this wonderful closed environment that we can iterate on really quickly.

10931.112

I mean, bus factor doesn't define how long it would take to recover from it, right?

11105.102

And deep learning research is an art.

11110.042

And so you kind of learn how to read the loss curves or set the hyperparameters in ways that empirically seem to work well.

11112.164

That is difficult to share.

11184.93

Yeah, if it works well, it's probably not being published.

11186.924

Yeah, I do think the tide is changing there for whatever reason.

11285.845

And like Neel Nanda has had a ton of success promoting interpretability in a way where like Chris Olah hasn't been as active recently in pushing things, maybe because Neel's just doing quite a lot of the work.

11289.109

But like, I don't know, four or five years ago, he was like really pushing and like talking at all sorts of places and these sorts of things.

11300.264

And people weren't anywhere near as receptive.

11307.034

Maybe they've just woken up to the fact that deep learning matters and is clearly useful post-ChatGPT, but...