Sholto Douglas

Speaker
1567 total appearances

Podcast Appearances

Dwarkesh Podcast
Sholto Douglas & Trenton Bricken - How to Build & Understand GPT-7's Mind

to step back and explain.

By adaptive compute, the idea is one of the things you would want models to be able to do is, if a question is harder, to spend more cycles thinking about it.

And so then how do you do that?

Well, there's only a finite and predetermined amount of compute that one forward pass implies.

So if there's a complicated reasoning type question or math problem, you want to be able to spend a long time thinking about it.

Then you do chain of thought, where the model just, like, thinks through the answer.

And you can think about it as like all those forward passes where it's like thinking through the answer.

It's like being able to dump more compute into solving the problem.
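To make the "more forward passes means more compute" point concrete, here is a rough sketch. It assumes the common rule of thumb that one forward pass costs about 2 FLOPs per parameter per token, and it ignores attention cost that grows with context length. The model size and token counts are hypothetical, chosen only for illustration.

```python
def forward_flops_per_token(n_params: int) -> float:
    """Rough forward-pass cost for one token: about 2 FLOPs per parameter."""
    return 2.0 * n_params


def generation_flops(n_params: int, n_generated_tokens: int) -> float:
    """Each generated token takes one more forward pass.

    Simplification: context-length-dependent attention cost is ignored.
    """
    return forward_flops_per_token(n_params) * n_generated_tokens


N_PARAMS = 7_000_000_000   # hypothetical model size, not from the conversation
DIRECT_TOKENS = 5          # hypothetical: answer immediately
COT_TOKENS = 500           # hypothetical: think through the answer first

direct = generation_flops(N_PARAMS, DIRECT_TOKENS)
with_cot = generation_flops(N_PARAMS, COT_TOKENS)
ratio = with_cot / direct

print(f"direct answer:    {direct:.2e} FLOPs")
print(f"chain of thought: {with_cot:.2e} FLOPs")
print(f"compute ratio:    {ratio:.0f}x")
```

Under these assumptions, the amount of compute spent on a question scales with how many tokens the model generates before it commits to an answer.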

Now, going back to the signal thing, when it's doing chain of thought, it's only able to transmit that one token of information. Like, as you were talking about, the residual stream is already a compressed representation of everything that's happening in the model.

And then you're turning the residual stream into one token, which is like log of 50,000, or log of vocab size, bits, which is, yeah, so tiny.
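The size of that bottleneck can be worked out directly. The sketch below uses the vocabulary size of 50,000 mentioned in the conversation. The residual stream width and numeric precision are hypothetical values, and the raw bit count is only an upper bound on storage, not a measure of how much information the residual stream actually carries.

```python
import math


def bits_per_token(vocab_size: int) -> float:
    """Maximum information in one sampled token: log2 of the vocabulary size."""
    return math.log2(vocab_size)


def residual_raw_bits(d_model: int, bits_per_value: int) -> int:
    """Raw storage of one residual stream vector (an upper bound, not true information content)."""
    return d_model * bits_per_value


VOCAB_SIZE = 50_000    # from the conversation
D_MODEL = 4096         # hypothetical residual stream width
BITS_PER_VALUE = 16    # hypothetical 16-bit activations

token_bits = bits_per_token(VOCAB_SIZE)
raw_bits = residual_raw_bits(D_MODEL, BITS_PER_VALUE)

print(f"one token:       {token_bits:.2f} bits")
print(f"residual stream: {raw_bits} bits of raw storage")
```

With these numbers a token carries at most about 15.6 bits, while the vector it was sampled from occupies tens of thousands of bits of storage.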

So I don't think it's quite only transmitting like that one token, right?

Is the claim that when you fine-tune on chain of thought, the key and value weights change so that the sort of steganography can happen in the KV cache?

I don't think I could make that strong a claim.

Okay, that makes sense.

How much of this sort of secret communication from the model to its own forward inferences, how much steganography and, you know, secret communication, do you expect there to be?

But also, with the chain of thought, yeah, it gets a better answer at the end of the chain of thought than if it didn't do it at all. So something useful is happening, but still, the useful thing is not human-understandable. I think in some cases you can also just ablate the chain of thought and it would have given the same answer anyway. Interesting.
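One way to picture the ablation check described here is as an agreement rate: answer each question with and without the chain of thought and count how often the final answer is unchanged. This is a minimal sketch, not the speakers' method. `answer_fn` is a hypothetical callable standing in for a real model, "ablating" is simplified to skipping the chain of thought entirely, and `toy_answer_fn` is a made-up stand-in used only so the example runs.

```python
from typing import Callable, Sequence


def cot_ablation_agreement(
    answer_fn: Callable[[str, bool], str],
    questions: Sequence[str],
) -> float:
    """Fraction of questions whose final answer is the same with and without chain of thought."""
    if not questions:
        raise ValueError("need at least one question")
    same = sum(1 for q in questions if answer_fn(q, True) == answer_fn(q, False))
    return same / len(questions)


def toy_answer_fn(question: str, use_cot: bool) -> str:
    """Hypothetical stand-in model: only 'hard' questions need the chain of thought."""
    if "hard" in question and not use_cot:
        return "wrong"
    return "right"


questions = ["easy 1", "easy 2", "hard 1", "easy 3"]
agreement = cot_ablation_agreement(toy_answer_fn, questions)
print(f"answer unchanged without chain of thought: {agreement:.0%}")
```

A high agreement rate on a real model would suggest that, for those questions, the visible reasoning was not what determined the answer.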

But isn't this how humans think as well?

The famous split brain experiments where, you know, like when a person who is suffering from seizures, one way to solve it is you cut the