Sholto Douglas
Isn't that when you get generalization? Like, grokking happens in that regime, right?
Another question.
So the distilled models, like...
First of all, okay, so what is happening there?
Because the earlier claim we were talking about is that smaller models are worse at learning than bigger models.
But with GPT-4 Turbo, you could make the claim that it's actually worse at reasoning-style tasks than GPT-4 while probably knowing the same facts, as if the distillation got rid of some of the reasoning ability.
Oh, okay.
Yeah.
Yeah.
How do you interpret what's happening in distillation?
I think Gwern had one of these questions on his website.
Why can't you train the distilled model directly?
Why does it have to go through the bigger model first?
Is the picture that you have to project it from this bigger space down to a smaller space?
I don't remember, but do you know?
Yeah, exactly, exactly.
Yep, yep.
Just to make sure the audience got that: when you're distilling into a smaller model, you see the teacher's full probabilities over the tokens it was predicting, compare them against the ones you were predicting, and update through all of those probabilities, rather than just seeing the single next word and updating on that.
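To make that concrete, here is a minimal sketch (in PyTorch, with hypothetical names like `student_logits` and `teacher_logits`) of the difference between ordinary next-token training on a single hard label and distillation against the teacher's full token distribution. It's an illustration of the idea being described, not any particular lab's training code.

```python
# Sketch: hard-label next-token loss vs. soft-label distillation loss.
# All tensor names are illustrative placeholders.
import torch
import torch.nn.functional as F

def hard_label_loss(student_logits, next_token_ids):
    # Ordinary pretraining: the only signal at each position is which
    # single token actually came next.
    return F.cross_entropy(
        student_logits.view(-1, student_logits.size(-1)),
        next_token_ids.view(-1),
    )

def distillation_loss(student_logits, teacher_logits, temperature=1.0):
    # Distillation: the student matches the teacher's full probability
    # distribution over the vocabulary at every position, not just the
    # sampled token, so each position carries much more information.
    teacher_probs = F.softmax(teacher_logits / temperature, dim=-1)
    student_log_probs = F.log_softmax(student_logits / temperature, dim=-1)
    return (
        F.kl_div(student_log_probs, teacher_probs, reduction="batchmean")
        * temperature ** 2
    )
```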
Okay, so this actually raises a question I was intending to ask you.
Right. I think you were the one who mentioned that you can think of chain of thought as adaptive compute, of like...