Sholto Douglas

👤 Speaker
1567 total appearances

Podcast Appearances

Dwarkesh Podcast
Sholto Douglas & Trenton Bricken - How to Build & Understand GPT-7's Mind

Then like the whole thing gets multiplied together and the whole thing becomes much less likely to happen.

Wait, wait.

So doesn't the fact that there's these companies, Google and, I don't know, Magic, maybe others, who have million token attention imply that the quadratic... You shouldn't say anything.

Doesn't that imply that it's not quadratic anymore or are they just eating the cost?
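
To make the "quadratic" in this question concrete, here is a minimal Python sketch of how the matrix-multiply cost of plain dense self-attention grows with context length. The function name, the head dimension of 128, and the FLOP-counting convention (2 FLOPs per multiply-add) are assumptions for illustration. It says nothing about how Google, Magic, or anyone else actually implements long-context attention.

```python
def attention_matmul_flops(n_tokens: int, d_head: int = 128) -> int:
    """Rough FLOP count for one head of dense self-attention.

    Counts only the two large matmuls (assumed convention: 2 FLOPs
    per multiply-add) and ignores softmax, masking, and projections.
    """
    scores = 2 * n_tokens * n_tokens * d_head        # Q @ K^T
    weighted_sum = 2 * n_tokens * n_tokens * d_head  # softmax(scores) @ V
    return scores + weighted_sum


if __name__ == "__main__":
    for n in (1_000, 10_000, 1_000_000):
        print(f"{n:>9} tokens -> {attention_matmul_flops(n):.3e} FLOPs per head")
```

Under these assumptions, doubling the context quadruples the cost, which is why a million-token window invites the question of whether the cost is being avoided or simply paid.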

Okay, so what do you make of this take?

As we move forward through the takeoff, more and more of the learning happens in the forward pass.

So originally, all the learning happens in the backward, you know, during this bottom-up sort of hill-climbing evolutionary process.

If you think in the limit during the intelligence explosion, it's just like the AI is maybe handwriting the weights or doing GOFAI or something.

And we're in the middle step where a lot of learning happens in context now with these models.

A lot of it happens within the backward pass.

Does this seem like a meaningful gradient along which progress is happening?

Because the broader thing being, if you're learning in the forward pass, it's much more sample efficient because you can basically think as you're learning.

When you read a textbook, you're not just skimming it and trying to absorb, inductively, that these words follow these words.

You read it and you think about it, and then you read some more, you think about it.

I don't know.

Does this seem like a sensible way to think about the progress?

This is actually an interesting point.

So when we talk about scaling up these models, how much of it comes from just making the models themselves bigger?

And how much comes from the fact that during any single call, you are using more compute?

So if you think of diffusion, you can just iteratively keep adding more compute.
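
The idea of spending more compute within a single call can be illustrated with a toy refinement loop in Python. This is not a real diffusion sampler: the `refine` function, its fixed `target`, and the `rate` parameter are hypothetical stand-ins for a learned denoiser, chosen only to show that each extra step costs one more update and moves the estimate closer.

```python
def refine(start: float, target: float, steps: int, rate: float = 0.5) -> float:
    """Toy iterative refinement, loosely analogous to a denoising loop.

    Each step closes a fixed fraction of the remaining gap to `target`.
    In a real diffusion model a learned network would predict the update;
    here the target is known in advance, which is an assumption made
    purely for illustration.
    """
    x = start
    for _ in range(steps):
        x += rate * (target - x)
    return x


if __name__ == "__main__":
    for steps in (1, 2, 4, 8, 16):
        error = abs(refine(0.0, 1.0, steps) - 1.0)
        print(f"{steps:>2} steps -> error {error:.6f}")
```

In this sketch the error shrinks by a factor of `1 - rate` per step, so the caller can trade more iterations, and therefore more compute, for a better result without changing the size of the model.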