Sholto Douglas

Speaker
1567 total appearances

Podcast Appearances

Dwarkesh Podcast
Sholto Douglas & Trenton Bricken - How to Build & Understand GPT-7's Mind

I think he's probably talked about it, but yeah.

That's actually a very bearish sign. We were chatting with one of our friends, and he made the point that if you look at what new applications are unlocked by GPT-4 relative to GPT-3.5, it's not clear there are that many.

Like, GPT-3.5 can do Perplexity or whatever.

If there is this diminishing increase in capabilities, and that increase costs exponentially more to get, that's actually a bearish sign on what 4.5 will be able to do, or what 5 will unlock in terms of economic impact.

Will GOFAI be part of the intelligence explosion?

…where you say synthetic data, but in fact it will be writing its own source code in some important way.

There was an interesting paper showing that you can use diffusion to generate model weights.

I don't know how legit that was, but something like that.

So crucially, the point is that the algorithmic overhang is really large, and maybe this is something we should touch on explicitly: even if you can't keep dumping more compute beyond models that cost a trillion dollars or something,

the fact that the brain is so much more data-efficient implies that we already have the compute. If we had the brain's algorithm to train with, if we could train as sample-efficiently as humans learn from birth, we could make AGI.

How do we think about it? What is the explanation for why that would be the case?

A bigger model just sees the exact same data.

At the end of seeing that data, it's...

It learns more from it.

It has more space to represent it.

Yeah.

For the audience, you should unpack that: first of all, what superposition is, and why that is the implication of superposition.
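As an aside for readers, the superposition idea referenced here can be sketched numerically. The snippet below (an illustrative sketch, not from the episode; all names and parameter choices are ours) shows that far more features than dimensions can share a vector space with only small pairwise interference, because random directions in high dimensions are nearly orthogonal:

```python
import numpy as np

rng = np.random.default_rng(0)
n_features, d_model = 100, 20  # many more features than dimensions

# Assign each feature a random unit-norm direction in the smaller space.
directions = rng.normal(size=(n_features, d_model))
directions /= np.linalg.norm(directions, axis=1, keepdims=True)

# Pairwise dot products measure interference between feature directions.
overlaps = directions @ directions.T
np.fill_diagonal(overlaps, 0.0)

# Interference is well below 1 (perfect overlap), so 100 features can be
# packed into 20 dimensions with modest cross-talk -- the intuition behind
# a bigger model having "more space to represent" its features.
print(f"max |interference|: {np.abs(overlaps).max():.2f}")
```

This is the rough intuition behind why a larger residual stream can hold more features in superposition, not a reconstruction of any specific result discussed on the show.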

Okay, there's so many interesting threads there.

The first thing I want to ask is,

The thing you mentioned about these models being trained in a regime where they're over-parameterized.