
Dwarkesh Patel

12579 total appearances

Podcast Appearances

Dwarkesh Podcast
Andrej Karpathy — AGI is still a decade away

That's actually surprising, that you think it will take a billion, because we already have billion-parameter, or couple-of-billion-parameter, models that are very intelligent.

Well, some of our models are like a trillion parameters, right?

But they remember so much stuff.

Yeah, but I'm surprised that in 10 years, given the pace... Okay, we have GPT-OSS-20B, which is way better than the original GPT-4, which was a trillion-plus parameters.

Yeah.

So given that trend, I'm actually surprised you think in 10 years, the cognitive core is still a billion parameters.

Yeah, I'm surprised you're not like, oh, it's going to be like tens of millions or millions.

But why is the distilled version still a billion?

That is, I guess, the thing I'm curious about.

Why would you train on... Right, no, no, but why is the distillation in 10 years not getting below 1 billion?

Oh, you think it should be smaller than a billion?

Yeah, I mean, just look at the trend over the last few years: finding low-hanging fruit and going from trillion-plus models to models that are literally two orders of magnitude smaller, in a matter of two years, with better performance.

It makes me think the sort of core of intelligence might be even way, way smaller.

Like plenty of room at the bottom, to paraphrase Feynman.
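The extrapolation behind that intuition can be sketched numerically. This is a back-of-the-envelope illustration, not anything computed in the conversation: the trillion-plus starting size and the roughly one-order-of-magnitude-per-year shrinkage rate are assumed figures, loosely matching the "two orders of magnitude in about two years" observation above.

```python
import math

# Back-of-the-envelope sketch of the shrinkage trend discussed above.
# Both constants are assumptions for illustration only.
START_PARAMS = 1.8e12   # assumed "trillion-plus" original-GPT-4 scale
ORDERS_PER_YEAR = 1.0   # assumed: ~2 orders of magnitude in ~2 years

def years_to_reach(target_params: float) -> float:
    """Years until the naive trend crosses a target parameter count."""
    return math.log10(START_PARAMS / target_params) / ORDERS_PER_YEAR

# If the recent rate simply continued, a sub-billion model arrives in
# ~3 years, and tens of millions well inside a decade -- the intuition
# behind expecting the cognitive core to keep shrinking.
for label, target in [("1B", 1e9), ("10M", 1e7), ("1M", 1e6)]:
    print(f"{label}: ~{years_to_reach(target):.1f} years")
```

Of course, a naive log-linear extrapolation assumes the supply of low-hanging fruit never runs out, which is exactly the point under debate here.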

Yeah.

So we're discussing what, like, plausibly could be the cognitive core.

There's a separate question, which is: what will actually be the size of frontier models over time?

And I'm curious to hear a prediction.

So we had increasing scale up to maybe GPT-4.5, and now we're seeing decreasing or plateauing scale.

There are many reasons that could be going on.