
Dwarkesh Patel

12579 total appearances

Podcast Appearances

Dwarkesh Podcast
Andrej Karpathy — AGI is still a decade away

That's actually surprising, that you think it will take a billion, because we already have billion-parameter, or couple-of-billion-parameter, models that are very intelligent.

Well, some of our models are like a trillion parameters, right?

But they remember so much stuff.

Yeah, but I'm surprised that in 10 years, given the pace... Okay, we have GPT-OSS-20B, which is way better than the original GPT-4, which was a trillion-plus parameters.

Yeah.

So given that trend, I'm actually surprised you think in 10 years, the cognitive core is still a billion parameters.

Yeah, I'm surprised you're not like, oh, it's going to be like tens of millions or millions.

But why is the distilled version still a billion?

That is, I guess, the thing I'm curious about.

Why would you train on... Right, no, no, but why is the distillation in 10 years not getting below 1 billion?

Oh, you think it should be smaller than a billion?

Yeah, I mean, just look at the trend over the last few years: finding low-hanging fruit and going from trillion-plus models to models that are literally two orders of magnitude smaller, in a matter of two years, with better performance.

It makes me think the sort of core of intelligence might be even way, way smaller.

Like plenty of room at the bottom, to paraphrase Feynman.
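The extrapolation behind that intuition can be sketched numerically. This is a back-of-the-envelope illustration, not anything computed in the conversation: the trillion-plus starting size and the roughly one-order-of-magnitude-per-year shrinkage rate are assumed figures, loosely matching the "two orders of magnitude in about two years" observation above.

```python
import math

# Back-of-the-envelope sketch of the shrinkage trend discussed above.
# Both constants are assumptions for illustration only.
START_PARAMS = 1.8e12   # assumed "trillion-plus" original-GPT-4 scale
ORDERS_PER_YEAR = 1.0   # assumed: ~2 orders of magnitude in ~2 years

def years_to_reach(target_params: float) -> float:
    """Years until the naive trend crosses a target parameter count."""
    return math.log10(START_PARAMS / target_params) / ORDERS_PER_YEAR

# If the recent rate simply continued, a sub-billion model arrives in
# ~3 years, and tens of millions well inside a decade -- the intuition
# behind expecting the cognitive core to keep shrinking.
for label, target in [("1B", 1e9), ("10M", 1e7), ("1M", 1e6)]:
    print(f"{label}: ~{years_to_reach(target):.1f} years")
```

Of course, a naive log-linear extrapolation assumes the supply of low-hanging fruit never runs out, which is exactly the point under debate here.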

Yeah.

So we're discussing what, like, plausibly could be the cognitive core.

There's a separate question, which is: what will actually be the size of frontier models over time?

And I'm curious to hear a prediction.

So we had increasing scale up to maybe GPT-4.5, and now we're seeing decreasing or plateauing scale.

There are many reasons that could be going on.