Andrej Karpathy
It's the practically possible version, with our technology and what we have available to us, of getting to a starting point where we can actually do things like reinforcement learning and so on.
So it's subtle, and I think you're right to push back on it.
But basically, what pre-training is doing is training a next-token predictor over the internet, and baking that into a neural net.
It's actually doing two things that are kind of unrelated.
Number one, it's picking up all this knowledge, as I call it.
Number two, it's actually becoming intelligent.
By observing the algorithmic patterns on the internet, it kind of boots up all these little circuits and algorithms inside the neural net to do things like in-context learning and all this kind of stuff.
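To make that concrete, here is a minimal sketch of the next-token-prediction objective being described. The tiny model and toy token corpus are placeholders for illustration, not anything from the conversation:

```python
# Minimal sketch of next-token-prediction pre-training.
# The toy corpus and tiny model stand in for internet text and a transformer.
import torch
import torch.nn as nn

corpus = torch.randint(0, 100, (1, 65))          # toy sequence of 65 token ids
inputs, targets = corpus[:, :-1], corpus[:, 1:]  # predict token t+1 from tokens <= t

class TinyLM(nn.Module):
    def __init__(self, vocab_size=100, dim=32):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, dim)
        self.rnn = nn.GRU(dim, dim, batch_first=True)  # stand-in for a transformer
        self.head = nn.Linear(dim, vocab_size)

    def forward(self, ids):
        h, _ = self.rnn(self.embed(ids))
        return self.head(h)                      # logits over the next token at each position

model = TinyLM()
opt = torch.optim.AdamW(model.parameters(), lr=1e-3)

for step in range(100):
    logits = model(inputs)                       # (1, 64, vocab)
    loss = nn.functional.cross_entropy(
        logits.reshape(-1, logits.size(-1)), targets.reshape(-1))
    opt.zero_grad(); loss.backward(); opt.step()
```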
And you don't actually need or want the knowledge.
I think it's probably holding back the neural networks overall, because it gets them to rely on the knowledge a little too much sometimes.
For example, one thing I feel agents are not very good at is going off the data manifold of what exists on the internet.
If they had less knowledge or less memory, actually maybe they would be better.
And so what I think we have to do going forward, and this would be part of the research paradigm, is figure out ways to remove some of the knowledge and keep what I call this cognitive core.
It's this intelligent entity that is stripped of knowledge but contains the algorithms and the magic, you know, of intelligence and problem solving and the strategies of it and all this kind of stuff.
I'm hesitant to say that in-context learning is not doing gradient descent. It's not doing explicit gradient descent, but I still think something like it may be happening. In-context learning, basically, is pattern completion within a token window, right?
And it just turns out that there's a huge amount of patterns on the internet.
And so you're right, the model kind of like learns to complete the pattern, right?
And that's inside the weights.
The weights of the neural network are trying to discover patterns and complete the pattern.
And there's some kind of an adaptation that happens inside the neural network, right?
Which is kind of magical and just falls out of internet training, just because there are a lot of patterns.
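As an illustration of pattern completion within a token window, here is a sketch assuming a HuggingFace-style causal language model interface; the few-shot prompt and the `complete` helper are hypothetical, and the point is only that the frozen weights complete the pattern with no gradient update:

```python
# Sketch of in-context learning as pattern completion: the "task" lives entirely
# inside the prompt, and a frozen model is asked to continue it.
# Assumes a HuggingFace-style causal LM and tokenizer; nothing is fine-tuned.
import torch

prompt = (
    "sea otter -> loutre de mer\n"
    "cheese -> fromage\n"
    "plush giraffe -> girafe en peluche\n"
    "cheeseburger -> "          # the pattern the model is expected to complete
)

@torch.no_grad()                # no gradient step: the weights never change
def complete(model, tokenizer, text, n_new_tokens=10):
    ids = tokenizer.encode(text, return_tensors="pt")
    for _ in range(n_new_tokens):
        logits = model(ids).logits                        # causal-LM forward pass
        next_id = logits[:, -1].argmax(-1, keepdim=True)  # greedy next token
        ids = torch.cat([ids, next_id], dim=-1)
    return tokenizer.decode(ids[0])
```

The adaptation here happens purely inside the token window: the mapping is defined by the examples in the prompt, and the model completes it at inference time without any update to its parameters.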