Andrej Karpathy

👤 Speaker
3419 total appearances

Appearances Over Time

Podcast Appearances

Dwarkesh Podcast
Andrej Karpathy — AGI is still a decade away

Whereas anything that happens in the context window of the neural network, where you're plugging in all the tokens and it's building up all this KV cache representation, is very directly accessible to the neural net.

So I compare the KV cache and the stuff that happens at test time to more like a working memory.

Like all the stuff that's in the context window is very directly accessible to the neural net.
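
To make the KV cache point above concrete, here is a minimal sketch (not anything shown in the episode) of single-head attention with a growing key/value cache: every token placed in the context window leaves keys and values that later steps can attend to directly. All names, shapes, and the toy dimensions are illustrative assumptions.

import torch
import torch.nn.functional as F

torch.manual_seed(0)
d_model = 16

# Random projection matrices standing in for trained attention weights.
W_q = torch.randn(d_model, d_model) / d_model ** 0.5
W_k = torch.randn(d_model, d_model) / d_model ** 0.5
W_v = torch.randn(d_model, d_model) / d_model ** 0.5

# The "working memory": keys and values for every token seen so far.
k_cache, v_cache = [], []

def attend(x_t):
    # Append this token's key/value to the cache, then attend over the
    # entire context window accumulated so far.
    q = x_t @ W_q
    k_cache.append(x_t @ W_k)
    v_cache.append(x_t @ W_v)
    K = torch.stack(k_cache)              # (t, d_model)
    V = torch.stack(v_cache)              # (t, d_model)
    scores = (K @ q) / d_model ** 0.5     # similarity to every cached token
    weights = F.softmax(scores, dim=0)
    return weights @ V                    # a direct read over the whole cache

# Feed a few toy token embeddings; each step can see all previous ones.
for t in range(5):
    out = attend(torch.randn(d_model))
    print(f"step {t}: cache holds {len(k_cache)} tokens, output {tuple(out.shape)}")

Nothing in the weights changes here; only the cache grows, which is why material placed in the context is "directly accessible" in a way that pretraining knowledge is not.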

So there's always like these almost surprising analogies between LLMs and humans.

And I find them kind of surprising because we're not trying to build a human brain directly, of course.

We're just finding that this works and we're doing it.

But I do think that anything that's in the weights is kind of like a hazy recollection of what you read a year ago.

Anything that you give it as context at test time is directly in the working memory.

And I think that's a very powerful analogy to think through things.

So when you, for example, go to an LLM and ask it about some book and what happened in it, like Nick Lane's book or something like that, the LLM will often give you some stuff that is roughly correct.

But if you give it the full chapter and ask it questions, you're going to get much better results because it's now loaded in the working memory of the model.
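
As a rough illustration of that contrast, the only difference is whether the source text rides along in the prompt. This is a sketch under stated assumptions: llm_generate is a hypothetical stand-in rather than any real API, the book title is a placeholder (the interview only says "Nick Lane's book"), and the chapter text is left for you to supply.

def llm_generate(prompt: str) -> str:
    # Hypothetical stand-in for whatever LLM API you actually call.
    return f"[model response to {len(prompt)} prompt characters]"

book = "The Vital Question"  # placeholder title

# 1) Recall from the weights alone: often roughly correct, like a hazy memory.
from_weights = llm_generate(
    f"What does {book} say about the origin of complex cells?"
)

# 2) The same question with the chapter placed directly in the context window,
#    so it sits in the model's working memory (its KV cache) while it answers.
chapter_text = "..."  # paste or load the actual chapter text here
from_context = llm_generate(
    f"Here is a chapter from {book}:\n\n{chapter_text}\n\n"
    "Using only this chapter, what does it say about the origin of complex cells?"
)

print(from_weights)
print(from_context)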

So I basically agree with your very long way of saying that I kind of agree, and that's why.

I almost feel like just a lot of it still...

So maybe one way to think about it, and I don't know if this is the best way, but I almost kind of feel like, again, making these analogies, imperfect as they are:

We've stumbled on the transformer neural network, which is extremely powerful and very general.

You can train transformers on audio or video or text or whatever you want; they just learn the patterns, they're very powerful, and it works really well.

That, to me, almost indicates that this is kind of like some piece of cortical tissue.
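
A small sketch of the generality point above, using stock PyTorch modules: once text, audio, or video has been embedded into a common vector size, the identical transformer stack processes all of them. The per-modality front ends and shapes here are illustrative assumptions, not anything from the episode.

import torch
import torch.nn as nn

d_model = 64
layer = nn.TransformerEncoderLayer(d_model=d_model, nhead=4, batch_first=True)
backbone = nn.TransformerEncoder(layer, num_layers=2)   # the modality-agnostic core

# Per-modality front ends that map raw inputs into the shared embedding space.
embed_text = nn.Embedding(1000, d_model)                # token ids -> vectors
embed_audio = nn.Linear(80, d_model)                    # e.g. spectrogram frames
embed_video = nn.Linear(3 * 16 * 16, d_model)           # flattened image patches

text = embed_text(torch.randint(0, 1000, (1, 32)))      # (1, 32, d_model)
audio = embed_audio(torch.randn(1, 50, 80))             # (1, 50, d_model)
video = embed_video(torch.randn(1, 20, 3 * 16 * 16))    # (1, 20, d_model)

for seq in (text, audio, video):
    print(backbone(seq).shape)   # same weights, same mechanism, any modality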