
Andrej Karpathy

👤 Speaker
3419 total appearances

Podcast Appearances

Dwarkesh Podcast
Andrej Karpathy — AGI is still a decade away

They will recite passages from all these training sources.

You can give them completely nonsensical data, like you can hash some amount of text or something like that.

You get a completely random sequence.

If you train on it, even just, I think, a single iteration or two, it can suddenly regurgitate the entire thing.
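The regurgitation effect described above can be illustrated with a toy sketch (my own construction, not from the episode): give a maximally overparameterized model, one free logit vector per position, a uniformly random token sequence, and a single gradient step of cross-entropy is already enough for greedy decoding to recite the whole thing. All names and sizes here are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
VOCAB, LENGTH = 50, 200

# A "nonsensical" training document: a uniformly random token sequence,
# a toy stand-in for hashed text with no generalizable structure.
sequence = rng.integers(0, VOCAB, size=LENGTH)

# Overparameterized "model": one independent logit vector per position.
logits = np.zeros((LENGTH, VOCAB))

def softmax(x):
    z = np.exp(x - x.max(axis=-1, keepdims=True))
    return z / z.sum(axis=-1, keepdims=True)

def train_step(logits, targets, lr=1.0):
    # Gradient of mean cross-entropy w.r.t. logits is (probs - one_hot).
    grad = softmax(logits)
    grad[np.arange(len(targets)), targets] -= 1.0
    return logits - lr * grad

# A single iteration: the target logit rises, all others fall, so greedy
# (argmax) decoding now reproduces the random sequence exactly.
logits = train_step(logits, sequence)
recited = logits.argmax(axis=-1)
accuracy = (recited == sequence).mean()
print(accuracy)  # 1.0
```

A real transformer shares parameters across positions, so it typically needs more than one pass, but the quote's point survives in miniature: with enough capacity, pure noise is memorized rather than resisted.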

It will memorize it.

There's no way a person can read a single sequence of random numbers and recite it to you.

And that's a feature, not a bug almost, because it forces you to like only learn the generalizable components.

Whereas LLMs are distracted by all the memory that they have of the pre-trained documents.

And it's probably very distracting to them in a certain sense.

So that's why when I talk about the cognitive core, I actually want to remove the memory, which is what we talked about.

I'd love to have less memory so that they have to look things up.

And they only maintain the algorithms for like thought and the idea of an experiment and all this cognitive glue of acting.

I'm not sure.

I think it's almost like a separate axis.

It's almost like the models are way too good at memorization and somehow we should remove that.

And I think people are much worse, but it's a good thing.

Yeah, I think that's a great question.

I mean, you can imagine having a regularization for entropy and things like that.

I guess they just don't work as well empirically because right now, like, the models are collapsed.
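The "regularization for entropy" mentioned above is often written as an entropy bonus subtracted from the training loss, so that collapsed (overly peaked) output distributions are penalized. This is a generic sketch of that pattern under my own assumptions; the quote does not specify any particular formulation, and `beta` is a hypothetical knob.

```python
import numpy as np

def softmax(x):
    z = np.exp(x - np.max(x, axis=-1, keepdims=True))
    return z / z.sum(axis=-1, keepdims=True)

def mean_entropy(probs):
    # Average Shannon entropy (nats) of each row's output distribution.
    return float(-(probs * np.log(probs + 1e-12)).sum(axis=-1).mean())

def entropy_regularized_loss(logits, targets, beta=0.1):
    # Cross-entropy minus beta * entropy: the entropy bonus rewards
    # less-peaked predictions, pushing back against collapse.
    probs = softmax(logits)
    ce = float(-np.log(probs[np.arange(len(targets)), targets] + 1e-12).mean())
    return ce - beta * mean_entropy(probs)

targets = np.array([0, 1])
# A collapsed model (huge logit gaps) has near-zero entropy, so it gets
# almost no bonus; a softer model with the same argmax is rewarded.
collapsed = np.array([[12.0, 0.0, 0.0], [0.0, 12.0, 0.0]])
softer = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
assert mean_entropy(softmax(collapsed)) < mean_entropy(softmax(softer))
```

As the quote notes, such penalties exist but have not empirically fixed collapse; the sketch only shows what the regularizer measures, not that it works.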

But I will say...