Andrej Karpathy

Speaker
3419 total appearances

Podcast Appearances

Dwarkesh Podcast
Andrej Karpathy — AGI is still a decade away

So when you're reading a book, I almost don't even feel like the book is exposition I'm supposed to be attending to and training on.

The book is a set of prompts for me to do synthetic data generation, or for you to get into a book club and talk about it with your friends.

And it's by manipulating that information that you actually gain that knowledge.
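
As a rough illustration of the "book as a set of prompts" idea, here is a minimal Python sketch that turns passages of a book into prompts for generating reflections to train on. The chunking scheme, the prompt wording, and the llm.generate() call are hypothetical stand-ins for illustration, not anything described in the episode.

```python
# Hypothetical sketch: treat each passage of a book as a prompt for
# generating "thinking about the material" data. The llm object and its
# generate() method are assumed, generic text-in, text-out stand-ins.

def chunk_book(text: str, chunk_chars: int = 2000) -> list[str]:
    """Split the book into passage-sized chunks."""
    return [text[i:i + chunk_chars] for i in range(0, len(text), chunk_chars)]

def synthesize_reflections(passages: list[str], llm) -> list[dict]:
    """For each passage, ask a model to manipulate the material:
    restate it, relate it to prior knowledge, and pose a question."""
    examples = []
    for passage in passages:
        prompt = (
            "Read the passage below. Restate the key idea in your own words, "
            "connect it to something you already know, and ask one question "
            "it leaves open.\n\n" + passage
        )
        reflection = llm.generate(prompt)  # assumed text-in, text-out call
        examples.append({"prompt": prompt, "response": reflection})
    return examples
```

The output of synthesize_reflections would then be the candidate synthetic training set whose problems are discussed below.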

And I think we have no equivalent of that, again, with LLMs.

They don't really do that, but I'd love to see, during pre-training, some kind of stage that thinks through the material, tries to reconcile it with what it already knows, thinks it through for some amount of time, and gets that to work.

And so there's no equivalent of any of this.

This is all research.

There are some subtle, very subtle reasons, which I think are very hard to understand, why it's not trivial.

So if I can just describe one.

Why can't we just synthetically generate and train on it?

Well, because with every synthetic example, if I just give you a synthetic generation of the model thinking about a book, you look at it and you're like, this looks great.

Why can't I train on it?

Well, you could try, but the model will actually get much worse if you continue trying.

And that's because all of the samples you get from models are silently collapsed.

This is not obvious if you look at any individual example, but they silently occupy a very tiny manifold of the possible space of thoughts about the content.
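
A toy way to see this kind of collapse, far removed from an actual LLM: repeatedly fit a simple model to samples drawn from its previous self. The sketch below uses a one-dimensional Gaussian purely for illustration; each generation of samples looks plausible on its own, but the spread quietly shrinks.

```python
# Toy illustration (not an LLM): recursively fit a Gaussian to samples
# drawn from the previous generation's fit. Each generation looks fine
# in isolation, but diversity (the standard deviation) quietly collapses.

import random
import statistics

def fit(samples):
    """'Train' the model: estimate mean and standard deviation."""
    return statistics.mean(samples), statistics.pstdev(samples)

def sample(mu, sigma, n):
    """'Generate synthetic data' from the current model."""
    return [random.gauss(mu, sigma) for _ in range(n)]

random.seed(0)
data = sample(0.0, 1.0, 100)           # generation 0: "real" data, std = 1.0
for gen in range(1, 301):
    mu, sigma = fit(data)              # train on whatever data we have
    data = sample(mu, sigma, 100)      # replace the data with our own samples
    if gen % 100 == 0:
        print(f"generation {gen}: fitted std = {sigma:.3f}")
# The printed std tends to shrink steadily toward zero: each round of
# training on self-generated samples loses a little of the original spread.
```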

So the LLMs, when they come off, they're what we call collapsed.

They have a collapsed data distribution.

If you sample from them, one easy way to see it is to go to ChatGPT and ask it, tell me a joke.

It only has like three jokes.
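
The joke example is easy to check empirically. Below is a sketch that asks the same prompt many times and counts distinct replies, assuming the OpenAI Python SDK (openai >= 1.0); the model name is a placeholder, and exact string matching will if anything undercount the collapse, since the same joke can come back phrased slightly differently.

```python
# Sketch: sample "Tell me a joke." many times from a chat model and count
# how many distinct jokes come back. Assumes the OpenAI Python SDK and an
# OPENAI_API_KEY in the environment; the model name is a placeholder.

from collections import Counter
from openai import OpenAI

client = OpenAI()

def distinct_jokes(n: int = 50, model: str = "gpt-4o-mini") -> Counter:
    """Ask the same prompt n times and tally the (normalized) replies."""
    counts = Counter()
    for _ in range(n):
        resp = client.chat.completions.create(
            model=model,
            messages=[{"role": "user", "content": "Tell me a joke."}],
            temperature=1.0,
        )
        joke = resp.choices[0].message.content.strip().lower()
        counts[joke] += 1
    return counts

if __name__ == "__main__":
    counts = distinct_jokes()
    print(f"{len(counts)} distinct jokes out of {sum(counts.values())} samples")
    for joke, k in counts.most_common(5):
        print(f"{k:3d}x  {joke[:60]}")
```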