
Dwarkesh Patel

👤 Speaker
14445 total appearances


Podcast Appearances

Dwarkesh Podcast
Andrej Karpathy – AGI is still a decade away

…which then does the lifetime learning. Now, maybe the lifetime learning is not analogous to RL, to your point. Is that compatible with the thing you were saying, or would you disagree with that? Just to steelman the other perspective, because after doing this in an interview and thinking about it a bit, he has an important point here: evolution does not really give us the knowledge, right? It gives us the algorithm to find the knowledge, and that seems different from pre-training. So if the perspective is that pre-training helps build the kind of entity which can learn better, that it teaches meta-learning, then it is similar to finding an algorithm. But if it's that evolution gives us knowledge and pre-training gives us knowledge, that analogy seems to break down.

There's so much interesting stuff there. Okay, so let's start with in-context learning. This is an obvious point, but I think it's worth saying explicitly and meditating on. The situation in which these models seem the most intelligent, where I talk to them and think, wow, there's really something on the other end that's responding to me and thinking about things, is when it makes a mistake and goes, oh wait, that's actually the wrong way to think about it, I'm backing up. All of that is happening in context. That's where I feel like you can visibly see the real intelligence. And that in-context learning process is developed by gradient descent during pre-training, right? It spontaneously meta-learns in-context learning.
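A minimal toy sketch of the distinction being discussed, not from the episode and not how a transformer actually implements it internally: during pre-training, gradient descent changes the weights; during in-context learning, the weights stay frozen and the adaptation comes entirely from the examples sitting in the context at inference time. The linear model and the closed-form "adaptation" below are illustrative assumptions only.

```python
import numpy as np

# Toy contrast between the two kinds of "learning" discussed above.
# 1) Pre-training: an outer loop of gradient descent that changes the weights.
# 2) In-context learning: the weights are frozen; the model adapts only
#    because new examples sit in its context at inference time.

rng = np.random.default_rng(0)

# --- outer loop: "pre-training" a linear model y = w * x by gradient descent ---
w = 0.0
for _ in range(200):
    x = rng.uniform(-1.0, 1.0, size=32)
    y = 3.0 * x                           # the distribution the weights absorb
    grad = np.mean(2.0 * (w * x - y) * x)
    w -= 0.1 * grad                       # weights change here, and only here

print(f"weight after pre-training: {w:.2f}")   # ~3.0

# --- inner loop: "in-context learning" with frozen weights ---
# The frozen model is handed examples inside its input and must adapt without
# any gradient step. The closed-form slope fit below is a stand-in for whatever
# a trained network does internally when it reads its context.
def predict_in_context(context_x, context_y, query_x, frozen_w):
    # frozen_w is deliberately unused: the point is that the adaptation
    # happens without touching the weights at all.
    slope_from_context = np.sum(context_x * context_y) / np.sum(context_x**2)
    return slope_from_context * query_x

ctx_x = np.array([1.0, 2.0, 3.0])
ctx_y = -5.0 * ctx_x                      # a pattern never seen in "pre-training"
print(predict_in_context(ctx_x, ctx_y, query_x=4.0, frozen_w=w))   # ~ -20.0
```

The hedge matters: in a real model the in-context adaptation is an emergent behavior of the trained weights rather than a hand-written formula; the sketch only separates where gradient descent acts (pre-training) from where it does not (in context).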