Dwarkesh Patel

Speaker
15444 total appearances

Podcast Appearances

Dwarkesh Podcast
Andrej Karpathy — AGI is still a decade away

Now, maybe the lifetime learning is not analogous to RL, to your point.

Is that compatible with the thing you were saying, or would you disagree with that?

Just to steelman the other perspective, because after doing this in an interview and thinking about it a bit, he has an important point here.

Evolution does not give us the knowledge, really, right?

It gives us the algorithm to find the knowledge.

And that seems different from pre-training.

So if perhaps the perspective is that pre-training helps build the kind of entity which can learn better, it teaches meta-learning, and therefore it is similar to finding an algorithm.

But if it's like evolution gives us knowledge and pre-training gives us knowledge, that analogy seems to break down.

There's so much interesting stuff there.

Okay, so let's start with in-context learning.

This is an obvious point, but I think it's worth just like saying it explicitly and meditating on it.

The situation in which these models seem the most intelligent, in which I talk to them and I'm like, "Wow, there's really something on the other end that's responding to me, thinking about things."

If it makes a mistake, it's like, "Oh, wait, that's actually the wrong way to think about it. I'm backing up."

All that is happening in context.

That's where I feel like the real intelligence you can like visibly see.

And that in-context learning process is developed by gradient descent on pre-training, right?

Like, it spontaneously meta-learns in-context learning.

But the in-context learning itself is not gradient descent, in the same way that our lifetime intelligence as humans, our ability to do things, is conditioned by evolution.