Dwarkesh Patel

Speaker
15444 total appearances

Podcast Appearances

Dwarkesh Podcast
Andrej Karpathy — AGI is still a decade away

The best learners that we are aware of, which are children, are extremely bad at recollecting information.

In fact, at the very earliest stages of childhood, you will forget everything.

You're just an amnesiac about everything that happens before a certain age.

But you're extremely good at picking up new languages and learning from the world.

And maybe there's some element of being able to see the forest for the trees.

Whereas if you compare it to the opposite end of the spectrum, you have...

LLM pre-training, where these models are literally able to regurgitate, word for word, what comes next in a Wikipedia page.

But their ability to learn abstract concepts really quickly the way a child can is much more limited.

And then adults are somewhere in between: they don't have the flexibility of childhood learning, but they can memorize facts and information in a way that is harder for kids.

And I don't know if there's something interesting about that.

And this is also relevant to preventing model collapse.

Let me think.

Hmm.

What is a solution to model collapse?

I mean, there are very naive things you could attempt.

It's just like the distribution over logits should be wider or something.
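
One naive reading of "the distribution over logits should be wider" can be sketched with temperature scaling. This is only an illustration, not something the speakers propose as a fix: the logit values and the choice of temperature are assumptions made up for the example, and an entropy bonus in the training loss would be another way to express the same idea.

```python
import math

def softmax(logits, temperature=1.0):
    """Convert logits to probabilities; temperature > 1 flattens the result."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def entropy(probs):
    """Shannon entropy in nats; higher means a wider distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)

# Hypothetical next-token logits, invented for illustration only.
logits = [4.0, 2.0, 1.0, 0.5]

sharp = softmax(logits, temperature=1.0)
wide = softmax(logits, temperature=2.0)  # assumed temperature, not from the source

print("entropy at T=1:", entropy(sharp))
print("entropy at T=2:", entropy(wide))
```

Raising the temperature increases the entropy of the sampled distribution, which is the "wider" behavior described, but it does nothing on its own to keep the extra samples useful.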

Like, there are many naive things you could try.

What ends up being the problem with the naive approaches?

In fact, it's actively penalized, right?

If you're like super creative in RL, it's like not good.