Andrej Karpathy

Speaker
3419 total appearances

Podcast Appearances

Dwarkesh Podcast
Andrej Karpathy - AGI is still a decade away

So incredible.

And that was like two years, three years of work.

And now came RL.

And RL allows you to do a bit better than just imitation learning, right?

Because you can have these reward functions and you can hill climb on the reward functions.

And so some problems have just correct answers.

You can hill climb on that without getting expert trajectories to imitate.
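As a rough illustration of hill climbing on a reward with a known correct answer, here is a minimal sketch. It is not anything described in the episode: the toy task (pick the right answer out of ten), the `reward` and `train` functions, and the REINFORCE-style update are all assumptions chosen for brevity. The point is only that the policy never sees an expert trajectory, just a 0/1 signal for whether its sampled answer was correct.

```python
import math
import random

def softmax(logits):
    # Numerically stable softmax over a list of floats.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def reward(answer, correct=7):
    # Verifiable reward (hypothetical task): 1 if the answer is correct, else 0.
    return 1.0 if answer == correct else 0.0

def train(num_answers=10, steps=2000, lr=0.5, seed=0):
    rng = random.Random(seed)
    logits = [0.0] * num_answers  # uniform policy to start
    for _ in range(steps):
        probs = softmax(logits)
        # Sample an answer from the current policy; no expert demonstration is used.
        action = rng.choices(range(num_answers), weights=probs)[0]
        r = reward(action)
        # REINFORCE update: d log pi(action) / d logit_k = 1[k == action] - probs[k]
        for k in range(num_answers):
            grad = (1.0 if k == action else 0.0) - probs[k]
            logits[k] += lr * r * grad
    return softmax(logits)

probs = train()
best = max(range(len(probs)), key=probs.__getitem__)
print(best, round(probs[best], 3))
```

Because the reward is zero for every wrong answer, the policy only moves when it happens to sample the correct one, which is workable here but hints at why sparse end-of-task rewards become a problem on longer tasks.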

So that's amazing.

And the model can also discover solutions that the human might never come up with.

So this is incredible.

And yet, it's still stupid.

So I think we need more.

And so I saw a paper from Google yesterday that tried to have this reflect-and-review idea in mind.

Was it the memory bank paper or something?

I don't know.

I've actually seen a few papers along these lines.

So I expect there to be some kind of a major update to how we do algorithms for LLMs coming in that realm.

And then I think we need three or four or five more.

Something like that.

So process-based supervision just refers to the fact that we're not going to have a reward function only at the very end. After you've done 10 minutes of work, I'm not going to tell you you did well or not well.
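
To make the contrast concrete, here is a small sketch comparing a reward given only at the end with a reward given at every step. Everything in it is a hypothetical illustration rather than something from the episode: the arithmetic task, the `(a, op, b, claimed)` step format, and the function names `outcome_rewards` and `process_rewards` are assumptions.

```python
import operator

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}

def step_is_correct(step):
    # A step is (a, op, b, claimed); it is correct if the arithmetic checks out.
    a, op, b, claimed = step
    return OPS[op](a, b) == claimed

def outcome_rewards(steps, expected_final):
    # Outcome-based: a single signal from the final result, broadcast to every step.
    final_value = steps[-1][3]
    r = 1.0 if final_value == expected_final else 0.0
    return [r] * len(steps)

def process_rewards(steps):
    # Process-based: each step gets its own signal.
    return [1.0 if step_is_correct(s) else 0.0 for s in steps]

# Target: (2 + 3) * 4 - 6 = 14, with a slip in the second step.
trajectory = [
    (2, "+", 3, 5),    # correct
    (5, "*", 4, 21),   # wrong: 5 * 4 is 20
    (21, "-", 6, 15),  # arithmetic is right, but it builds on the earlier slip
]

outcome = outcome_rewards(trajectory, expected_final=14)
process = process_rewards(trajectory)
print(outcome)  # [0.0, 0.0, 0.0]
print(process)  # [1.0, 0.0, 1.0]
```

The end-only signal penalizes all three steps equally, while the per-step signal points at the one step that actually went wrong. In practice the hard part is the step checker, which is trivial here only because the toy task is arithmetic.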