Andrej Karpathy

Speaker
3433 total appearances

Podcast Appearances

Dwarkesh Podcast
Andrej Karpathy — AGI is still a decade away

You can hill climb on that without getting expert trajectories to imitate.

So that's amazing.

And the model can also discover solutions that the human might never come up with.

So this is incredible.

And yet, it's still stupid.

So I think we need more.

And so I saw a paper from Google yesterday that tried to have this reflect-and-review idea in mind.

Was it the memory bank paper or something?

I don't know.

I've actually seen a few papers along these lines.

So I expect there to be some kind of a major update to how we do algorithms for LLMs coming in that realm.

And then I think we need three or four or five more.

Something like that.

So process-based supervision just refers to the fact that we're not going to have a reward function only at the very end. After you've done 10 minutes of work, I'm not going to just tell you whether you did well or not.

I'm going to tell you at every single step of the way how well you're doing.
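
As a rough illustration of the contrast described here, the sketch below shows the shape of the two reward signals. The function names and the toy `score_step` scorer are hypothetical and not from the conversation. A real per-step scorer is exactly the hard part.

```python
from typing import Callable, List


def outcome_rewards(steps: List[str], final_correct: bool) -> List[float]:
    """Outcome-based: zero for every step, one number at the very end."""
    if not steps:
        return []
    return [0.0] * (len(steps) - 1) + [1.0 if final_correct else 0.0]


def process_rewards(steps: List[str],
                    score_step: Callable[[str], float]) -> List[float]:
    """Process-based: every step gets its own score from some scorer."""
    return [score_step(step) for step in steps]


# Toy stand-in scorer (an assumption for illustration only):
# it rewards steps that state an equation.
def toy_score_step(step: str) -> float:
    return 1.0 if "=" in step else 0.0


steps = ["let x = 3", "think about it", "x + 4 = 7"]
print(outcome_rewards(steps, final_correct=True))
print(process_rewards(steps, toy_score_step))
```

The first list carries a single signal for the whole trajectory, while the second carries one per step.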

And this is basically the reason we don't have that.

It's tricky how you do that properly because you have partial solutions and you don't know how to assign credit.

So when you get the right answer, it's just an equality match to the answer.

Very simple to implement.
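
A minimal sketch of what such an equality-match reward could look like. The function name and the whitespace and case normalization are assumptions for illustration, not something specified in the conversation.

```python
def outcome_reward(model_answer: str, reference_answer: str) -> float:
    """Return 1.0 if the final answer matches the reference, else 0.0."""
    # Light normalization (an assumption): ignore surrounding
    # whitespace and letter case before comparing.
    def normalize(text: str) -> str:
        return text.strip().lower()

    if normalize(model_answer) == normalize(reference_answer):
        return 1.0
    return 0.0


print(outcome_reward(" 42 ", "42"))
print(outcome_reward("41", "42"))
```

Nothing in this check looks at the intermediate steps, which is why it is easy to automate and why it gives no partial credit.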

If you're doing process supervision, how do you assign partial credit in an automatable way?