
Andrej Karpathy

👤 Speaker · 3419 total appearances


Podcast Appearances

Dwarkesh Podcast
Andrej Karpathy — AGI is still a decade away

And so the way I like to put it is that you're sucking supervision through a straw: you've done all this work that could take a minute to roll out, and you're sucking the bits of supervision of the final reward signal through a straw. You broadcast that single signal across the entire trajectory and use it to upweight or downweight the whole thing.
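
That "broadcast one scalar over the whole trajectory" step is roughly what REINFORCE-style policy gradients do. A minimal sketch in PyTorch; the toy policy, vocabulary size, and random trajectory are illustrative assumptions, not anything from the episode:

```python
# Sketch of the "supervision through a straw" point: one scalar reward for a
# whole rollout is broadcast across every token, upweighting or downweighting
# the entire trajectory at once.
import torch
import torch.nn as nn

vocab_size, hidden = 100, 32
policy = nn.Sequential(nn.Embedding(vocab_size, hidden),
                       nn.Linear(hidden, vocab_size))

def reinforce_loss(tokens: torch.Tensor, reward: float) -> torch.Tensor:
    # tokens: (T,) one sampled trajectory; reward: scalar for the final outcome
    logits = policy(tokens[:-1])                       # predict each next token
    logp = torch.log_softmax(logits, dim=-1)
    token_logp = logp.gather(1, tokens[1:, None]).squeeze(1)
    # One scalar stretched over T steps: every token in the rollout gets the
    # same credit or blame, regardless of which parts were actually good.
    return -(reward * token_logp).sum()

trajectory = torch.randint(0, vocab_size, (64,))       # stand-in for a long rollout
loss = reinforce_loss(trajectory, reward=1.0)          # final reward only
loss.backward()
```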

It's crazy. A human would never do this. Number one, a human would never do hundreds of rollouts. Number two, when a person finds a solution, they have a pretty complicated process of review: okay, these parts I did well, these parts I did not do that well, I should probably do this or that. And they think through things. There's nothing in current LLMs that does this. There's no equivalent of it. But I do see papers popping up that are trying to do this, because it's obvious to everyone in the field.
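
The transcript doesn't name a specific method, but the kind of review loop those papers explore might look roughly like this hypothetical sketch, where `llm` is a stand-in for any text-completion call and all prompts are my own illustrative assumptions:

```python
def llm(prompt: str) -> str:
    """Placeholder for a real model call (an API or a local model)."""
    raise NotImplementedError

def solve_with_review(problem: str, n_rollouts: int = 4) -> str:
    # Roll out several attempts, as current RL pipelines do.
    attempts = [llm(f"Solve step by step:\n{problem}") for _ in range(n_rollouts)]
    # The missing "review" stage: instead of one scalar reward, ask for an
    # explicit critique of which parts went well and which did not.
    critique = llm(
        "Review these attempts. Which parts were done well, which were not, "
        "and what should be done differently?\n\n" + "\n---\n".join(attempts)
    )
    # Feed the distilled lessons back into a final attempt.
    return llm(f"Lessons from review:\n{critique}\n\nNow solve:\n{problem}")
```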

So I kind of see it this way: the first imitation learning, by the way, was extremely surprising, miraculous, and amazing, the fact that we can fine-tune by imitating humans. And that was incredible, because in the beginning, all we had was base models. Base models are autocomplete.
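
To make "base models are autocomplete" concrete, a minimal sketch using the Hugging Face transformers library, with gpt2 as a stand-in base model (my choice, not the episode's):

```python
# A base model just continues the text; it does not answer.
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

prompt = "What is the capital of France?"
ids = tok(prompt, return_tensors="pt").input_ids
out = model.generate(ids, max_new_tokens=20, do_sample=False)
print(tok.decode(out[0]))
# A base model typically continues with more question-shaped text rather than
# replying, because it was trained only to predict the next token.
```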

And it wasn't obvious to me at the time; I had to learn this, and the paper that blew my mind was InstructGPT. Because it pointed out that, hey, you can take the pre-trained model, which is autocomplete, and if you just fine-tune it on text that looks like conversations, the model will very rapidly adapt to become very conversational. And it keeps all the knowledge from pre-training. This blew my mind, because I didn't understand that the model could adjust stylistically so quickly and become an assistant to a user through just a few loops of fine-tuning on that kind of data.
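
A hedged sketch of that fine-tuning step, in the spirit of InstructGPT's supervised stage: take the autocomplete model and train it with the ordinary next-token loss on conversation-shaped text. The two-example dataset, formatting, and hyperparameters are illustrative assumptions:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
tok.pad_token = tok.eos_token
model = AutoModelForCausalLM.from_pretrained("gpt2")

conversations = [  # tiny illustrative SFT set; real data would be far larger
    "User: What is 2 + 2?\nAssistant: 2 + 2 is 4.",
    "User: Name a primary color.\nAssistant: Red is a primary color.",
]

opt = torch.optim.AdamW(model.parameters(), lr=1e-5)
for text in conversations:  # a few loops of fine-tuning on conversational data
    batch = tok(text, return_tensors="pt")
    # Standard next-token loss; only the *style* of the data changes, which is
    # why the knowledge from pre-training carries over.
    loss = model(**batch, labels=batch.input_ids).loss
    loss.backward()
    opt.step()
    opt.zero_grad()
```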

It was very miraculous to me that that worked.