Andrej Karpathy

Speaker
3433 total appearances

Podcast Appearances

Dwarkesh Podcast
Andrej Karpathy: AGI is still a decade away

How can I simplify it?

How can I have a simple thing that actually shows that thing, right?

It shows an action.

And then I can tack on the other terms.

Maybe an example from one of my repos that I think illustrates it well is called micrograd.

I don't know if you're familiar with this.

So micrograd is 100 lines of code that shows backpropagation.

You can create neural networks out of simple operations like plus and times, et cetera, Lego blocks of neural networks.

And you build up a computational graph, and you do a forward pass and a backward pass to get the gradients.
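
The forward and backward pass over a graph of plus and times nodes can be sketched as follows. This is a minimal illustration, not the actual micrograd source: the `Value` class, its attribute names, and the example expression are assumptions chosen for this sketch.

```python
class Value:
    """A scalar that remembers how it was computed (illustrative, micrograd-style)."""

    def __init__(self, data, parents=()):
        self.data = data                 # result of the forward pass
        self.grad = 0.0                  # d(output)/d(this node), filled in by backward()
        self._parents = parents          # nodes this value was computed from
        self._backward = lambda: None    # leaf nodes have nothing to propagate

    def __add__(self, other):
        out = Value(self.data + other.data, (self, other))

        def _backward():
            # d(a + b)/da = 1 and d(a + b)/db = 1
            self.grad += out.grad
            other.grad += out.grad

        out._backward = _backward
        return out

    def __mul__(self, other):
        out = Value(self.data * other.data, (self, other))

        def _backward():
            # d(a * b)/da = b and d(a * b)/db = a
            self.grad += other.data * out.grad
            other.grad += self.data * out.grad

        out._backward = _backward
        return out

    def backward(self):
        # Order the graph so every node comes after the nodes it depends on.
        order, seen = [], set()

        def visit(node):
            if id(node) not in seen:
                seen.add(id(node))
                for parent in node._parents:
                    visit(parent)
                order.append(node)

        visit(self)
        self.grad = 1.0                  # d(output)/d(output)
        for node in reversed(order):     # apply the chain rule from output to inputs
            node._backward()


# Forward pass: building the expression also builds the graph.
a, b, c = Value(2.0), Value(-3.0), Value(10.0)
y = a * b + c

# Backward pass: fills in the gradient of y with respect to each input.
y.backward()
print(y.data, a.grad, b.grad, c.grad)
```

Gradients are accumulated with `+=` so that a value used more than once in the expression receives the sum of its contributions.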

Now, this is at the heart of all neural network learning.

So micrograd is 100 lines of pretty interpretable Python code, and it can do the forward and backward pass for arbitrary neural networks, but not efficiently.

So micrograd, these 100 lines of Python, are everything you need to understand how neural networks train.

Everything else is just efficiency.

Everything else is efficiency.

And there's a huge amount of work to do efficiency.

You need your tensors, you lay them out, you stride them, you make sure your kernels are orchestrating memory movement correctly, et cetera.
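
As a rough illustration of what laying out and striding a tensor means, the sketch below stores a 2x3 tensor in a flat list and uses strides to map an index to a storage offset. The helper names `contiguous_strides` and `flat_offset` are hypothetical and chosen for this example; real tensor libraries do the same bookkeeping in far more optimized form.

```python
def contiguous_strides(shape):
    """Row-major strides, in elements, for a tensor of the given shape."""
    strides = []
    step = 1
    for dim in reversed(shape):
        strides.append(step)   # stepping along this axis skips `step` elements
        step *= dim
    return tuple(reversed(strides))


def flat_offset(index, strides):
    """Position in flat storage of the element at a multi-dimensional index."""
    return sum(i * s for i, s in zip(index, strides))


storage = list(range(6))              # flat memory holding a 2x3 tensor
shape = (2, 3)
strides = contiguous_strides(shape)   # (3, 1)

# Element at row 1, column 2 of the 2x3 view.
elem = storage[flat_offset((1, 2), strides)]

# A transpose can reuse the same storage: swap the shape and the strides, copy nothing.
t_shape, t_strides = (3, 2), (1, 3)
t_elem = storage[flat_offset((2, 1), t_strides)]

print(strides, elem, t_elem)
```

Both lookups read the same storage slot, which is why a transpose can be a change of metadata instead of a memory copy.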

It's all just efficiency, roughly speaking.

But the core intellectual sort of piece of neural network training is micrograd.

It's 100 lines.

You can easily understand it.