Andrej Karpathy
How can I simplify it?
How can I have a simple thing that actually shows that thing, right?
It shows it in action.
And then I can tack on the other terms.
Maybe an example from one of my repos that I think illustrates it well is called micrograd.
I don't know if you're familiar with this.
So micrograd is 100 lines of code that shows backpropagation.
You can create neural networks out of simple operations like plus and times, et cetera, Lego blocks of neural networks.
And you build up a computational graph, and you do a forward pass and a backward pass to get the gradients.
Now, this is at the heart of all neural network learning.
So micrograd is 100 lines of pretty interpretable Python code, and it can do the forward and backward passes for arbitrary neural networks, but not efficiently.
So micrograd, these 100 lines of Python, are everything you need to understand how neural networks train.
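As a rough illustration of the idea described here, a minimal scalar autograd sketch might look like the following. This is not the micrograd source: the class name `Scalar` and its attribute names are hypothetical, and only plus and times are supported. Each operation records its inputs to build the computational graph, and `backward` walks that graph in reverse to accumulate gradients with the chain rule.

```python
class Scalar:
    """A number that remembers which operation produced it (hypothetical name)."""

    def __init__(self, data, parents=()):
        self.data = float(data)
        self.grad = 0.0                  # d(output)/d(this node), filled in by backward()
        self._parents = parents          # nodes this one was computed from
        self._backward = lambda: None    # leaf nodes have nothing to propagate

    def __add__(self, other):
        other = other if isinstance(other, Scalar) else Scalar(other)
        out = Scalar(self.data + other.data, (self, other))

        def _backward():
            # d(a + b)/da = 1 and d(a + b)/db = 1
            self.grad += out.grad
            other.grad += out.grad

        out._backward = _backward
        return out

    def __mul__(self, other):
        other = other if isinstance(other, Scalar) else Scalar(other)
        out = Scalar(self.data * other.data, (self, other))

        def _backward():
            # d(a * b)/da = b and d(a * b)/db = a
            self.grad += other.data * out.grad
            other.grad += self.data * out.grad

        out._backward = _backward
        return out

    def backward(self):
        # Order the graph so every node comes after the nodes it depends on.
        order, seen = [], set()

        def visit(node):
            if node not in seen:
                seen.add(node)
                for parent in node._parents:
                    visit(parent)
                order.append(node)

        visit(self)
        self.grad = 1.0                  # d(output)/d(output)
        for node in reversed(order):     # apply the chain rule from output to inputs
            node._backward()


# Forward pass: build the graph while computing the value.
a, b, c = Scalar(2.0), Scalar(-3.0), Scalar(10.0)
loss = a * b + c

# Backward pass: fill in .grad on every node that fed into loss.
loss.backward()
print(loss.data, a.grad, b.grad, c.grad)
```

Gradients are accumulated with `+=` so that a value used more than once in the graph receives the sum of its contributions.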
Everything else is just efficiency.
Everything else is efficiency.
And there's a huge amount of work that goes into efficiency.
You need your tensors, you lay them out, you stride them, you make sure your kernels are orchestrating memory movement correctly, et cetera.
It's all just efficiency, roughly speaking.
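To make the point about laying out and striding tensors a little more concrete, here is a small sketch of how strides map a multi-dimensional index onto flat storage. The helper names `row_major_strides` and `flat_offset` are hypothetical, and a plain Python list stands in for the tensor's memory; real libraries do the same arithmetic in optimized kernels.

```python
def row_major_strides(shape):
    """Strides, counted in elements, for a contiguous row-major layout."""
    strides = []
    step = 1
    for dim in reversed(shape):
        strides.append(step)   # moving one step along this axis skips `step` elements
        step *= dim
    return tuple(reversed(strides))


def flat_offset(index, strides):
    """Position in flat storage of the element at a multi-dimensional index."""
    return sum(i * s for i, s in zip(index, strides))


# A 2x3 tensor stored as one flat buffer.
shape = (2, 3)
storage = [10, 11, 12,
           20, 21, 22]

strides = row_major_strides(shape)
value = storage[flat_offset((1, 2), strides)]        # row 1, column 2

# A transposed view reuses the same storage with the strides swapped,
# so no data is copied.
t_strides = strides[::-1]
t_value = storage[flat_offset((2, 1), t_strides)]    # same element, seen through the view

print(strides, value, t_value)
```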
But the core intellectual sort of piece of neural network training is micrograd.
It's 100 lines.
You can easily understand it.