Andrej Karpathy
You can talk about the heat dissipation of that, because your heat dissipation grows as the surface area, which grows as the square, but your heat creation or generation grows as the cube.
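The scaling argument can be made concrete with a little arithmetic. This is a sketch of my own (not from the conversation), using a cube of side L as the example shape: surface area goes as L squared, volume (and so heat generation) as L cubed, so their ratio grows linearly with size.

```python
# Heat generation scales with volume (L^3), dissipation with surface area (L^2),
# so the generation-to-dissipation ratio grows linearly with the linear size L.
for L in [1, 2, 4, 8]:
    surface = 6 * L**2   # surface area of a cube of side L
    volume = L**3        # volume of the same cube
    print(L, surface, volume, volume / surface)  # ratio = L / 6
```

Doubling the size doubles how much heat each unit of surface must shed, which is the first-order term driving the whole problem.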
And so I just feel like physicists have all the right cognitive tools to approach problem solving in the world.
So I think because of that training, I always try to find the first order terms or the second order terms of everything.
When I'm observing a system or a thing, I have a tangled web of ideas and knowledge in my mind.
And I'm trying to find what is the thing that actually matters?
What is the first order component?
How can I simplify it?
How can I have a simple thing that actually shows that thing in action, right?
And then I can tack on the other terms.
Maybe an example from one of my repos that I think illustrates it well is called micrograd.
I don't know if you're familiar with this.
So micrograd is 100 lines of code that shows backpropagation.
You can create neural networks out of simple operations like plus and times, et cetera, the Lego blocks of neural networks.
And you build up a computational graph, and you do a forward pass and a backward pass to get the gradients.
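To make the idea concrete, here is a minimal sketch in the spirit of micrograd, written for this transcript rather than copied from the repo: a scalar `Value` class (the name follows micrograd's convention) that records the computational graph as you combine values, then runs the chain rule backward over it.

```python
class Value:
    """A scalar with autograd, in the spirit of micrograd (illustrative sketch)."""
    def __init__(self, data, _children=()):
        self.data = data
        self.grad = 0.0
        self._backward = lambda: None  # how to route out.grad to the children
        self._prev = set(_children)

    def __add__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        out = Value(self.data + other.data, (self, other))
        def _backward():
            # d(out)/d(self) = 1 and d(out)/d(other) = 1
            self.grad += out.grad
            other.grad += out.grad
        out._backward = _backward
        return out

    def __mul__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        out = Value(self.data * other.data, (self, other))
        def _backward():
            # d(out)/d(self) = other.data, d(out)/d(other) = self.data
            self.grad += other.data * out.grad
            other.grad += self.data * out.grad
        out._backward = _backward
        return out

    def backward(self):
        # topological order over the graph, then apply the chain rule node by node
        topo, visited = [], set()
        def build(v):
            if v not in visited:
                visited.add(v)
                for child in v._prev:
                    build(child)
                topo.append(v)
        build(self)
        self.grad = 1.0
        for v in reversed(topo):
            v._backward()

# Forward pass: c = a * b + a, then backward pass to get gradients.
a, b = Value(2.0), Value(3.0)
c = a * b + a
c.backward()
print(c.data, a.grad, b.grad)  # 8.0, dc/da = b + 1 = 4.0, dc/db = a = 2.0
```

Plus and times are enough to show the whole mechanism: every operation records its inputs and how to pass gradients to them, and `backward` just replays that in reverse topological order.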
Now, this is at the heart of all neural network learning.
So micrograd is 100 lines of pure, interpretable Python code, and it can do forward and backward passes on arbitrary neural networks, just not efficiently.
So micrograd, these 100 lines of Python, are everything you need to understand how neural networks train.
Everything else is just efficiency.
Everything else is efficiency.
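The "training is just gradients" point can be shown in a few lines. This is a self-contained illustration of my own, not from micrograd: gradient descent on a one-parameter function with a hand-derived gradient, which is the same loop every neural network framework runs, just without the efficiency machinery.

```python
# Minimize f(w) = (w - 3)^2 by gradient descent.
# The gradient is derived by hand: df/dw = 2 * (w - 3).
w = 0.0       # initial parameter
lr = 0.1      # learning rate
for _ in range(100):
    grad = 2 * (w - 3)   # what backpropagation would compute automatically
    w -= lr * grad       # the update rule: step against the gradient
print(round(w, 4))  # converges toward the minimum at w = 3.0
```

Swap the hand-derived gradient for backpropagation over a computational graph, and the scalar for millions of parameters, and this loop is all of deep learning; the rest is making it fast.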