Andrej Karpathy
That's how it was originally developed.
At the end of the day, it's a mathematical expression.
And it's a fairly simple mathematical expression when you get down to it.
It's basically a sequence of matrix multiplies, which are really dot products mathematically.
And some non-linearity is thrown in.
And so it's a very simple mathematical expression.
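As a rough illustration of the description above, here is a minimal sketch of a neural net's forward pass as a sequence of matrix multiplies with a non-linearity in between. The function names, the choice of tanh, and the weight values are assumptions made for this example and do not come from the conversation.

```python
import math

def matvec(W, x):
    # Each output entry is the dot product of one row of W with the input x.
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def forward(layers, x):
    # layers is a list of weight matrices (the "knobs").
    # tanh is applied after every layer except the last one.
    for i, W in enumerate(layers):
        x = matvec(W, x)
        if i < len(layers) - 1:
            x = [math.tanh(v) for v in x]
    return x

# Hypothetical weights: a 3x2 matrix followed by a 1x3 matrix.
W1 = [[0.5, -1.0], [1.5, 0.25], [-0.75, 2.0]]
W2 = [[1.0, -2.0, 0.5]]

y = forward([W1, W2], [1.0, 2.0])
print(y)
```

Written out, the whole net here is the single expression `W2 @ tanh(W1 @ x)`, which is the sense in which it is "a mathematical expression".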
And it's got knobs in it.
Many knobs.
Many knobs.
And these knobs are loosely related to basically the synapses in your brain.
They're trainable.
They're modifiable.
And so the idea is we need to find the setting of the knobs that makes the neural net do whatever you want it to do, like classify images and so on.
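To make "finding the setting of the knobs" concrete, here is a toy sketch with only two knobs, `w` and `b`, tuned by plain gradient descent on a squared error. The conversation does not name a tuning method at this point, so gradient descent, the learning rate, and the target function `2x + 1` are all assumptions chosen for illustration; a real neural net has far more knobs and a non-linear expression.

```python
def train(data, steps=2000, lr=0.05):
    # Two knobs, both starting at zero.
    w, b = 0.0, 0.0
    n = len(data)
    for _ in range(steps):
        # Gradients of the mean squared error of (w * x + b) against y.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / n
        # Nudge each knob a small step in the direction that lowers the error.
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

# Hypothetical data generated from y = 2x + 1.
data = [(x, 2.0 * x + 1.0) for x in [-2.0, -1.0, 0.0, 1.0, 2.0]]

w, b = train(data)
print(w, b)
```

After training, `w` and `b` should sit very close to 2 and 1, the setting that makes this tiny expression reproduce the data.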
And so there's not too much mystery, I would say, in it.
You might think that...
Basically, you don't want to endow it with too much meaning with respect to the brain and how it works.
It's really just a complicated mathematical expression with knobs, and those knobs need a proper setting for it to do something desirable.
Yeah, I think that's fair.
So basically, I'm underselling it by a lot because you definitely do get very surprising emergent behaviors out of these neural nets when they're large enough and trained on complicated enough problems.
Like, say, for example, next-word prediction on a massive dataset from the internet.