Andrej Karpathy
And when you give them a hard enough problem, they are forced to learn very interesting solutions in the optimization.
And those solutions basically have these emergent properties that are very interesting.
It's a lot of knobs.
And, you know, speaking concretely, the neural nets that people are very excited about right now are GPTs, which are basically just next-word prediction networks.
So you consume a sequence of words from the Internet and you try to predict the next word.
And once you train these on a large enough data set...
you can basically prompt these neural nets in arbitrary ways and you can ask them to solve problems and they will.
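To make the objective concrete, here is a minimal sketch of next-word prediction as a training loss. The toy model and random "data" are stand-ins of my choosing (a real GPT is a deep Transformer trained on internet text), but the cross-entropy-on-the-next-token objective is the same idea.

```python
# Minimal sketch of the next-token prediction objective (PyTorch).
# The model here is a toy stand-in; real GPTs are deep Transformers,
# but they are trained with this same cross-entropy loss.
import torch
import torch.nn as nn

vocab_size, embed_dim, context_len = 100, 32, 8

class ToyLM(nn.Module):
    """Embed the context, pool it, and predict a distribution over the next token."""
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.head = nn.Linear(embed_dim, vocab_size)

    def forward(self, tokens):                  # tokens: (batch, context_len)
        h = self.embed(tokens).mean(dim=1)      # (batch, embed_dim)
        return self.head(h)                     # logits: (batch, vocab_size)

model = ToyLM()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

# Random token ids standing in for a batch of text from the internet.
batch = torch.randint(0, vocab_size, (64, context_len + 1))
ctx, target = batch[:, :-1], batch[:, -1]

opt.zero_grad()
loss = loss_fn(model(ctx), target)  # how surprised the model is by the true next word
loss.backward()
opt.step()
```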
So you can just make it look like you're trying to solve some kind of mathematical problem, and they will continue what they think is the solution based on what they've seen on the internet.
And very often those solutions look remarkably consistent, and potentially even correct.
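As an illustration of this prompting-as-continuation behavior, the sketch below uses the small public GPT-2 model through Hugging Face's pipeline API; the model and prompt are my own example, not something from this conversation.

```python
# Illustration of prompting as continuation, using the small public GPT-2
# model via Hugging Face's pipeline API. Small models will often get the
# arithmetic wrong, which is exactly the "looks correct, potentially"
# behavior described above.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")

# Frame the prompt so that the most plausible continuation *is* a solution.
prompt = "Q: What is 12 times 9?\nA:"
out = generator(prompt, max_new_tokens=20, do_sample=False)
print(out[0]["generated_text"])
```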
I would say I'm much more hesitant with analogies to the brain than I think you would typically see in the field.
And I feel like, certainly, the way neural networks started, everything stemmed from inspiration from the brain.
But at the end of the day, the artifacts that you get after training, they are arrived at by a very different optimization process than the optimization process that gave rise to the brain.
And so I kind of think of it as a very complicated alien artifact.
It's something different.
The neural nets that we're training are complicated alien artifacts.
I don't make analogies to the brain, because I think the optimization process that gave rise to them is very different from the one that gave rise to the brain.
So there was no multi-agent self-play kind of setup, no evolution.
It was an optimization that basically amounts to a compression objective on a massive amount of data.
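The compression framing is more than a metaphor: the next-token cross-entropy loss, measured in bits, is the average code length the model would need to encode the data. A back-of-the-envelope conversion, with a made-up loss value for illustration, looks like this:

```python
# Loss-as-compression, back of the envelope: cross-entropy in nats,
# converted to bits, is the average number of bits an arithmetic coder
# would need per token to encode the data under the model.
import math

loss_nats = 3.2                            # hypothetical average training loss
bits_per_token = loss_nats / math.log(2)   # nats -> bits
print(f"{bits_per_token:.2f} bits per token")  # lower loss = better compression
```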