Andrej Karpathy
Podcast Appearances
And then these neural nets take on pretty surprising magical properties.
Yeah, I think it's kind of interesting how much you can get out of even very simple mathematical formalism.
Well, it's definitely some kind of a generative model that's GPT-like and prompted by you.
So you're giving me a prompt, and I'm kind of responding to it in a generative way.
Well, it definitely feels like you're referencing some kind of a declarative structure of memory and so on.
And then... you're putting that together with your prompt and giving away some answers.
Nothing, basically, right?
Yeah, could be.
I mean, I'm using phrases that are common, etc., but I'm remixing it into a pretty sort of unique sentence at the end of the day.
But you're right, definitely there's like a ton of remixing.
I mean, it's kind of interesting because I'm simultaneously underselling them, but I also feel like there's an element to which I'm over... like, it's actually kind of incredible that you can get so much emergent magical behavior out of them despite them being so simple mathematically.
So I think those are kind of like two surprising statements that are kind of juxtaposed together.
And I think basically what it is, is we are actually fairly good at optimizing these neural nets.
And when you give them a hard enough problem, they are forced to learn very interesting solutions in the optimization.
And those solutions basically have these emergent properties that are very interesting.
It's a lot of knobs.
And somehow, you know, so speaking concretely, one of the neural nets that people are very excited about right now are GPTs, which are basically just next word prediction networks.
So you consume a sequence of words from the Internet and you try to predict the next word.
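To make the "predict the next word" objective concrete, here is a minimal sketch. It is not a GPT or a neural net at all: it swaps in a simple bigram frequency counter as a stand-in, and the function names (`train_bigram`, `predict_next`) and the toy corpus are hypothetical, chosen only for illustration.

```python
from collections import Counter, defaultdict

def train_bigram(text):
    """Count, for each word, which words follow it in the text."""
    words = text.lower().split()
    counts = defaultdict(Counter)
    # Walk over adjacent word pairs: (previous word, next word).
    for prev, nxt in zip(words, words[1:]):
        counts[prev][nxt] += 1
    return counts

def predict_next(counts, word):
    """Return the most frequent follower of `word`, or None if unseen."""
    followers = counts.get(word.lower())
    if not followers:
        return None
    return followers.most_common(1)[0][0]

# Toy corpus (an assumption for the example, standing in for Internet text).
corpus = "the cat sat on the mat and the cat slept"
model = train_bigram(corpus)
print(predict_next(model, "the"))  # prints "cat" ("cat" follows "the" twice, "mat" once)
```

A GPT plays the same game, but conditions on a long sequence of preceding words rather than one, and learns its predictions by optimizing a neural net's parameters instead of counting.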
And once you train these on a large enough data set...