Andrej Karpathy
Podcast Appearances
And then there's all kinds of little insights peppered in on how to do it properly.
Yeah, the data engine is what I call the almost biological feeling process by which you perfect the training sets for these neural networks.
So because most of the programming is now at the level of these data sets, and making sure they're large, diverse, and clean, basically you have a data set that you think is good.
You train your neural net, you deploy it, and then you observe how well it's performing.
And you're trying to always increase the quality of your data set.
So you're trying to catch scenarios that are basically rare.
And it is in these scenarios that neural nets will typically struggle, because they weren't told what to do in those rare cases in the data set.
But now you can close the loop because if you can now collect all those at scale, you can then feed them back into the reconstruction process I described and reconstruct the truth in those cases and add it to the dataset.
And so the whole thing ends up being like a staircase of improvement, of perfecting your training set.
And you have to go through deployments so that you can mine the parts that are not yet represented well in the dataset.
So your data set is basically imperfect.
It needs to be diverse.
It has pockets that are missing, and you need to pad out the pockets.
You can sort of think of it that way.
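The loop described above (train, deploy, mine the rare cases, label them, add them back) can be sketched in a few lines. This is only a toy illustration, not the actual pipeline: the "model" merely remembers which scenarios it has labels for, and the names `train`, `deploy_and_mine`, `coverage`, `data_engine`, `label`, and `budget` are all hypothetical. The `label` callback stands in for the offline reconstruction step mentioned in the conversation.

```python
from collections import Counter

def train(dataset):
    """Toy 'model': the set of scenarios the data set has labels for."""
    return set(dataset)

def deploy_and_mine(model, fleet_stream):
    """Return deployment events the toy model does not cover (the rare cases)."""
    return [s for s in fleet_stream if s not in model]

def coverage(model, fleet_stream):
    """Fraction of deployment events the toy model handles."""
    return sum(s in model for s in fleet_stream) / len(fleet_stream)

def data_engine(dataset, fleet_stream, label, rounds, budget):
    """Run several train/deploy/mine/label rounds; return coverage per round."""
    history = []
    for _ in range(rounds):
        model = train(dataset)
        history.append(coverage(model, fleet_stream))
        failures = deploy_and_mine(model, fleet_stream)
        # Label the most frequent uncovered scenarios first, up to `budget` per round.
        for scenario, _count in Counter(failures).most_common(budget):
            dataset[scenario] = label(scenario)  # hypothetical labeling step
    return history

# Made-up deployment stream: one common scenario and a few rare ones.
fleet_stream = ["highway"] * 6 + ["rain"] * 2 + ["tunnel", "snow"]
dataset = {"highway": "label:highway"}
history = data_engine(dataset, fleet_stream,
                      label=lambda s: "label:" + s, rounds=4, budget=1)
print(history)
```

Each round fills one missing pocket, so the coverage values in `history` rise step by step, which is the staircase of improvement in miniature.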
in the data.
It really just comes down to extremely good execution from an engineering team who knows what they're doing.
They understand intuitively the philosophical insights underlying the data engine and the process by which the system improves.
And they know how to, again, delegate the strategy of the data collection and how that works, and then just make sure it's all extremely well executed.
And that's where most of the work is; it's not even the philosophizing or the research or the ideas of it.
It's just that extremely good execution is so hard when you're dealing with data at that scale.