Andrej Karpathy
Podcast Appearances
And you have to make sure it flies.
And you can squeeze in as much compute as you can into the tiny...
Yeah, I'm not sure if there's too many insights.
You're trying to create a neural net that will fit in what you have available, and you're always trying to optimize it.
And we talked a lot about it on AI Day, basically the triple backflips that the team is doing to make sure it all fits and utilizes the engine.
So I think it's extremely good engineering.
And then there's all kinds of little insights peppered in on how to do it properly.
Yeah, the data engine is what I call the almost biological feeling process by which you perfect the training sets for these neural networks.
So because most of the programming is now at the level of these data sets, making sure they're large, diverse, and clean, basically you have a data set that you think is good.
You train your neural net, you deploy it, and then you observe how well it's performing.
And you're trying to always increase the quality of your data set.
So you're trying to catch scenarios that are basically rare.
And it is in these scenarios that neural nets will typically struggle, because they weren't told what to do in those rare cases in the data set.
But now you can close the loop, because if you can collect all those at scale, you can then feed them back into the reconstruction process I described, reconstruct the truth in those cases, and add it to the data set.
And so the whole thing ends up being like a staircase of improvement, of perfecting your training set.
And you have to go through deployments so that you can mine the parts that are not yet represented well in the dataset.
So your data set is basically imperfect.
It needs to be diverse.
It has pockets that are missing, and you need to pad out the pockets.
You can sort of think of it that way.
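The loop described above (train, deploy, observe where the network struggles, add those rare cases back, repeat) can be sketched in a few lines of Python. This is a toy under loud assumptions, not the actual pipeline discussed in the conversation: scenarios are plain strings, the "model" simply memorizes the scenarios in its training set, and the reconstruction and labeling step is reduced to adding the mined scenario to the set. All names here (`train`, `deploy_and_mine`, `data_engine`, `scenario_pool`) are hypothetical.

```python
import random


def train(dataset):
    """Toy 'model' (assumption): it only handles scenarios seen in training."""
    return set(dataset)


def deploy_and_mine(model, stream):
    """Return the scenarios from deployment that the model could not handle."""
    return [s for s in stream if s not in model]


def data_engine(dataset, scenario_pool, weights, rounds=5, samples=200, seed=0):
    """Run several train -> deploy -> mine -> extend iterations.

    Returns the final data set and the fraction of the scenario pool
    covered after each round (the 'staircase').
    """
    rng = random.Random(seed)
    coverage = []
    for _ in range(rounds):
        model = train(dataset)
        # Deployment: common scenarios show up often, rare ones seldom.
        stream = rng.choices(scenario_pool, weights=weights, k=samples)
        failures = deploy_and_mine(model, stream)
        # Stand-in for reconstructing the truth and labeling the rare cases.
        dataset = dataset | set(failures)
        coverage.append(len(dataset) / len(scenario_pool))
    return dataset, coverage


# Hypothetical world: 50 scenarios, each rarer than the one before it.
scenario_pool = [f"scenario_{i}" for i in range(50)]
weights = [1.0 / (i + 1) for i in range(50)]
initial = set(scenario_pool[:5])  # the data set starts with only common cases

final, coverage = data_engine(initial, scenario_pool, weights)
print(coverage)
```

Each round can only fill pockets that actually show up during deployment, which is why the rarest scenarios take the most rounds to get into the data set and why coverage climbs in steps rather than all at once.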