Andrej Karpathy
It's not giving you the whole breadth of possible jokes.
It's like it knows only three jokes.
They're silently collapsed.
So basically, you're not getting the richness, diversity, and entropy from these models that you would get from humans.
So humans are a lot noisier, but at least they're not biased in a statistical sense.
They're not silently collapsed.
They maintain a huge amount of entropy.
So how to get synthetic data generation to work despite the collapse, while maintaining the entropy, is a research problem.
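To make the "three jokes" failure concrete, here's a minimal, hypothetical sketch. `sample_model` is a stand-in for any LLM call (it's not a real API), and the dedup loop is one common mitigation: oversample, then reject near-duplicates so the kept set retains entropy. Against a collapsed model it stalls at a handful of distinct outputs no matter how many times you sample.

```python
import difflib
import random

JOKES = ["joke about atoms", "joke about UDP", "joke about cats"]

def sample_model(prompt: str, temperature: float = 1.0) -> str:
    # Hypothetical stand-in for a real LLM call: a "collapsed" model
    # that only ever returns one of three jokes, however it's sampled.
    return random.choice(JOKES)

def diverse_samples(prompt, n_keep=10, max_tries=100, sim_cutoff=0.9):
    """Oversample, then reject near-duplicates to preserve entropy."""
    kept = []
    for _ in range(max_tries):
        cand = sample_model(prompt)
        # Keep a candidate only if it isn't too similar to anything kept.
        if all(difflib.SequenceMatcher(None, cand, k).ratio() < sim_cutoff
               for k in kept):
            kept.append(cand)
        if len(kept) == n_keep:
            break
    return kept

print(diverse_samples("tell me a joke"))  # stalls at ~3 distinct outputs
```

The filtering recovers whatever diversity the model has, but it can't manufacture entropy the model has silently lost, which is why this remains a research problem rather than an engineering one.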
Say we have a chapter of a book and I ask an LLM to think about it.
It will give you something that looks very reasonable.
But if I ask it 10 times, you'll notice that the answers are all basically the same.
Yeah, yeah, yeah.
So any individual sample will look okay, but the distribution is quite terrible.
Interesting.
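One way to see "the distribution is quite terrible" in numbers, as a hedged illustration: ask the same question N times and compute the Shannon entropy of the empirical distribution over distinct answers. The canned strings below are made up; only the measurement is the point.

```python
from collections import Counter
import math

def sample_entropy(completions):
    """Shannon entropy (bits) of the empirical distribution over distinct
    completions. N identical answers -> 0 bits; N distinct -> log2(N)."""
    counts = Counter(c.strip().lower() for c in completions)
    total = sum(counts.values())
    return -sum((k / total) * math.log2(k / total) for k in counts.values())

# Hypothetical outputs from asking the same question 10 times.
llm_runs = ["The chapter argues X."] * 9 + ["The chapter argues Y."]
human_runs = [f"Reading {i}: a different take." for i in range(10)]

print(sample_entropy(llm_runs))    # ~0.47 bits: collapsed
print(sample_entropy(human_runs))  # ~3.32 bits: high entropy
```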
And it's quite terrible in such a way that if you continue training on too much of your own stuff, you actually collapse.
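A toy numerical caricature of that feedback loop (my illustration, not anything from the conversation): repeatedly fit a Gaussian to its own samples and resample from the fit. The fitted spread drifts toward zero over generations, which is the standard toy picture of what training on your own outputs does to entropy.

```python
import numpy as np

rng = np.random.default_rng(0)

# Each "generation" is a Gaussian fit to samples drawn from the
# previous generation's fit. The spread shrinks: entropy leaks away.
mu, sigma = 0.0, 1.0
n_samples = 30

for gen in range(101):
    samples = rng.normal(mu, sigma, n_samples)
    mu, sigma = samples.mean(), samples.std()
    if gen % 20 == 0:
        print(f"generation {gen:3d}: mu={mu:+.3f}, sigma={sigma:.3f}")
```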
I actually think there's possibly no fundamental solution to this.
And I also think humans collapse over time.
Again, these analogies are surprisingly good: humans collapse during the course of their lives.
This is why children, you know, they haven't overfit yet.