Andrej Karpathy
Most of the tasks that we want from these models don't actually demand that diversity.
That's probably the answer to what's going on.
And so it's just that the frontier labs are trying to make the models useful.
And I kind of just feel like the diversity of the outputs is not that much of a priority.
Number one, it's much harder to work with and evaluate and all this kind of stuff, and, number two, maybe it's not what's actually capturing most of the value.
Or, like, if you're getting a lot of writing help from LLMs and stuff like that, I think it's probably bad, because the models will silently give you all the same stuff, you know.
So they won't explore lots of different ways of answering a question, right?
But I kind of feel like maybe this diversity is just not as big of a deal for most applications, so the models don't have it, but then it actually becomes a problem at synthetic generation time, et cetera.
So we're actually shooting ourselves in the foot by not letting the model maintain this entropy.
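A quick way to see this kind of collapse for yourself is to sample several completions of the same prompt and check how similar they are. This is just a minimal sketch, assuming a Hugging Face causal LM; the model name, prompt, and the Jaccard-overlap metric are illustrative stand-ins, not anything from the conversation.

from itertools import combinations

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # placeholder; swap in the model you want to probe
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

prompt = "Write an opening line for a short story about the sea."
inputs = tokenizer(prompt, return_tensors="pt")

# Sample several completions of the same prompt.
with torch.no_grad():
    outputs = model.generate(
        **inputs,
        do_sample=True,
        temperature=1.0,
        max_new_tokens=40,
        num_return_sequences=8,
        pad_token_id=tokenizer.eos_token_id,
    )
completions = tokenizer.batch_decode(
    outputs[:, inputs["input_ids"].shape[1]:], skip_special_tokens=True
)

# Crude diversity signal: average pairwise word overlap (Jaccard) between
# samples. Values near 1.0 mean the samples are silently all the same stuff.
def jaccard(a, b):
    sa, sb = set(a.split()), set(b.split())
    return len(sa & sb) / max(len(sa | sb), 1)

pairs = list(combinations(completions, 2))
print(sum(jaccard(a, b) for a, b in pairs) / len(pairs))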
And I think possibly the labs should try harder.
I don't actually know if it's super fundamental.
I don't actually know if I intended to say that.
I haven't done these experiments, but I do think that you could probably regularize the entropy to be higher.
So you're encouraging the model to give you more varied solutions.
But you don't want it to start deviating too much from the training data, or it's going to start making up its own language.
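To make that concrete, here is a minimal sketch of what an entropy-regularized loss could look like, assuming PyTorch, a frozen reference copy of the base model, and illustrative, untuned coefficients; the function name and values are assumptions, not something Karpathy specifies.

import torch
import torch.nn.functional as F

def entropy_regularized_loss(logits, ref_logits, labels,
                             entropy_coef=0.01, kl_coef=0.1):
    """Cross-entropy loss with an entropy bonus (to encourage more diverse
    outputs) plus a KL penalty against a frozen reference model (to keep the
    policy from drifting so far that it starts making up its own language).

    logits, ref_logits: [batch, seq, vocab]; labels: [batch, seq].
    Coefficients are illustrative, not tuned.
    """
    # Standard next-token cross-entropy on the labels.
    ce = F.cross_entropy(logits.flatten(0, 1), labels.flatten())

    # Average per-token entropy of the model's predictive distribution.
    log_probs = F.log_softmax(logits, dim=-1)
    probs = log_probs.exp()
    entropy = -(probs * log_probs).sum(-1).mean()

    # KL(model || reference) anchors the model to the base distribution.
    ref_log_probs = F.log_softmax(ref_logits, dim=-1)
    kl = (probs * (log_probs - ref_log_probs)).sum(-1).mean()

    # Subtracting the entropy term rewards higher-entropy (more diverse)
    # predictions; the KL term penalizes drifting too far from the reference.
    return ce - entropy_coef * entropy + kl_coef * kl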