Andrej Karpathy
Like so bad that I don't even know how anything works, to be honest.
Like look at the average example in the training set.
Like factual mistakes, errors, nonsensical things.
Somehow when you do it at scale, the noise washes away and you're left with some of the signal.
So datasets will improve a ton. It's just that everything gets better: our hardware, and all the kernels for running the hardware and maximizing what you get out of it. NVIDIA is slowly tuning the actual hardware itself, Tensor Cores and so on. All of that needs to happen and will continue to happen. All the kernels will get better and utilize the chip to the maximum extent. The algorithms will probably improve too: optimization, architecture, and just all the modeling components of how everything is done and what the algorithms are that we're even training with.
So I do kind of expect improvement across just everything.
Nothing dominates.
Everything plus 20%.
Right.
Interesting.
This is like roughly what I've seen.
So I guess I have two answers to that.
Number one, I'm almost tempted to, like, reject the question entirely because, again, like, I see this as an extension of computing.
Have we talked about, like, how to chart progress in computing or how do you chart progress in computing since 1970s or whatever?
What is the x-axis?
So I kind of feel like the whole question is kind of, like, funny from that perspective a little bit.
But I will say, I guess, like, when people talk about AI and the original AGI and how we spoke about it when OpenAI started,
AGI was a system you could go to that can do any economically valuable task at human performance or better.
Okay.
So that was the definition.