Jensen Huang
Podcast Appearances
And what people don't realize is they've kind of forgotten that most of the data that we are training, that we teach each other with, inform each other with, is synthetic.
You know, it's synthetic because it didn't come out of nature.
You created it.
I'm consuming it.
I modify it, augment it.
I regenerate it.
Somebody else consumes it.
And so we've now reached a level where AI is able to take ground truth, augment it, enhance it, synthetically generate an enormous amount of data.
And that part of post-training continues to scale.
And so the amount of data that we could use that is human-generated will be smaller and smaller and smaller.
The amount of data that we use to train models is going to continue to scale to the point where we're no longer limited.
Training is no longer limited by data.
It's now limited by compute.
And the reason for that is most of the data is synthetic.
Then the next phase is test time.
And I still remember people telling me that inference, oh yeah, that's easy.
Pre-training, that's hard.
These are giant systems that people are talking about.
Inference must be easy.