Jensen Huang
Podcast Appearances
Everyone in the world is now a programmer.
That is the miracle.
That is the miracle of artificial intelligence.
For the first time, we have closed the gap.
The technology divide is completely closed.
Yeah, thanks for that question.
So first of all, the reason why extreme co-design is necessary is because the problem no longer fits inside one computer to be accelerated by one GPU.
The problem that you're trying to solve is you would like to go faster than the number of computers that you add.
So you added 10,000 computers, but you would like it to go a million times faster.
Then all of a sudden, you have to take the algorithm...
You have to break up the algorithm.
You have to refactor it.
You have to shard the pipeline.
You have to shard the data.
You have to shard the model.
Now, all of a sudden, when you distribute the problem this way, not just scaling up the problem, but distributing the problem, then everything gets in the way.
This is the Amdahl's law problem, where the amount of speed up you have for something depends on how much of the total workload it is.
And so if computation represents 50% of the problem, and I sped up computation infinitely, like a million times, I only sped up the total workload by a factor of two.
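The arithmetic in that example can be checked with a minimal sketch of Amdahl's law, assuming the usual form: overall speedup = 1 / ((1 - p) + p / s), where p is the fraction of the workload being accelerated and s is how much faster that fraction runs. The function name `amdahl_speedup` and its parameter names are illustrative assumptions, not something from the conversation.

```python
import math


def amdahl_speedup(fraction: float, factor: float) -> float:
    """Overall speedup when `fraction` of the workload runs `factor` times faster.

    Hypothetical helper for illustration; implements the standard
    Amdahl's law formula 1 / ((1 - p) + p / s).
    """
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be between 0 and 1")
    if factor <= 0:
        raise ValueError("factor must be positive")
    return 1.0 / ((1.0 - fraction) + fraction / factor)


# Computation is 50% of the workload and is sped up a million times.
print(f"{amdahl_speedup(0.5, 1_000_000):.4f}")

# Even an infinite speedup of that 50% cannot exceed a factor of two overall.
print(amdahl_speedup(0.5, math.inf))
```

Both calls land at essentially 2x, which is the point being made: the half of the workload that was not accelerated caps the total gain.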
Now, all of a sudden, not only do you have to distribute the computation and shard the pipeline somehow, you also have to solve the networking problem, because all of these computers are connected together.