AE Natarajan
Speaker
Podcast Appearances
AI is automatically a distributed element.
We all know that a single GPU does not carry the AI workload.
So you have to build clusters of GPUs to do AI.
Now, with power and space constraints, you can't put all the clusters in one geographic location.
It has to be distributed.
And we talk about AI build-outs that scale up within a rack of a data center and scale out within the data center, interconnecting all these GPUs within the data center.
And now we're talking about the third dimension, scale across geographies, where the network elements and the network pieces have to actually tie in and build the scale across.
And you bring up a very interesting point here about scale across: whether you're scaling across, scaling up, or scaling out in the data centers, you need to make sure that the GPUs are constantly being used and delivering their ability to compute.
If they are waiting on data or they're waiting on the network to give data, then you're losing cycles on the GPU.
And those are the most expensive things.
You need the ROI for it.
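To put rough numbers on the point about lost cycles, here is a minimal sketch of the arithmetic. It assumes a simplified model in which compute and network wait do not overlap within a step, and every figure in it (step times, GPU count, hourly cost) is hypothetical and not taken from the conversation.

```python
def gpu_utilization(compute_s: float, network_wait_s: float) -> float:
    """Fraction of each step the GPU spends computing rather than waiting.

    Assumes compute and network wait happen one after the other (no overlap).
    """
    total = compute_s + network_wait_s
    if total <= 0:
        raise ValueError("step time must be positive")
    return compute_s / total


def idle_cost_per_hour(num_gpus: int, cost_per_gpu_hour: float,
                       utilization: float) -> float:
    """Hourly spend on GPU time that is lost to waiting on the network."""
    return num_gpus * cost_per_gpu_hour * (1.0 - utilization)


if __name__ == "__main__":
    # Hypothetical cluster: each step is 0.8 s of compute and 0.2 s of waiting.
    util = gpu_utilization(compute_s=0.8, network_wait_s=0.2)
    wasted = idle_cost_per_hour(num_gpus=1000, cost_per_gpu_hour=2.0,
                                utilization=util)
    print(f"utilization: {util:.0%}")
    print(f"idle cost per hour: ${wasted:,.2f}")
```

Under these assumed numbers, a fifth of every GPU-hour is paid for but not used, which is why the network has to keep up with the GPUs at every stage.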
It performs extremely well in distributed architectures, and AI by nature is very distributed and lends itself to a distributed architecture.
So you have to really architect the system and build it in such a way that it is efficient at every stage of the network, so that you get the maximum benefit and the maximum ROI from the large investment that people are making in AI.
Why don't we just move all AI infrastructure to the edge?
I just talked about the factory, right?
You still have to do training.