Satya Nadella
Once you start crossing pre-training data center boundaries, is it that different from anything else?
Right?
So the way I think about it is: hey, distributed computing will remain distributed.
So go build out your fleet such that it's ready for large training jobs, it's ready for test-time compute, it's ready for RL.
In fact, with this RL thing, what might happen is you build one large model, and then after that there's tons of RL and test-time compute going on.
To me, that's again more training FLOPs, because you want to create these highly specialized distilled models for different tasks.
So you want that fleet.
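To make the distillation point concrete, here is a minimal sketch of what "distilled models" typically means, assuming standard knowledge distillation in the style of Hinton et al. (2015); the toy models and hyperparameters are illustrative assumptions, not Microsoft's actual pipeline:

```python
# Minimal knowledge-distillation sketch (illustrative assumption, not
# Microsoft's pipeline). A small "student" model is trained to match a
# large frozen "teacher" model's output distribution, which is where
# the extra training FLOPs for specialized models come from.
import torch
import torch.nn.functional as F

teacher = torch.nn.Linear(512, 32_000)   # stand-in for a large frozen model
student = torch.nn.Linear(512, 32_000)   # smaller task-specialized model
optimizer = torch.optim.AdamW(student.parameters(), lr=1e-4)

def distill_step(batch: torch.Tensor, temperature: float = 2.0) -> float:
    """One distillation step: match the teacher's softened logits."""
    with torch.no_grad():                 # teacher weights stay frozen
        teacher_probs = F.softmax(teacher(batch) / temperature, dim=-1)
    student_log_probs = F.log_softmax(student(batch) / temperature, dim=-1)
    # KL divergence between teacher and student output distributions,
    # scaled by T^2 as in the original distillation formulation.
    loss = F.kl_div(student_log_probs, teacher_probs,
                    reduction="batchmean") * temperature ** 2
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

loss = distill_step(torch.randn(8, 512))  # toy batch of 8 activations
```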
And then the serving needs, right?
At the end of the day, speed of light is speed of light.
So you can't sort of have one data center in Texas and say, I'm going to serve the world from there.
You've got to serve the world based on having an inference fleet everywhere in the world.
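To make the speed-of-light point concrete, here is a rough back-of-the-envelope sketch; the city distances and the roughly two-thirds-of-c fiber propagation speed are approximate assumptions, not figures from the conversation:

```python
# Back-of-the-envelope propagation latency from a single Texas data center.
# Assumptions (not from the transcript): light travels through optical
# fiber at roughly 2/3 the vacuum speed of light, and routes are idealized
# great circles with no routing detours, queuing, or processing time.

SPEED_OF_LIGHT_KM_S = 299_792                    # vacuum speed of light
FIBER_SPEED_KM_S = SPEED_OF_LIGHT_KM_S * 2 / 3   # ~200,000 km/s in fiber

# Rough great-circle distances from Dallas, TX (approximate figures).
DISTANCES_KM = {
    "New York": 2_200,
    "London": 7_600,
    "Sydney": 13_800,
    "Singapore": 15_100,
}

for city, km in DISTANCES_KM.items():
    one_way_ms = km / FIBER_SPEED_KM_S * 1_000
    print(f"{city:>10}: ~{2 * one_way_ms:4.0f} ms round trip (best case)")

# Even in this idealized best case, a user in Singapore pays about 150 ms
# of round-trip latency per request before any inference work happens,
# hence the need for an inference fleet distributed around the world.
```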
Right.
So that's kind of how I think of our buildout: a true hyperscale fleet.
Oh, and by the way, I want my storage and compute close to all of these things as well, because it's not just stateless AI accelerators: my training data itself needs storage.
And then I want to be able to multiplex multiple training jobs.
I want to be able to have memory.