Andy Halliday
So when we chat with ChatGPT, whether through Azure or directly with OpenAI, we're calling on a data center to run the models and give us back the responses.
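As a hedged sketch of those two call paths, here's roughly what a direct OpenAI request versus an Azure OpenAI request looks like in Python; the model name, deployment name, API version, and environment variables are illustrative assumptions, not details from the conversation.

```python
# Sketch of the two ways a chat request reaches a data center:
# directly through OpenAI's API, or through an Azure OpenAI deployment.
# Model names, endpoints, and env vars below are illustrative assumptions.
import os
from openai import OpenAI, AzureOpenAI

# Path 1: call OpenAI directly (reads OPENAI_API_KEY from the environment).
openai_client = OpenAI()
direct = openai_client.chat.completions.create(
    model="gpt-4o",  # assumed model name
    messages=[{"role": "user", "content": "Hello"}],
)

# Path 2: call a model through an Azure OpenAI resource, which routes the
# request to whichever data center hosts that particular deployment.
azure_client = AzureOpenAI(
    azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
    api_key=os.environ["AZURE_OPENAI_API_KEY"],
    api_version="2024-02-01",
)
via_azure = azure_client.chat.completions.create(
    model="my-gpt4o-deployment",  # Azure uses the deployment name here
    messages=[{"role": "user", "content": "Hello"}],
)

print(direct.choices[0].message.content)
print(via_azure.choices[0].message.content)
```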
And, you know, when you do a chat, it's not a collaborative chat.
My guess is that collaborative chat, which is another feature that was released just this week, would change this model.
But when you do a chat right now online with ChatGPT, you might be talking to a different data center that's running a slightly different version of the model than the one I'm talking to, right?
Because it takes time to update all the models across all the data centers.
So there's that difference.
And the fact that they've just added collaborative chat means that all of the people involved, wherever they are in the world, have to be talking to the same inference engine.
Right.
So I think that means it has to be happening within the same data center, because of the multiplicity of ways that we access these models.
Those models are also being hosted, for example, by Cloudflare at the edge of its content delivery network.
You can access models that way as well.
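As a hedged illustration of that edge path, this is roughly how a request to a model hosted on Cloudflare's Workers AI REST endpoint might look; the account ID, API token, and model name here are placeholder assumptions.

```python
# Illustrative sketch of querying an edge-hosted model via Cloudflare's
# Workers AI REST API. Account ID, API token, and model name are
# placeholder assumptions, not details from the conversation.
import os
import requests

account_id = os.environ["CLOUDFLARE_ACCOUNT_ID"]
api_token = os.environ["CLOUDFLARE_API_TOKEN"]
model = "@cf/meta/llama-3-8b-instruct"  # assumed edge-hosted model

response = requests.post(
    f"https://api.cloudflare.com/client/v4/accounts/{account_id}/ai/run/{model}",
    headers={"Authorization": f"Bearer {api_token}"},
    json={"messages": [{"role": "user", "content": "Hello from the edge"}]},
)
response.raise_for_status()
print(response.json()["result"]["response"])
```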
So anyway.
I'm rambling, so let's get to the stats.
What's really curious is that in the first half of 2025, OpenAI spent $5 billion with Azure as a source of inference generation. And that spending has continued through the end of September, which is as far as it's being tracked.
And the company's inference costs have risen consistently over the past 18 months.
So it's up to $5 billion for a six-month period.
And its inference costs easily outweigh