And then AI inference is like hiring that person to do a job and answer questions.
And this shift to inference is reshaping AI infrastructure spending.
According to research from Gartner, global capital spending on inference infrastructure is going to surpass spending on training for the first time ever this year.
And this gap is expected to widen fast.
By 2029, companies are projected to spend nearly twice as much on inference compared to training.
The thing is, these AI models are now so good that there's less urgency to train new ones.
Instead, there's more focus on getting the most out of these existing AI models.
Now, the other side effect of the rise of agentic AI and the focus on inference is that it's going to dramatically increase the demand for compute.
So if you think about it, a normal AI chatbot gives you one response, but an AI agent needs to think through the tasks, call multiple tools, check its own work, retry if something fails, pull in outside data, and keep running in the background to complete the work.
All that requires a lot more tokens and a lot more computing power.
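To make that concrete, here's a minimal sketch of that kind of agent loop. The model and tools are toy stubs (`call_model`, `run_tool`, and the `db_lookup` tool are all illustrative assumptions, not a real API), but the loop shows why token usage compounds: every step re-sends the growing history through another inference pass.

```python
def call_model(history: list[str]) -> dict:
    """Toy stand-in for an LLM inference call. A real agent sends the whole
    history to the model on every step, which is why token usage compounds:
    each step re-reads everything that came before."""
    if not any(line.startswith("Observation") for line in history):
        return {"type": "tool", "tool": "db_lookup", "args": {"q": "sales"}}
    return {"type": "final", "content": "Q3 sales were up."}

def run_tool(name: str, args: dict) -> str:
    """Toy tool layer; a real agent would hit APIs or databases here."""
    tools = {"db_lookup": lambda a: f"rows matching {a['q']!r}: 42"}
    return tools[name](args)

def run_agent(task: str, max_steps: int = 10) -> str:
    history = [f"Task: {task}"]
    for _ in range(max_steps):
        decision = call_model(history)        # one inference pass per step
        if decision["type"] == "final":
            return decision["content"]
        try:
            result = run_tool(decision["tool"], decision["args"])
        except Exception as err:              # a failed tool call becomes an
            result = f"tool error: {err}"     # observation, and the loop retries
        history.append(f"Observation: {result}")
    return "stopped: hit max_steps"

print(run_agent("Summarize Q3 sales"))
```

A single chatbot reply is one pass through the model; this loop runs one pass per step over an ever-growing context, which is where the extra tokens and compute come from.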
And it also requires a different type of hardware.
Most people know about Nvidia's GPUs.
They're really good at training AI models.
Well, inference and agentic workloads are different. They require CPUs working alongside GPUs to get maximum performance.
The CPU acts as the brain that plans and executes multi-step tasks. It can call external tools like APIs and databases, and it manages memory better than a GPU does.
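Here's a rough sketch of that division of labor, with a worker thread standing in for the GPU. Everything here (the queues, `gpu_worker`, `cpu_orchestrator`, the fake `inventory_api`) is an illustrative assumption, not any vendor's actual API; the point is just that the CPU side runs the control flow and tool calls while the heavy model computation happens elsewhere.

```python
import queue
import threading

request_q: queue.Queue = queue.Queue()   # prompts waiting on the "GPU"
response_q: queue.Queue = queue.Queue()  # completed model outputs

def gpu_worker() -> None:
    """Stands in for the GPU: does nothing but run model forward passes."""
    while True:
        prompt = request_q.get()
        if prompt is None:          # shutdown signal from the CPU side
            break
        response_q.put(f"completion for: {prompt}")  # pretend forward pass

def cpu_orchestrator(task: str) -> str:
    """The CPU side: sequencing, tool calls, and conversation memory."""
    memory = [task]
    inventory_api = {"item 42": "12 units in stock"}   # pretend external API
    memory.append(inventory_api["item 42"])            # CPU-side tool call
    request_q.put(" | ".join(memory))  # hand the heavy lifting to the GPU
    answer = response_q.get()
    request_q.put(None)                # tell the worker to shut down
    return answer

worker = threading.Thread(target=gpu_worker)
worker.start()
print(cpu_orchestrator("Check stock for item 42"))
worker.join()
```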
And that's why Nvidia is now making its own CPUs alongside its GPUs.
Nvidia became the most valuable company in the world because their GPUs were the best at training AI models.
The company's data center revenue, which represents their AI chips business, has grown nearly 13x since the launch of ChatGPT.