Nathaniel Whittemore
Podcast Appearances
The chip will feature 20 CPU cores married to over 6,000 integrated GPU cores, supporting up to 128GB of unified memory.
The chip is capable of delivering one petaflop of AI compute; by comparison, an H100 outputs around four petaflops.
NVIDIA said that the chip will be available in Windows PCs and laptops by the fall, with models available from Asus, Dell, HP, Lenovo, and Microsoft at launch.
Pricing was not announced, but these will be premium devices designed to compete with high-end Mac products and gaming computers that pull double duty as inference workhorses.
I saw a lot of people basically saying that this is their competition to the M5 series of Mac computers.
Now, in some ways, this is part of a trend and a shift in the compute workload as we prioritize inference.
GPUs, of course, have been the core piece of AI hardware for almost a decade, but the CPU is now having a resurgence.
GPUs are increasingly seen as hardware for AI training, while powerful CPUs are better for executing agentic tool calls.
Kari Briski, NVIDIA's VP for GenAI Software, said of GPU-powered chatbots, "That era is ending.
Agents are the new workload.
They will run everywhere from the data center to the edge."
Now, in addition to those new machines, NVIDIA CEO Jensen Huang also announced that Vera Rubin had entered full production.
OpenAI and Anthropic have already taken delivery of their first units, with plans to scale up into full data center buildouts this year.
The Vera Rubin nomenclature refers to the CPU-GPU pairing in the chip.
Vera is the CPU, while Rubin is the GPU architecture.
For the first time for an NVIDIA data center chip, the focus is on the CPU and its ability to supercharge agentic AI.
Said Huang, "AI agents will be the largest users of computing.
Vera is the first CPU designed for that future, built to run agentic AI at hyperscale with extraordinary performance, efficiency, and programmability."
Making that comparison that I just mentioned, The Verge argued that the RTX Spark could be the M1 moment for Windows.
That is, until now, Apple's M-series chips have been the go-to for running AI models locally, and the M-series architecture, now in its fifth generation, has been utterly dominant.