Azeem Azhar

Azeem Azhar's Exponential View
What NVIDIA’s bet on OpenClaw means for the future of AI and your token budget

The company wasn't formally acquired.

It's kind of complicated.

But that Groq acquisition really, really pointed to the changing shape of the AI market.

Up until that point, NVIDIA had survived on a single, albeit evolving, architecture: the GPU, the graphics processing unit.

It was its heritage coming out of video games.

And GPUs are great at many things, but it had been becoming clear over the last year or so that they might not be fantastic for the changing shape of AI use as we move towards inference.

So now, this is a technical bit of my discussion.

So when you think about what happens in inference, I think it's worth just unpicking this because it'll explain what's going on.

There are a couple of phases.

The first phase is called pre-fill.

That's when you send a prompt to a model, whether it's a question or a document to summarize or some complex instruction.

The model reads and processes your input tokens simultaneously, in parallel.

This is enormously compute-intensive, and it is where GPUs shine.

They were built for graphics, throwing huge matrices of pixels at thousands of cores at once, doing it all in parallel.

And that is the shape of the pre-fill problem.

So when you're feeding context in, GPUs are doing what they were made to do.

But the second phase of inference is called decode, and that is the generation of the responses.
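
[Editor's illustration, not part of the spoken transcript.] The two phases can be sketched in a few lines of Python. This is a toy under stated assumptions, not how any real model or NVIDIA product works: `embed`, `attend`, `prefill`, and `decode` are hypothetical names, the "model" is a single attention step over made-up embeddings, and the next-token rule is arbitrary. It only aims to show the shape of the work: pre-fill touches every prompt token in one pass with no dependency between tokens, while decode has to run one step per generated token because each step needs the previous step's output.

```python
import math
from typing import List, Tuple

Vector = List[float]


def embed(token_id: int, dim: int = 4) -> Vector:
    """Hypothetical toy embedding: deterministic numbers, not a trained model."""
    return [(((token_id + 1) * (i + 1)) % 7) / 7.0 for i in range(dim)]


def attend(query: Vector, keys: List[Vector], values: List[Vector]) -> Vector:
    """Softmax-weighted average of the values, weighted by query-key dot products."""
    scores = [sum(q * k for q, k in zip(query, key)) for key in keys]
    top = max(scores)  # subtract the max for numerical stability
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    dim = len(values[0])
    return [sum(w * v[i] for w, v in zip(weights, values)) / total for i in range(dim)]


def prefill(prompt: List[int]) -> Tuple[List[Vector], List[Vector]]:
    """Pre-fill: one pass over the whole prompt.

    No token's key/value depends on another token's output here, so real
    hardware can compute all of them at once (here: a plain comprehension).
    """
    keys = [embed(t) for t in prompt]
    values = [embed(t) for t in prompt]
    return keys, values  # the cached context that decode will read


def decode(cache: Tuple[List[Vector], List[Vector]], first_token: int,
           steps: int) -> Tuple[List[int], int]:
    """Decode: one step per new token; step n cannot start before step n-1 ends."""
    keys, values = list(cache[0]), list(cache[1])
    token, generated = first_token, []
    for _ in range(steps):
        x = embed(token)
        keys.append(x)
        values.append(x)
        context = attend(x, keys, values)       # reads the whole cache every step
        token = int(sum(context) * 1000) % 50   # arbitrary toy "next token" rule
        generated.append(token)
    return generated, len(keys)


prompt = [3, 1, 4, 1, 5]
cache = prefill(prompt)                  # one parallel-friendly pass
tokens, cache_len = decode(cache, first_token=5, steps=3)
print(len(cache[0]), len(tokens), cache_len)
```

In this sketch the pre-fill cost is paid once for the whole prompt, whereas decode repeats a small amount of arithmetic and a full read of the cached context for every token it produces.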