Kwasi Ankomah

The Neuron: AI Explained

AI Inference: Why Speed Matters More Than You Think (with SambaNova's Kwasi Ankomah)

How much is it costing?

805.762

And I think this is what we're talking about.

807.003

Everyone is really, and this is what we all keep talking about, inference and efficiency and power.

808.745

How much is this costing?

814.152

And we, our architectures, we come from the fundamental thing of saying, we know at scale how much it's going to cost.

815.533

So we've already kind of thought about that.

822.842

And the folks who do the hardware really kind of thought about that.

824.424

For sure.

848.05

So I think like, you know, I'll talk about probably the key two areas, right?

849.011

So one of them is around the chip itself, right?

854.638

So we've done something, what we call three-tier memory.

858.183

So that three-tier memory is basically SRAM, and that's like super fast, you know, on-chip, integrated with the compute.

862.608

We've got HBM and then we've got DDR.

871.079

So the HBM is the intelligent caching layer, and then we've got the DDR, which is like this massive capacity, like 1.5 terabytes that we've got as well.
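The three tiers described here (SRAM, HBM, DDR) can be pictured with a small toy model. This is only a sketch under stated assumptions: the SRAM and HBM capacities, the item names and sizes, and the "fastest tier with room" placement rule are all hypothetical and are not taken from the episode or from SambaNova documentation. Only the 1.5 TB DDR figure comes from the transcript.

```python
from dataclasses import dataclass, field


@dataclass
class Tier:
    """One memory tier with a fixed capacity in GB."""
    name: str
    capacity_gb: float
    items: dict = field(default_factory=dict)  # item name -> size in GB

    def used_gb(self) -> float:
        return sum(self.items.values())

    def fits(self, size_gb: float) -> bool:
        return self.used_gb() + size_gb <= self.capacity_gb


def place(tiers, name, size_gb):
    """Store an item in the fastest tier that still has room.

    `tiers` must be ordered fastest to slowest. Returns the tier name.
    """
    for tier in tiers:
        if tier.fits(size_gb):
            tier.items[name] = size_gb
            return tier.name
    raise MemoryError(f"{name} ({size_gb} GB) does not fit in any tier")


tiers = [
    Tier("SRAM", 0.5),    # hypothetical capacity, on-chip next to compute
    Tier("HBM", 64.0),    # hypothetical capacity, the caching layer
    Tier("DDR", 1500.0),  # 1.5 TB, the figure mentioned in the episode
]

# Hypothetical workload: item names and sizes are illustrative only.
placements = {
    "activations": place(tiers, "activations", 0.25),
    "active_model": place(tiers, "active_model", 40.0),
    "standby_model": place(tiers, "standby_model", 400.0),
}
print(placements)
```

In this toy setup the small working data lands in SRAM, the model currently being served lands in HBM, and a large model that is not in use falls through to DDR.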

874.443

So why does this matter?

882.217

It's because traditional GPUs move data between memory and compute.

884.241

So it's like, you're constantly doing this back and forth, back and forth.

890.352

So it's like having your tools in the garage when you're working in the kitchen.

892.636

You have to constantly go, oh, yeah, you know, I need a screwdriver.

898.827

Now, like that, that's inefficient.
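The garage-and-kitchen point can be put into rough arithmetic. The sketch below assumes a made-up cost model where every operation pays one compute cost plus one fetch cost. The unit costs are hypothetical and chosen only to show the shape of the effect; they are not measurements of any GPU or SambaNova chip.

```python
def time_units(num_ops: int, compute_per_op: float, fetch_per_op: float) -> float:
    """Total time when every operation pays a compute cost and a fetch cost."""
    return num_ops * (compute_per_op + fetch_per_op)


# Hypothetical unit costs: a "far" fetch (tools in the garage) is assumed
# to cost 10 units, a "near" fetch (tools within reach) 1 unit.
far = time_units(num_ops=1000, compute_per_op=1, fetch_per_op=10)
near = time_units(num_ops=1000, compute_per_op=1, fetch_per_op=1)

slowdown = far / near
print(far, near, slowdown)
```

With these assumed numbers, the same amount of compute takes several times longer purely because of the trips to fetch data.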

902.896