Kwasi Ankomah

536 total appearances

Podcast Appearances

The Neuron: AI Explained
AI Inference: Why Speed Matters More Than You Think (with SambaNova's Kwasi Ankomah)

And again, at a small scale, maybe not a huge problem.

At a big scale, with thousands of compute units, that does begin to add up.

So that...

That innovation is big.

And the reason that's big is it allows us to store bigger models.

So to give you an example of this, the DeepSeek models, so that would be DeepSeek R1 and DeepSeek V3, they're 670 billion parameters, right?

Absolutely huge.

Now, a lot of providers don't serve that model.

They can't physically.
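
As a rough illustration of why a model this size is hard to host, the sketch below estimates the memory needed for the weights alone. The 670 billion figure is the one quoted in the episode; the bytes-per-parameter values and the 80 GB per-device capacity are assumptions chosen for illustration, not numbers from the speaker, and the estimate ignores the KV cache and activations.

```python
import math

def weight_footprint_gb(num_params: float, bytes_per_param: float) -> float:
    """Memory needed for the model weights alone, in decimal gigabytes."""
    return num_params * bytes_per_param / 1e9

def devices_needed(footprint_gb: float, device_memory_gb: float) -> int:
    """Minimum number of devices whose combined memory can hold the weights."""
    return math.ceil(footprint_gb / device_memory_gb)

PARAMS = 670e9           # parameter count as quoted in the episode
DEVICE_MEMORY_GB = 80    # hypothetical per-device capacity (assumption)

# Assumed storage precisions: 1 byte (8-bit) and 2 bytes (16-bit) per parameter.
for label, nbytes in [("8-bit", 1), ("16-bit", 2)]:
    gb = weight_footprint_gb(PARAMS, nbytes)
    count = devices_needed(gb, DEVICE_MEMORY_GB)
    print(f"{label}: {gb:.0f} GB of weights, at least {count} devices")
```

Under these assumptions the weights alone come to 670 GB at 8 bits per parameter and 1,340 GB at 16 bits, which is well beyond a single device of the assumed size.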

And we can, basically because of the way that we've architected our chip.

And again, the folks that started the company, they had all of this in mind when they designed the chip.

So that's the big thing.

It allows us to run very large models.

Now, the second thing that that allows us to do is it allows us to run many models.

So that DDR bit allows us to store.

So if you imagine that, you know, a GPU or alternative architecture could only store, like, this one model.

And in order for you to get another model, you need another unit of computing, another kind of, you know, let's call it a node.

Now, because of our kind of large DDR, it allows us to kind of store these other models so you can switch.

And this becomes super important for agentic applications because you might have an application that maybe uses the GPT-OSS model that we're running at the moment, or it might use a Llama 8B, but we can have those on the same node.

So your inference and hardware cost stays flat because we can go and get the model.
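
The multi-model idea described here can be sketched as a toy model of one node: a large capacity tier keeps several models resident, and switching the active model does not require a second node. This is only an illustration under stated assumptions. The `Node` class, its methods, the model names, and every size in the usage note are hypothetical and do not describe SambaNova's actual hardware or software.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional

@dataclass
class Node:
    """Toy model of one compute node with a large capacity memory tier."""
    capacity_gb: float                                          # size of the capacity tier (assumption)
    resident: Dict[str, float] = field(default_factory=dict)   # model name -> size in GB
    active: Optional[str] = None                                # model currently being served

    def free_gb(self) -> float:
        """Capacity not yet taken by resident models."""
        return self.capacity_gb - sum(self.resident.values())

    def load(self, name: str, size_gb: float) -> None:
        """Keep a model resident on this node if it fits in the capacity tier."""
        if size_gb > self.free_gb():
            raise MemoryError(f"{name} does not fit; another node would be needed")
        self.resident[name] = size_gb

    def switch_to(self, name: str) -> str:
        """Make a resident model the active one without adding hardware."""
        if name not in self.resident:
            raise KeyError(f"{name} is not resident on this node")
        self.active = name
        return name
```

For example, with hypothetical sizes, `Node(capacity_gb=100.0)` can `load("model-a", 60.0)` and `load("model-b", 16.0)`, then `switch_to("model-b")` on the same node. In the one-model-per-node picture described earlier, the second model would need a second node.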