Kwasi Ankomah
that is just not acceptable, right?
Most production applications don't have a latency budget of 20 to 30 seconds, especially if there's a user interacting with it.
So I would say that the user experience has become a key factor.
As these things go into production, how do we actually make sure that the user is having a good time and that we can scale it, right?
Like what happens when, instead of having 200 users, we have 2,000 users, right?
Because again, you're running inference, and you need to scale that inference to the number of users.
That becomes a problem because usually the more users you have, the more pressure the hardware is under to get the inference out, and the slower the model gets again.
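That shared-capacity point can be sketched with a back-of-envelope model: a fixed pool of hardware has some total token throughput, and as concurrent users grow, each user's share shrinks and perceived latency grows. All the numbers below are illustrative assumptions, not real benchmarks.

```python
# Assumed aggregate throughput of the whole serving fleet (tokens/sec)
TOTAL_TOKENS_PER_SEC = 10_000
# Assumed average response length in tokens
RESPONSE_TOKENS = 500

def seconds_per_response(concurrent_users: int) -> float:
    """Rough time to stream one full response when capacity
    is shared evenly across all concurrent users."""
    per_user_rate = TOTAL_TOKENS_PER_SEC / concurrent_users
    return RESPONSE_TOKENS / per_user_rate

for users in (200, 2_000):
    print(users, "users ->", seconds_per_response(users), "s per response")
```

Under these toy assumptions, going from 200 to 2,000 users stretches each response tenfold, which is exactly the "more users, slower model" pressure described above.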
So I would say that that's probably the biggest one.
And the second is the reality of where the cost is going.
I come from a financial background, and when AI first arrived, everyone was like, it's a bit like the cloud, you know, this is great.
No one was checking their bills, and now you see what inference is costing you.
And suddenly you're like, well, hang on, this inference is becoming most of our cost.
And, you know, milliseconds of difference can actually mean millions in operational cost.
When you scale it up to, you know, 20 million users, 30 million users, getting tokens out faster is just going to cost you less money.
So, you know, one of the things here is that inference itself is becoming an expensive thing.
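The "faster tokens cost less" claim comes down to simple arithmetic: the same monthly traffic served on the same hourly-priced hardware needs fewer GPU-hours when per-GPU throughput is higher. This is a hypothetical illustration; every figure below is an assumption.

```python
# Assumed hourly price of one GPU
GPU_PRICE_PER_HOUR = 2.50
# e.g. 20 million users, one request each per month (assumed)
MONTHLY_REQUESTS = 20_000_000
# Assumed average response length in tokens
TOKENS_PER_REQUEST = 500

def monthly_cost(tokens_per_sec_per_gpu: float) -> float:
    """Hardware cost to serve the month's traffic at a given
    per-GPU token throughput."""
    total_tokens = MONTHLY_REQUESTS * TOKENS_PER_REQUEST
    gpu_hours = total_tokens / tokens_per_sec_per_gpu / 3600
    return gpu_hours * GPU_PRICE_PER_HOUR

slow = monthly_cost(1_000)   # baseline throughput
fast = monthly_cost(1_500)   # 50% faster token generation
print(f"baseline: ${slow:,.0f}/mo, faster: ${fast:,.0f}/mo, saved: ${slow - fast:,.0f}/mo")
```

At real production scale the absolute numbers are much larger, but the shape is the same: throughput in the denominator means every gain in tokens per second comes straight off the bill.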
So those would be the two things, I think, from a business perspective, and something people can relate to: you're not going to have a good experience if you have slow AI, and it's going to start costing you money.
The last thing I'd also say is that there are now certain applications, and I'll use voice because I think it's the real, you know, as...