Kwasi Ankomah
Podcast Appearances
And so essentially, once you figure that out, you're able to, I would say, greatly reduce the energy.
And of course, the full stack does make a huge difference.
But I would say making sure that the chip architecture is really sound is a key one as well.
Yeah.
But yeah, I'm not surprised that we're seeing that.
And to give you an example, with the SambaManaged offering that I was talking about, we're seeing so many people come to us saying, hey, we can only get, you know, 20 kilowatts of energy, so what can we do with it?
And there's almost no one who can work within those constraints.
And, you know, I think we're going to see more and more of that as we go on, since we need to use less energy anyway, for various reasons.
We are going to see more of that, where people say: yes, we want everything to go as fast as possible, but what is the cost of going that fast, right?
To give an example, you can run 1,500 tokens per second, and some of our competitors are super, super fast, but the number of chips they need to do that is phenomenal.
Right.
And I think that's where, you know, you see these top-line numbers and I'm like, okay, but tokens per kilowatt is a key metric.
Like, how many tokens do you get per kilowatt of energy that is being used to power those chips?
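As a rough illustration of the tokens-per-kilowatt comparison described here, the following is a minimal sketch. The function name and the 100 kW and 600 tokens-per-second figures are assumptions made up for the example, not measurements of any real system; only the 1,500 tokens per second and 20 kilowatts echo numbers mentioned in the conversation.

```python
def tokens_per_kilowatt(tokens_per_second: float, power_kw: float) -> float:
    """Throughput (tokens per second) delivered per kilowatt of power drawn."""
    if power_kw <= 0:
        raise ValueError("power_kw must be positive")
    return tokens_per_second / power_kw


# Hypothetical deployments, for illustration only.
fast_but_power_hungry = tokens_per_kilowatt(1500, 100)  # assumed 100 kW draw
slower_but_lean = tokens_per_kilowatt(600, 20)          # assumed 600 tokens/s at 20 kW

print(fast_but_power_hungry, slower_but_lean)
```

Under these assumed figures, the system with the lower headline speed delivers more tokens for each kilowatt, which is the trade-off the speaker is pointing at.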
Yeah.
Yeah, no, I think, again, people always ask, why are you connecting agents and power?
And I'm like, agents are using more tokens.
So, yes, it's inference, but they're just scaling up (a) the number of models and (b) the number of things that you want to do with them.