Kieran Kunhya
And like, it's not even close, right?
It's not even close, right?
It's not like 5%, 10% slower.
It's multiple times slower.
Philosophically, what's important to realize is that we're past the time when hardware was getting so much faster, right?
We've hit the end of Moore's law.
We have limitations for AI, for memory.
You need to go down the stack and optimize more to get more power from what you have, because our demands for CPU power and GPU power are exploding while the hardware is not exploding in speed, right?
So what people do is add more cores, right?
But basically, at some point you can have 250 cores, right?
So what we do is to take every inch of the machine.
And one of the other things that we do, for example, in dav1d, which is a bit crazy, is that we don't use the function calling convention from the operating system.
That is extremely complex.
But basically, usually when you move from one function in code to another, there is a way to save the registers, the state of the CPU, to enter another function.
And this is like standard.
Yes, and in all that, we don't even respect the calling convention of the operating system in order to be faster, because we know that we are going to be called from within our binary.
So we can share data without saving all the registers in the standard way, because that can lead to loading and saving registers through the CPU's L1 and L2 caches, and skipping that makes us faster.
So that's why I said that understanding CPU architecture, computer architecture is key.
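As a rough illustration of the calling-convention point above, here is a minimal C sketch. It is not the speaker's actual code, which is hand-written assembly. The names `sum_row` and `sum_block` are hypothetical. The idea it shows: a function with external linkage has to follow the platform calling convention because anyone may call it, while a `static` helper is only ever called from inside the same file, so the compiler may inline it or pass its arguments however it likes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Internal helper (hypothetical name). `static` means every caller lives in
 * this translation unit, so the compiler may inline it or keep its state in
 * registers instead of following the platform calling convention. */
static inline uint32_t sum_row(const uint8_t *row, size_t width)
{
    uint32_t acc = 0;
    for (size_t x = 0; x < width; x++)
        acc += row[x];
    return acc;
}

/* Public entry point (hypothetical name). It has external linkage, so it
 * must follow the platform calling convention: callers outside this file
 * rely on it. */
uint32_t sum_block(const uint8_t *pixels, size_t width, size_t height,
                   size_t stride)
{
    assert(stride >= width); /* rows must not overlap */

    uint32_t total = 0;
    for (size_t y = 0; y < height; y++)
        total += sum_row(pixels + y * stride, width);
    return total;
}
```

Hand-written assembly can go further than this, since the author decides exactly which registers carry data between internal routines. The principle is the same: the standard convention only has to hold at the boundary where outside code calls in.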