Chris Lattner:
What is the speed of light?
How fast can this thing go?
And then how do I express that?
And so it wasn't anchored on making Python a little bit faster.
It's saying, cool, I know what the hardware can do.
Let's unlock that.
I mean, maybe I'm a special kind of nerd, but you look at that, what is the limit of physics?
How fast can these things go, right?
When you start looking at that, typically it ends up being a memory problem.
And so today, particularly with these specialized accelerators, the problem is that you can do a lot of math within them, but you get bottlenecked sending data back and forth to memory, whether it be local memory or distant memory or disk or whatever it is.
And that bottleneck, particularly as the training sizes get large, as you start doing tons of inferences all over the place, that becomes a huge bottleneck for people.
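One back-of-the-envelope way to see why memory becomes the bottleneck is arithmetic intensity: how many floating-point operations you get per byte moved. The sketch below uses illustrative sizes (the vector length `n` and matrix size `m` are arbitrary, not from the conversation); the point is only that elementwise ops do almost no math per byte of traffic, while matmuls do a lot.

```python
def arithmetic_intensity(flops, bytes_moved):
    # FLOPs per byte of memory traffic.
    # Low values mean the op is memory-bound; high values mean compute-bound.
    return flops / bytes_moved

# Elementwise add of two float32 vectors of length n:
# reads 2 * 4n bytes, writes 4n bytes, performs n FLOPs.
n = 1_000_000
elementwise = arithmetic_intensity(n, 12 * n)   # ~0.08 FLOPs/byte

# Square float32 matmul of size m x m:
# ~2*m**3 FLOPs over roughly 3 * 4*m**2 bytes of traffic.
m = 1_000
matmul = arithmetic_intensity(2 * m**3, 12 * m**2)  # ~167 FLOPs/byte
```

With numbers like these, an accelerator that can do thousands of FLOPs per byte of bandwidth sits idle on the elementwise op: the math units are starved waiting on memory.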
So again, what happened is we went through a phase of many years where people took the special case and hand-tuned it and tweaked it and tricked it out, and they knew exactly how the hardware worked and they knew the model and they made it fast.
But it didn't generalize.
And so you can make, you know, ResNet-50 or AlexNet or Inception V1 fast, like you can do that, right?
Because the models are small, they fit in your head, right?
But as the models get bigger, more complicated, as the machines get more complicated, it stops working, right?
And so this is where things like kernel fusion come in.
So what is kernel fusion?
This is this idea of saying, let's avoid going to memory.
And let's do that by building a new hybrid
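The idea of kernel fusion described above can be sketched in a few lines. This is a toy illustration, not any particular framework's implementation: the unfused version materializes a full intermediate array in memory after every step, while the fused version makes a single pass so the intermediates stay in registers. (In pure Python the loop is of course slow; a fusing compiler generates the single-pass version as native code.)

```python
import numpy as np

def unfused(x):
    # Three separate "kernels": each one reads a full array from
    # memory and writes a full intermediate array back out.
    a = x * 2.0                 # read x, write a
    b = a + 1.0                 # read a, write b
    return np.maximum(b, 0.0)   # read b, write result

def fused(x):
    # One fused kernel: each element is read once and written once;
    # the intermediate values never touch memory as arrays.
    out = np.empty_like(x)
    for i in range(x.size):
        v = x.flat[i] * 2.0
        v = v + 1.0
        out.flat[i] = max(v, 0.0)
    return out
```

Both produce the same result, but the fused version does a third of the memory traffic, which is exactly the resource the earlier discussion identifies as the limit.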