George Hotz
Podcast Appearances
I think that there is a spectrum, and on one side you have Mojo and on the other side you have GGML. GGML is like, we're going to run Llama fast on Mac. And okay, we're going to expand out a little bit, but we're going to basically go depth first, right? Mojo is like, we're going to go breadth first. We're going to go so wide that we're going to make all of Python fast.
And TinyGrad's in the middle. TinyGrad is, we are going to make neural networks fast.
Yeah, but they have Turing completeness.
My goal is step one, build an equally performant stack to PyTorch on NVIDIA and AMD, but with way fewer lines. And then step two is, okay, how do we make an accelerator, right? But you need step one. You have to first build the framework before you can build the accelerator.
So I'm much more of a, like, build it the right way and worry about performance later. There's a bunch of things where I haven't even, like, really dove into performance. The only place where TinyGrad is competitive performance-wise right now is on Qualcomm GPUs. So TinyGrad's actually used in openpilot to run the model. So the driving model is TinyGrad. When did that happen, that transition?
About eight months ago now. And it's 2x faster than Qualcomm's library.
It's a Snapdragon 845. Okay. So this is using the GPU. So the GPU is an Adreno GPU. There's, like, different things. There's a really good Microsoft paper that talks about mobile GPUs and why they're different from desktop GPUs. One of the big things is on a desktop GPU, you can use buffers. On a mobile GPU, image textures are a lot faster.