George Hotz
Podcast Appearances
So we have Nvidia and AMD. Great.
Have you seen it? Google loves to rent you TPUs.
So I started work on it. I was like, okay, what's it going to take to make a chip? And my first notions were all completely wrong about how you could improve on GPUs. I'll take this from Jim Keller on your podcast; it's one of my absolute favorite descriptions of computation.
So there's three kinds of computation paradigms that are common in the world today. There's CPUs, and CPUs can do everything. CPUs can do add and multiply, they can do load and store, and they can do compare and branch. And when I say they can do these things, they can do them all fast, right?
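As a concrete illustration of the compare-and-branch paradigm (my sketch, not from the talk): a Collatz-style loop is the kind of code only a CPU runs fast, because the control flow depends entirely on the data.

```python
# Compare-and-branch: the next step depends on the current value,
# so the control flow cannot be known before the program runs.
# CPUs spend huge transistor budgets (branch predictors,
# speculative execution, reorder buffers) to make this fast.
def collatz_steps(n: int) -> int:
    steps = 0
    while n != 1:          # compare and branch
        if n % 2 == 0:     # compare and branch
            n //= 2
        else:
            n = 3 * n + 1  # add and multiply
        steps += 1
    return steps

print(collatz_steps(27))  # 111
```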
So compare and branch are unique to CPUs, and what I mean by they can do them fast is they can do things like branch prediction and speculative execution, and they spend tons of transistors on these super deep reorder buffers in order to make these things fast. Then you have a simpler computation model, GPUs. GPUs can't really do compare and branch. I mean, they can, but it's horrendously slow.
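A hedged sketch of what that means in practice (numpy standing in for GPU-style SIMD here): instead of branching per element, GPU code typically computes both sides of a branch and selects with a mask, so divergent branches cost you both paths.

```python
import numpy as np

x = np.array([1.0, -2.0, 3.0, -4.0])

# CPU-style: a real branch per element (data-dependent control flow).
out_branchy = np.array([v * 2.0 if v > 0 else v * -0.5 for v in x])

# GPU-style: no per-element branch. Evaluate BOTH sides for every
# element, then select with a mask. Divergence means paying for
# both paths, which is why branchy code is slow on GPUs.
out_masked = np.where(x > 0, x * 2.0, x * -0.5)

assert np.allclose(out_branchy, out_masked)
```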
But GPUs can do arbitrary load and store. GPUs can do things like x[y], dereference y. So they can fetch from arbitrary pieces of memory; they can fetch from memory addresses that are defined by the contents of the data. The third model of computation is DSPs, and DSPs are just add and multiply. They can do loads and stores, but only static loads and stores.
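In code, that "x dereference y" pattern is just a gather: the addresses being loaded come from the data itself. A small sketch (mine, for illustration):

```python
import numpy as np

x = np.array([10.0, 20.0, 30.0, 40.0])
y = np.array([3, 0, 2])  # the indices live in the DATA

# GPU-style arbitrary load: which elements of x get fetched is
# only known at runtime, because it depends on the contents of y.
gathered = x[y]           # -> [40., 10., 30.]

# DSP-style static load: the addresses are fixed in the program
# text, fully known before anything runs.
static = x[0] + x[1]      # -> 30.0
```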
Only loads and stores that are known before the program runs. And you look at neural networks today, and 95% of neural networks are all the DSP paradigm: they are just statically scheduled adds and multiplies. So tinygrad really took this idea, and I'm still working on it, to extend this as far as possible. Every stage of the stack has Turing completeness.
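To make "statically scheduled adds and multiplies" concrete, here's a minimal sketch (my illustration, not tinygrad code): a matrix multiply where every load and store address is a pure function of the loop counters, so the entire schedule is known before the program runs.

```python
import numpy as np

def matmul(a: np.ndarray, b: np.ndarray) -> np.ndarray:
    n, k = a.shape
    k2, m = b.shape
    assert k == k2
    out = np.zeros((n, m))
    # Every address below (a[i, l], b[l, j], out[i, j]) depends only
    # on the loop counters, never on the data. No compares, no
    # branches, no data-dependent loads: pure DSP-paradigm compute.
    for i in range(n):
        for j in range(m):
            for l in range(k):
                out[i, j] += a[i, l] * b[l, j]  # add and multiply
    return out

a = np.arange(6.0).reshape(2, 3)
b = np.arange(12.0).reshape(3, 4)
assert np.allclose(matmul(a, b), a @ b)
```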