Noam Shazeer
Podcast Appearances
So it's not like the company was going to do this one little thing and stay doing that.
And also, you could see that what we were doing initially was in that direction, but you could do so much more in that direction.
I mean, I think of it as actually changing quite a bit in the last couple of decades.
So, from two decades ago to one decade ago, it was awesome because you'd just wait, and 18 months later you'd get much faster hardware without having to do anything.
And then more recently, I feel like the scaling of general-purpose CPU-based machines has not been as good.
The fabrication process improvements are now taking three years instead of two.
The architectural improvements in multi-core processors and so on aren't giving you the same boost we were getting 20 to 10 years ago.
But I think at the same time, we're seeing improvements in much more specialized computational devices: machine learning accelerators like TPUs, and more recently very ML-focused GPUs, are making it so we can actually get really high performance and good efficiency out of the more modern kinds of computations we want to run, which are different from a twisty pile of C++ code trying to run Microsoft Office or something.
Well, I would say that the pivot to hardware oriented around that was an important transition, because before that we had CPUs and GPUs that were not especially well suited for deep learning. Then we started to build, say, TPUs at Google that were really just reduced-precision linear algebra machines.
And then once you have that, then you want to... Right.
And, by the way, the arithmetic can be really low precision, so you can squeeze even more multiplier units in.
You'd have a lot more lookups into very large memories.
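The point about low-precision multipliers can be sketched in a few lines: a minimal illustration (not TPU internals) of how matrix units typically multiply 8-bit integers but accumulate the products in 32-bit integers to avoid overflow. The narrow 8-bit multipliers are exactly what lets hardware pack many more of them into the same silicon area.

```python
import numpy as np

# Illustrative only: int8 x int8 matrix multiply with an int32
# accumulator, the pattern systolic-array matrix units typically use.
a = np.random.randint(-128, 128, size=(4, 8), dtype=np.int8)
b = np.random.randint(-128, 128, size=(8, 4), dtype=np.int8)

# Widen to int32 before multiplying so per-element products and the
# running sums along the reduction dimension cannot overflow.
acc = a.astype(np.int32) @ b.astype(np.int32)
print(acc.dtype, acc.shape)
```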
Yeah.
I mean, I think one general trend is that we're getting better at quantizing, or at running models at much more reduced precision.
You know, we started with TPU v1, where we weren't even quite sure we could quantize a model for serving with 8-bit integers.
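The kind of quantization being described here can be sketched as follows: a hedged example of the generic post-training technique (a single per-tensor scale, symmetric range), not Google's actual TPU v1 serving scheme. Float weights are mapped onto int8, and dequantizing recovers them to within the scale factor.

```python
import numpy as np

def quantize_int8(w):
    """Map float weights onto int8 with one per-tensor scale factor.

    The largest-magnitude weight maps to 127; everything else is
    rounded to the nearest representable step of size `scale`.
    """
    scale = np.abs(w).max() / 127.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    """Recover approximate float weights from int8 codes."""
    return q.astype(np.float32) * scale

w = np.random.randn(256, 256).astype(np.float32)
q, scale = quantize_int8(w)
w_hat = dequantize(q, scale)

# Rounding error is at most half a quantization step.
print(q.dtype, float(np.abs(w - w_hat).max()) <= 0.5 * scale + 1e-7)
```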