And then they said, no, no, no, no, we're going to remove the interconnect bandwidth criterion and just make it a very simple flops-only limit. But now NVIDIA can make a chip that, okay, is cut down on flops. It's like one third that of the H100 on spec-sheet paper performance for flops. In the real world, it's closer to like half, or maybe even 60% of it.
But then on the other two vectors, it's just as good for interconnect bandwidth. And then for memory bandwidth and memory capacity, the H20 has more memory bandwidth and more memory capacity than the H100, right? Now, recently, you know, in our research, we cut our estimate of NVIDIA's H20 production for this year down drastically.
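As a quick way to see the shape of that tradeoff, here's a minimal sketch encoding only the rough ratios described above (spec-sheet flops around a third of an H100, real-world closer to half or 60%, interconnect about the same, memory bandwidth and capacity higher); these are the conversation's approximations, not official spec-sheet numbers.

```python
# H20 relative to H100, using only the rough ratios described above
# (not official spec-sheet values).
h20_vs_h100 = {
    "flops (spec sheet)":  0.33,   # ~one third of H100 on paper
    "flops (real world)":  0.55,   # closer to half, maybe 60%
    "interconnect bw":     1.0,    # just as good
    "memory bandwidth":    1.1,    # more than H100 (illustrative ratio)
    "memory capacity":     1.2,    # more than H100 (illustrative ratio)
}

for metric, ratio in h20_vs_h100.items():
    print(f"{metric:20s}: {ratio:.2f}x H100")
```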
They were going to make another 2 million of those this year, but they just canceled all the orders a couple of weeks ago. In our view, that's because they think they're going to get restricted. Why else would they cancel all these orders for the H20? They shipped a million of them last year.
They had orders in for a couple million this year of the H20 and the B20, a successor to the H20, and now they're all gone. Now, why would they do this? I think it's very clear: the H20 is actually better for certain tasks, and that certain task is reasoning, right? Reasoning is incredibly different than... when you look at the different regimes of models, right?
Pre-training is all about flops, right? It's all about flops. There are things you do, like the mixture of experts we talked about, to trade off other aspects and lower the flops and rely more on interconnect and memory. But at the end of the day, flops are everything, right? We talk about models in terms of how many flops they are, right?
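To make the "we talk about models in terms of flops" point concrete, here's a minimal sketch of the standard back-of-the-envelope estimate, training flops ≈ 6 × parameters × tokens; the parameter and token counts below are purely illustrative assumptions, not figures from the conversation.

```python
def training_flops(n_params: float, n_tokens: float) -> float:
    """Standard rough estimate: ~6 floating point operations
    per parameter per training token (forward + backward pass)."""
    return 6 * n_params * n_tokens

# Illustrative assumptions only: a ~300B-active-parameter model
# trained on ~13 trillion tokens.
n_params = 300e9
n_tokens = 13e12

print(f"{training_flops(n_params, n_tokens):.1e} flops")  # ~2.3e25
```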
So, like, you know, we talk about, oh, GPT-4 is 2e25, right? 2 times 10 to the 25th, you know, a 2 followed by 25 zeros, right? Flops, right? Floating point operations. For training. For training, right? And we're talking about the restrictions for 2e24, right?
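For a rough sense of scale of a 2e25-flop training run, here's a sketch converting that budget into GPU-days; the per-GPU throughput and utilization figures are assumptions (roughly H100-class dense BF16 peak and a typical training utilization), not numbers from the conversation.

```python
# Rough scale of a 2e25-flop training run.
# Assumptions (illustrative, not from the conversation):
#   ~1e15 flop/s peak per GPU (roughly H100-class dense BF16)
#   ~40% utilization of that peak during real training
total_flops = 2e25
peak_flops_per_gpu = 1e15
utilization = 0.40

effective_flops_per_gpu = peak_flops_per_gpu * utilization
gpu_seconds = total_flops / effective_flops_per_gpu
gpu_days = gpu_seconds / 86400

print(f"{gpu_days:,.0f} GPU-days")                  # ~579,000 GPU-days
print(f"{gpu_days / 20_000:.0f} days on 20k GPUs")  # ~29 days
```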
The US has an executive order, which Trump recently rescinded, that said, hey, 1e26: once you hit that number of floating point operations, you must notify the government and you must share your results with us, right? There's a level of model where the US government must be told, right? And that's 1e26.
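And to see what crossing the 1e26 reporting threshold would take in practice, here's a minimal sketch under the same assumed per-GPU throughput and utilization as above; the cluster sizes and durations are hypothetical.

```python
# Would a hypothetical run have crossed the (now-rescinded) 1e26
# reporting threshold? Same illustrative assumptions as above.
REPORTING_THRESHOLD = 1e26
peak_flops_per_gpu = 1e15   # ~H100-class dense BF16 peak (assumed)
utilization = 0.40          # assumed training utilization

def run_flops(n_gpus: int, days: float) -> float:
    """Total training flops for a cluster running for a given duration."""
    return n_gpus * peak_flops_per_gpu * utilization * days * 86400

for n_gpus, days in [(20_000, 30), (100_000, 30)]:
    flops = run_flops(n_gpus, days)
    crossed = flops >= REPORTING_THRESHOLD
    print(f"{n_gpus:>7,} GPUs x {days} days -> {flops:.1e} flops, "
          f"crosses 1e26: {crossed}")
```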