
Nathan Lambert

👤 Speaker
1665 total appearances

Appearances Over Time

Podcast Appearances

Lex Fridman Podcast
#459 – DeepSeek, China, OpenAI, NVIDIA, xAI, TSMC, Stargate, and AI Megaclusters

And then they said, no, no, we're going to remove the interconnect bandwidth restriction and make it very simple: only flops. But now NVIDIA can make a chip that's cut down on flops. It's about one third of the H100 on spec-sheet paper performance for flops. In the real world, it's closer to half, or maybe even 60%, of it.
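The spec-sheet versus real-world gap the speaker describes can be checked with quick arithmetic. This sketch uses only the rough ratios stated in the quote (one third on paper, half to 60% in practice), not official benchmark numbers:

```python
# Ratios taken from the quote, not from spec sheets or benchmarks.
paper_ratio = 1 / 3                          # H20 : H100 flops on paper
real_ratio_low, real_ratio_high = 0.5, 0.6   # H20 : H100 in real workloads

# How much better the H20 fares in practice than its spec sheet implies:
uplift_low = real_ratio_low / paper_ratio    # 0.5 / (1/3) = 1.5x
uplift_high = real_ratio_high / paper_ratio  # 0.6 / (1/3), roughly 1.8x
print(f"H20 real-world uplift over spec sheet: {uplift_low:.1f}x to {uplift_high:.1f}x")
```

In other words, under the quote's numbers, the H20 recovers 1.5x to 1.8x more of the gap to the H100 in practice than its paper flops suggest.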

Lex Fridman Podcast
#459 – DeepSeek, China, OpenAI, NVIDIA, xAI, TSMC, Stargate, and AI Megaclusters

But then on the other two vectors, it's just as good: for interconnect bandwidth, and then for memory bandwidth and memory capacity, the H20 has more memory bandwidth and more memory capacity than the H100, right? Now, recently, in our research, we cut our estimate of NVIDIA's H20 production for this year down drastically.
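The comparison the speaker is making runs along three vectors. A small summary of it, using only the qualitative claims in the transcript (no official spec-sheet figures):

```python
# H20 relative to H100, per the transcript's claims only.
h20_vs_h100 = {
    "flops": "about 1/3 on paper, roughly 50-60% in real workloads",
    "interconnect_bandwidth": "just as good as the H100",
    "memory_bandwidth": "higher than the H100",
    "memory_capacity": "higher than the H100",
}
for vector, comparison in h20_vs_h100.items():
    print(f"{vector}: {comparison}")
```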

Lex Fridman Podcast
#459 – DeepSeek, China, OpenAI, NVIDIA, xAI, TSMC, Stargate, and AI Megaclusters

They were going to make another 2 million of those this year, but they just canceled all the orders a couple of weeks ago. In our view, that's because they think they're going to get restricted. Why else would they cancel all these orders for the H20? They shipped a million of them last year.

Lex Fridman Podcast
#459 – DeepSeek, China, OpenAI, NVIDIA, xAI, TSMC, Stargate, and AI Megaclusters

They had orders in for a couple million this year for the H20 and the B20, a successor to the H20. And now they're all gone. Now, why would they do this? I think it's very clear: the H20 is actually better for certain tasks, and that certain task is reasoning, right? Reasoning is incredibly different than... When you look at the different regimes of models, right?

Lex Fridman Podcast
#459 – DeepSeek, China, OpenAI, NVIDIA, xAI, TSMC, Stargate, and AI Megaclusters

Pre-training is all about flops, right? It's all about flops. There are things you do, like the mixture of experts we talked about, to trade off other aspects: lower the flops and rely more on interconnect and memory. But at the end of the day, flops is everything, right? We talk about models in terms of how many flops they are, right?
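Sizing models "in terms of how many flops they are" usually relies on a rule of thumb that is not stated in the transcript but is standard in the field: dense-transformer pre-training compute is roughly 6 × parameters × training tokens. A sketch with purely hypothetical model numbers:

```python
# Common approximation (not from the transcript): for a dense transformer,
# pre-training compute is about 6 FLOP per parameter per training token.
def approx_training_flop(params: float, tokens: float) -> float:
    return 6 * params * tokens

# Hypothetical example: a 70B-parameter model trained on 15T tokens.
flop = approx_training_flop(70e9, 15e12)
print(f"{flop:.1e}")  # roughly 6.3e+24 FLOP
```

That lands in the same E24-to-E25 range as the model scales discussed later in the conversation.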

Lex Fridman Podcast
#459 – DeepSeek, China, OpenAI, NVIDIA, xAI, TSMC, Stargate, and AI Megaclusters

So, like, you know, we talk about, oh, GPT-4 is 2E25, right? 2 times 10 to the 25th, you know, 25 zeros, right? Flop, right? Floating point operations. For training. For training, right? And we're talking about the restrictions at the 2E24 level, right?
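The E-notation here is ordinary scientific notation: 2E25 means 2 × 10²⁵, a 2 followed by 25 zeros. A quick check of the two scales mentioned:

```python
# 2E25 is scientific notation: 2 * 10**25, a 2 followed by 25 zeros.
gpt4_training_flop = 2e25   # training compute cited for GPT-4
assert gpt4_training_flop == float(2 * 10**25)

# The restriction scale mentioned, 2E24, is one order of magnitude smaller:
restricted_scale = 2e24
print(round(gpt4_training_flop / restricted_scale))  # -> 10
```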

Lex Fridman Podcast
#459 – DeepSeek, China, OpenAI, NVIDIA, xAI, TSMC, Stargate, and AI Megaclusters

The US had an executive order, which Trump recently rescinded, that said: once you hit 1E26 floating point operations, you must notify the government and share your results, right? There's a level of model where the US government must be told, and that's 1E26.
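The reporting rule described here reduces to a single threshold comparison. A minimal sketch, with a hypothetical helper name and deliberately ignoring the order's other conditions:

```python
REPORTING_THRESHOLD_FLOP = 1e26  # threshold cited in the transcript

def must_notify_government(training_flop: float) -> bool:
    """Under the since-rescinded executive order described in the quote,
    a training run at or beyond 1e26 FLOP triggered a reporting duty.
    (Hypothetical helper; the real order had additional conditions.)"""
    return training_flop >= REPORTING_THRESHOLD_FLOP

print(must_notify_government(2e25))  # GPT-4-scale run -> False
print(must_notify_government(3e26))  # -> True
```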
