
Nathan Lambert

Speaker
1665 total appearances


Podcast Appearances

Lex Fridman Podcast
#459 – DeepSeek, China, OpenAI, NVIDIA, xAI, TSMC, Stargate, and AI Megaclusters

And those are the bulk of what's being built. But the scale of... And so that's like what's really reshaping and that's what's getting millions of GPUs. But the scale of the largest cluster is also really important, right? When we look back at history, right? Like

Lex Fridman Podcast
#459 – DeepSeek, China, OpenAI, NVIDIA, xAI, TSMC, Stargate, and AI Megaclusters

You know, through the age of AI, it was a really big deal when they did AlexNet on, I think, two GPUs or four GPUs, I don't remember. It was a really big deal because they used GPUs, and they used multiple, right? But then over time the scale has just been compounding. And so when you skip forward to GPT-3, then GPT-4... GPT-4, 20,000

Lex Fridman Podcast
#459 – DeepSeek, China, OpenAI, NVIDIA, xAI, TSMC, Stargate, and AI Megaclusters

A100 GPUs. An unprecedented run, right, in terms of the size and the cost: a couple hundred million dollars on a YOLO run for GPT-4. And it yielded, you know, this magical improvement that was perfectly in line with what was experimented, just on a log scale. Oh yeah, they have that plot from the paper, the scaling of the technical performance.

Lex Fridman Podcast
#459 – DeepSeek, China, OpenAI, NVIDIA, xAI, TSMC, Stargate, and AI Megaclusters

The scaling laws were perfect, right? But that's not a crazy number, right? 20,000 A100s, roughly each GPU is consuming 400 watts. And then when you add in the whole server, right, everything, it's like 15 to 20 megawatts of power, right? You know, maybe you could look up what the power consumption of a human person is, because the numbers are going to get silly.

Lex Fridman Podcast
#459 – DeepSeek, China, OpenAI, NVIDIA, xAI, TSMC, Stargate, and AI Megaclusters

But like 15 to 20 megawatts was standard data center size. It was just unprecedented. That was all GPUs running one task. How many watts was a toaster? A toaster is like a similar power consumption to an A100, right? H100 comes around, they increase the power from like 400 to 700 watts, and that's just per GPU, and then there's all the associated stuff around it.

Lex Fridman Podcast
#459 – DeepSeek, China, OpenAI, NVIDIA, xAI, TSMC, Stargate, and AI Megaclusters

So once you count all that, it's roughly like 1200 to 1400 watts for everything, networking, CPUs, memory, blah, blah, blah.
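The figures quoted across these excerpts can be sanity-checked with some quick arithmetic. This is an illustrative sketch, not a measured breakdown: the 750 W all-in figure for the A100 era is an assumption chosen so that 20,000 GPUs lands in the stated 15–20 MW range, and the H100 number is the midpoint of the quoted 1,200–1,400 W.

```python
# Rough sanity check of the cluster-power figures quoted in the episode.
# GPU counts and wattages come from the conversation; the per-GPU "all-in"
# server overhead values are illustrative assumptions, not measurements.

A100_GPU_WATTS = 400        # per-GPU draw quoted for the A100
A100_ALL_IN_WATTS = 750     # assumed all-in per GPU, incl. server share
H100_ALL_IN_WATTS = 1300    # midpoint of the quoted 1,200-1,400 W figure

NUM_GPUS = 20_000           # GPT-4-era cluster size quoted above

def cluster_megawatts(num_gpus: int, watts_per_gpu: float) -> float:
    """Total cluster draw in megawatts at a given all-in per-GPU wattage."""
    return num_gpus * watts_per_gpu / 1e6

a100_mw = cluster_megawatts(NUM_GPUS, A100_ALL_IN_WATTS)
h100_mw = cluster_megawatts(NUM_GPUS, H100_ALL_IN_WATTS)

print(f"A100-era cluster: ~{a100_mw:.0f} MW")   # ~15 MW, in the quoted range
print(f"Same count of H100s: ~{h100_mw:.0f} MW")
```

Note how the jump from 400 W GPUs to 1,200–1,400 W all-in per GPU roughly doubles total cluster draw at the same GPU count, which is why the conversation turns to power next.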

Lex Fridman Podcast
#459 – DeepSeek, China, OpenAI, NVIDIA, xAI, TSMC, Stargate, and AI Megaclusters

Yeah, so I think, yeah, sorry for skipping past that. And then the data center itself is complicated, right? But these are still standardized data centers for GPT-4 scale, right? Now we step forward to sort of what is the scale of clusters that people built last year? And it ranges widely.
