Nathan Lambert
And those are the bulk of what's being built. That's what's really reshaping things, and that's what's getting millions of GPUs. But the scale of the largest cluster is also really important. When we look back at history,
through the age of AI, it was a really big deal when they trained AlexNet on, I think, two or four GPUs, I don't remember exactly. It was a big deal because they used GPUs at all, and a big deal that they used multiple of them. But over time, the scale has just been compounding. Skip forward to GPT-3, and then GPT-4: GPT-4 was trained on 20,000
A100 GPUs, an unprecedented run in terms of size and cost, a couple hundred million dollars on a YOLO run for GPT-4. And it yielded this magical improvement that was perfectly in line with what had been experimented at smaller scale, just extended along a log scale. Oh yeah, they have that plot from the paper, the scaling of the technical performance.
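(As an aside, the idea behind that plot, predicting the big run's performance from much smaller runs, can be sketched in a few lines. This is a toy illustration with made-up numbers, not OpenAI's actual methodology: fit a power law loss(C) = a * C^b on small runs, then extrapolate in log-log space.)

```python
import numpy as np

# Made-up (compute, loss) points from hypothetical small-scale runs.
compute = np.array([1e18, 1e19, 1e20, 1e21])  # training FLOPs
loss = np.array([3.9, 3.1, 2.5, 2.0])         # final training loss

# A power law L(C) = a * C**b is a straight line in log-log space:
# log L = log a + b * log C, so a least-squares fit recovers (a, b).
b, log_a = np.polyfit(np.log(compute), np.log(loss), 1)
a = np.exp(log_a)

# Extrapolate three orders of magnitude past the largest small run.
big_c = 1e24
print(f"fit: L(C) = {a:.2f} * C^{b:.3f}")
print(f"predicted loss at C = {big_c:.0e}: {a * big_c ** b:.2f}")
```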
The scaling laws held perfectly. And 20,000 A100s is not a crazy number. Each GPU consumes roughly 400 watts, and when you add in the whole server around it, everything, it's something like 15 to 20 megawatts of power for the cluster. For reference, a human runs on roughly 100 watts, so the numbers are going to get silly.
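(To sanity-check that 15 to 20 megawatt figure, here's a back-of-the-envelope sketch. The GPU count and 400 W per-GPU draw come from the conversation above; the server-overhead multiplier and the data center PUE are assumed values for illustration.)

```python
# Rough power estimate for a 20,000-GPU A100 cluster.
NUM_GPUS = 20_000
A100_WATTS = 400        # per-GPU draw cited above

SERVER_OVERHEAD = 1.6   # assumed: CPUs, memory, networking, fans
PUE = 1.2               # assumed: cooling and power-delivery overhead

gpus_mw = NUM_GPUS * A100_WATTS / 1e6
total_mw = gpus_mw * SERVER_OVERHEAD * PUE

print(f"GPUs alone:      {gpus_mw:.1f} MW")   # 8.0 MW
print(f"All-in estimate: {total_mw:.1f} MW")  # ~15.4 MW, inside the 15-20 MW range
```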
But 15 to 20 megawatts was a standard data center size. What was unprecedented was that it was all GPUs running one task. How many watts is a toaster? A toaster is actually similar in power consumption to an A100. Then the H100 comes around, and they increase the power from about 400 to 700 watts per GPU, before counting all the associated hardware around it.
So once you count all of that, networking, CPUs, memory, and so on, it's roughly 1,200 to 1,400 watts per GPU for everything.
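(Running the same arithmetic per GPU generation shows why facility power, not GPU count, becomes the limit. The 1,200 to 1,400 W all-in H100 figure is from the conversation; the A100 all-in number and the 20 MW facility size are assumptions based on the "standard data center" figure above.)

```python
# How many GPUs, all-in, fit in a fixed-power facility?
FACILITY_MW = 20                # assumed: the "standard" size cited above
ALL_IN_WATTS = {
    "A100": 800,                # assumed: ~400 W GPU plus server overhead
    "H100": 1300,               # midpoint of the 1,200-1,400 W figure cited
}

for gpu, watts in ALL_IN_WATTS.items():
    max_gpus = FACILITY_MW * 1e6 / watts
    print(f"{gpu}: ~{max_gpus:,.0f} GPUs in a {FACILITY_MW} MW facility")
# A100: ~25,000. H100: ~15,385. Same building, far fewer next-gen GPUs.
```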
Yeah, sorry for skipping past that. And the data center itself is complicated. But at GPT-4 scale, these were still standardized data centers. Now step forward to the clusters people built last year, and the scale ranges widely.