
Nathan Lambert

👤 Speaker
1665 total appearances

Appearances Over Time

Podcast Appearances

Lex Fridman Podcast
#459 – DeepSeek, China, OpenAI, NVIDIA, xAI, TSMC, Stargate, and AI Megaclusters

So this is the quote-unquote visionary behind the company. The hedge fund, a quantitative firm, still exists. He slowly turned to this full view of AI, everything about this, and at some point he maneuvered and made DeepSeek. And DeepSeek has done multiple models since then.

Lex Fridman Podcast
#459 – DeepSeek, China, OpenAI, NVIDIA, xAI, TSMC, Stargate, and AI Megaclusters

They've acquired more and more GPUs. They share infrastructure with the fund. And so there is no exact public number for the GPU resources they have, beyond the 10,000 GPUs they bought in 2021. And they were fantastically profitable.

Lex Fridman Podcast
#459 – DeepSeek, China, OpenAI, NVIDIA, xAI, TSMC, Stargate, and AI Megaclusters

And then this paper claims they used only 2,000 H800 GPUs, which is a restricted GPU that was previously allowed in China but is no longer allowed; there's a new version now. But it's basically NVIDIA's H100 for China.

Lex Fridman Podcast
#459 – DeepSeek, China, OpenAI, NVIDIA, xAI, TSMC, Stargate, and AI Megaclusters

Right, and there are some restrictions on it, specifically around the communication speed, the interconnect speed, which is why they had to do this crazy SM scheduling stuff. So going back to that: this is obviously not true in terms of their total available GPU count, but for this training run, do you think 2,000 is the correct number or not? This is where it takes a significant amount of zoning in.

Lex Fridman Podcast
#459 – DeepSeek, China, OpenAI, NVIDIA, xAI, TSMC, Stargate, and AI Megaclusters

What do you call your training run? Do you count all of the research and ablations that you ran picking all this stuff? Because yes, you can do a YOLO run, but at some level you have to test at small scale, and then test at medium scale, before you go to large scale.

Lex Fridman Podcast
#459 – DeepSeek, China, OpenAI, NVIDIA, xAI, TSMC, Stargate, and AI Megaclusters

Yeah, and research begets the new ideas that let you get huge efficiency gains.

Lex Fridman Podcast
#459 – DeepSeek, China, OpenAI, NVIDIA, xAI, TSMC, Stargate, and AI Megaclusters

So the numbers that DeepSeek specifically stated publicly are just the 10,000 GPUs in 2021 and then 2,000 GPUs for only the pre-training of V3. They did not discuss the cost of R1. They did not discuss the cost of all the other RL for the instruct model they made.
