Nathan Lambert
To give context, right? Part of why everyone was freaking out about this was that they reached these capabilities at all. The other aspect is that they did it so cheap, right? And the "so cheap" part, we kind of talked about on the training side, why it was so cheap.
So I think there's a couple factors here, right? One is that they do have model architecture innovations, right? This MLA, this new attention that they've done, is different from the attention in "Attention Is All You Need," the original transformer attention, right? Now, others have already innovated here too.
There's a lot of work like MQA, GQA, local-global, all these different innovations that try to bend the curve, right? It's still quadratic, but the constant is now smaller, right?
It's an 80% to 90% reduction versus the original, but even versus what people are actually doing, it's still an innovation.
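The memory savings being discussed can be sketched with some back-of-envelope KV-cache arithmetic. The model dimensions below are illustrative assumptions, not any specific model's real configuration; the point is only how sharing KV heads (GQA, MQA) shrinks the cache relative to standard multi-head attention (MHA).

```python
# Back-of-envelope KV-cache sizes for multi-head (MHA), grouped-query (GQA),
# and multi-query (MQA) attention. All dimensions are illustrative assumptions.

def kv_cache_bytes(n_layers, n_kv_heads, head_dim, seq_len, bytes_per_elem=2):
    """Bytes needed to cache keys and values for one sequence (fp16 by default)."""
    return 2 * n_layers * n_kv_heads * head_dim * seq_len * bytes_per_elem

n_layers, n_heads, head_dim, seq_len = 32, 32, 128, 4096

mha = kv_cache_bytes(n_layers, n_heads, head_dim, seq_len)  # one KV head per query head
gqa = kv_cache_bytes(n_layers, 8, head_dim, seq_len)        # 8 shared KV heads
mqa = kv_cache_bytes(n_layers, 1, head_dim, seq_len)        # a single shared KV head

for name, size in [("MHA", mha), ("GQA", gqa), ("MQA", mqa)]:
    print(f"{name}: {size / 2**30:.2f} GiB, {100 * (1 - size / mha):.0f}% saved vs MHA")
```

Note the attention computation itself is still quadratic in sequence length; what shrinks is the constant, the per-token memory (and bandwidth) cost of the cache.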
Well, and not just that, right? Other people have implemented techniques like local-global, sliding window, and GQA/MQA. But anyways, DeepSeek has their attention mechanism, MLA, as a true architectural innovation. They did tons of experimentation, and it dramatically reduces the memory pressure. It's still there, right? It's still attention. It's still quadratic.
It's just dramatically reduced it relative to prior forms.
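The core memory idea behind MLA can be sketched as follows: instead of caching full per-head keys and values, cache one low-rank latent vector per token and up-project it back to keys and values at attention time. This is a minimal sketch of the low-rank compression idea only; the dimensions and weight names are illustrative assumptions, not DeepSeek's exact configuration.

```python
# Sketch of the memory idea behind multi-head latent attention (MLA):
# cache a compressed latent per token, re-expand to K and V on the fly.
# Dimensions are illustrative assumptions, not DeepSeek's real config.
import numpy as np

d_model, n_heads, head_dim, d_latent, seq_len = 4096, 32, 128, 512, 8

rng = np.random.default_rng(0)
h = rng.standard_normal((seq_len, d_model))  # token hidden states

W_down = rng.standard_normal((d_model, d_latent)) * 0.02             # compress to latent
W_up_k = rng.standard_normal((d_latent, n_heads * head_dim)) * 0.02  # re-expand keys
W_up_v = rng.standard_normal((d_latent, n_heads * head_dim)) * 0.02  # re-expand values

c_kv = h @ W_down   # this latent is all that needs to be cached
k = c_kv @ W_up_k   # full keys recovered at attention time
v = c_kv @ W_up_v   # full values recovered at attention time

full_cache = 2 * n_heads * head_dim  # floats cached per token with standard MHA (K and V)
mla_cache = d_latent                 # floats cached per token with MLA
print(f"cache floats per token: MHA {full_cache}, MLA {mla_cache} "
      f"({full_cache / mla_cache:.0f}x smaller)")
```

With these assumed dimensions, the per-token cache shrinks by a large constant factor, while the attention computation itself remains quadratic, which matches the point being made here.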
So I think this is very important, right? There's that drastic gap between OpenAI's pricing and DeepSeek's. And DeepSeek is offering the same model, because they open-weighted it, to everyone else, at a much lower price than what others are able to serve it for.