Dylan Patel
I almost think it's practically impossible. Because you effectively have to remove them from the internet.
It gets filtered out. So you have quality filters, which are small language models that look at a document and tell you how good this text is. Is it close to a Wikipedia article, which is a good example of the kind of text we want language models to be able to imitate?
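A minimal sketch of the idea behind such a quality filter. Here a toy naive-Bayes word classifier stands in for the "small language model" described above; the two tiny corpora, the function names, and the keep/filter threshold are all illustrative assumptions, not any lab's actual pipeline:

```python
# Toy quality filter: score a document by how much it resembles a
# "good" reference corpus (stand-in for Wikipedia) versus a "bad"
# one (stand-in for spam/boilerplate). Illustrative only.
from collections import Counter
import math

GOOD = [  # stand-in for high-quality reference text
    "the history of mathematics spans thousands of years",
    "photosynthesis converts light energy into chemical energy",
]
BAD = [   # stand-in for low-quality web text
    "click here buy now best price free free free",
    "win money now click subscribe like and share",
]

def counts(docs):
    c = Counter()
    for d in docs:
        c.update(d.split())
    return c

good_c, bad_c = counts(GOOD), counts(BAD)
good_n, bad_n = sum(good_c.values()), sum(bad_c.values())
V = len(set(good_c) | set(bad_c))  # vocabulary size for smoothing

def quality_score(doc):
    """Naive-Bayes log-odds that `doc` resembles the good corpus.
    Positive => keep the document; negative => filter it out."""
    score = 0.0
    for w in doc.split():
        p_good = (good_c[w] + 1) / (good_n + V)  # Laplace smoothing
        p_bad = (bad_c[w] + 1) / (bad_n + V)
        score += math.log(p_good / p_bad)
    return score

print(quality_score("the history of energy"))  # positive: kept
print(quality_score("click now free money"))   # negative: filtered
```

Real pipelines use much stronger classifiers (fastText-style models or small LMs scoring perplexity), but the shape is the same: a cheap model assigns each crawled document a score, and documents below a threshold never enter the training set.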
Yes, but is it going to catch wordplay or encoded language?
It'll have the ability to express it.
This is what happens. A lot of what is called post-training is a series of techniques to get the model on rails of a really specific behavior.
And there's a lot of history here, so we can go through multiple examples of what happened. Llama 2 was a launch where the phrase "too much RLHF" or "too much safety" came up a lot. That was the whole narrative after Llama 2's chat models were released. And the examples are the sorts of things like, you would ask Llama 2 chat, how do you kill a Python process?
And it would say, I can't talk about killing because that's a bad thing. And anyone that is trying to design an AI model will probably agree that that's just like, eh, model, you messed up a bit on the training there. I don't think they meant to do this, but this was in the model weights. So this is not, you know,
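For the record, the literal question the model refused has a perfectly benign answer. A minimal sketch, assuming `python3` is on your PATH and a POSIX shell:

```shell
# Start a long-running Python process in the background, then terminate it.
python3 -c "import time; time.sleep(60)" &
PID=$!                            # shell variable holding the child's process ID
kill "$PID"                       # polite SIGTERM; `kill -9` only as a last resort
wait "$PID" 2>/dev/null || true   # reap the child so it doesn't linger as a zombie
echo "terminated $PID"
```

Sending SIGTERM first gives the process a chance to clean up; SIGKILL (`kill -9`) cannot be caught or handled.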