
Nathan Lambert

Speaker
1665 total appearances

Appearances Over Time

Podcast Appearances

Lex Fridman Podcast
#459 – DeepSeek, China, OpenAI, NVIDIA, xAI, TSMC, Stargate, and AI Megaclusters

And then the ethical aspect of it is like, why is it unethical for me to train on your model when you can train on the Internet's text?


Lex Fridman Podcast
#459 – DeepSeek, China, OpenAI, NVIDIA, xAI, TSMC, Stargate, and AI Megaclusters

This is why a lot of models today, even if they train on zero OpenAI data, you ask the model who trained you, it'll say, I am ChatGPT, trained by OpenAI. Because there's so much copy-paste of OpenAI outputs on the internet that you just weren't able to filter it out. And there was nothing in the RL where they implemented, like, post-training or SFT, whatever, that says...
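The "weren't able to filter it out" point can be made concrete with a naive contamination filter: drop web documents containing model self-identification strings, such as copy-pasted ChatGPT outputs. This is an illustrative sketch, not any lab's actual pipeline; the phrase list is made up for the example, and real filters use far broader heuristics yet still miss paraphrases, which is why the identity leaks through anyway.

```python
# Naive pretraining-data contamination filter (illustrative sketch).
# Flags documents containing known model self-identification phrases.
import re

# Hypothetical phrase list; real pipelines would use many more signals.
SELF_ID_PATTERNS = [
    r"\bas an ai language model\b",
    r"\bi am chatgpt\b",
    r"\btrained by openai\b",
]
_SELF_ID_RE = re.compile("|".join(SELF_ID_PATTERNS), re.IGNORECASE)

def looks_contaminated(doc: str) -> bool:
    """True if the document contains a known self-identification phrase."""
    return _SELF_ID_RE.search(doc) is not None

docs = [
    "Here is my pasta recipe.",
    "Sure! As an AI language model, I can help with that.",
]
clean = [d for d in docs if not looks_contaminated(d)]
assert clean == ["Here is my pasta recipe."]
```

Exact-match filters like this catch verbatim copies but not lightly edited ones, which matches the observation in the quote: some ChatGPT-flavored text always survives into pretraining.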


Lex Fridman Podcast
#459 – DeepSeek, China, OpenAI, NVIDIA, xAI, TSMC, Stargate, and AI Megaclusters

...hey, I'm actually a model by the Allen Institute instead of OpenAI.


Lex Fridman Podcast
#459 – DeepSeek, China, OpenAI, NVIDIA, xAI, TSMC, Stargate, and AI Megaclusters

I think everyone has benefited regardless, because the data's on the internet, and therefore it's in your pre-training now. There are subreddits where people share the best ChatGPT outputs, and those are in your model.


Lex Fridman Podcast
#459 – DeepSeek, China, OpenAI, NVIDIA, xAI, TSMC, Stargate, and AI Megaclusters

Actually, over the last couple of days, we've seen a lot of people distill DeepSeek's model into Llama models, because the DeepSeek models are kind of complicated to run inference on: they're a mixture of experts, and they're 600-plus billion parameters and all this. And people distill them into the Llama models because...
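The distillation being described can be sketched in miniature. This is a toy illustration of logit distillation with made-up teacher and student logits, not any real DeepSeek-to-Llama recipe (those typically fine-tune the student on teacher-generated text): the student is trained to match the teacher's temperature-softened output distribution via KL divergence.

```python
# Toy sketch of the distillation objective: KL divergence between the
# teacher's and student's temperature-softened output distributions.
import numpy as np

def softmax(z, temperature=1.0):
    z = np.asarray(z, dtype=float) / temperature
    z = z - z.max(axis=-1, keepdims=True)  # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) over softened distributions; the student
    minimizes this to mimic the teacher's behavior."""
    p = softmax(teacher_logits, temperature)  # teacher targets
    q = softmax(student_logits, temperature)  # student predictions
    return float(np.sum(p * (np.log(p + 1e-12) - np.log(q + 1e-12))))

teacher = [2.0, 0.5, -1.0]
# Zero loss when the student already matches the teacher...
assert distillation_loss(teacher, teacher) < 1e-9
# ...and positive loss when it does not.
assert distillation_loss([0.0, 0.0, 0.0], teacher) > 0.0
```

The appeal in the quote is that the student can be a plain dense model (easy to serve) even when the teacher is a huge mixture-of-experts model.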


Lex Fridman Podcast
#459 – DeepSeek, China, OpenAI, NVIDIA, xAI, TSMC, Stargate, and AI Megaclusters

Because the Llama models are so easy to serve, and everyone's built the pipelines and tooling for inference with the Llama models, right? Because it's the open standard. So, you know, we've seen it. We've seen a sort of roundabout, right? Like, is it bad? Is it illegal? Maybe it's illegal, whatever. I don't know about that.


Lex Fridman Podcast
#459 – DeepSeek, China, OpenAI, NVIDIA, xAI, TSMC, Stargate, and AI Megaclusters

I agree. I have a schizo take on how you can solve this, because it already works. I have a reasonable take on it. A, Japan has a law where you're allowed to train on any training data and copyright doesn't apply if you want to train a model. B, Japan has 9 gigawatts of curtailed nuclear power. C, Japan is allowed under the AI diffusion rule to import as many GPUs as they'd like.
