Menu
Sign In Search Podcasts Charts People & Topics Add Podcast API Blog Pricing

Dylan Patel

๐Ÿ‘ค Speaker
See mentions of this person in podcasts
3551 total appearances

Appearances Over Time

Podcast Appearances

Lex Fridman Podcast
#459 โ€“ DeepSeek, China, OpenAI, NVIDIA, xAI, TSMC, Stargate, and AI Megaclusters

We've talked a lot about training language models. They are trained on text. In post-training, you're trying to train on very high-quality text that you want the model to match the features of, or if you're using RL, you're letting the model find its own thing. But for supervised fine-tuning, for preference data, you need to have some completions what the model is trying to learn to imitate.

Lex Fridman Podcast
#459 โ€“ DeepSeek, China, OpenAI, NVIDIA, xAI, TSMC, Stargate, and AI Megaclusters

And what you do there is instead of a human data or instead of the model you're currently training, you take completions from a different, normally more powerful model. I think there's rumors that these big models that people are waiting for, these GPT-5s of the world, the CLOD-3 opuses of the world are used internally to do this distillation process.

Lex Fridman Podcast
#459 โ€“ DeepSeek, China, OpenAI, NVIDIA, xAI, TSMC, Stargate, and AI Megaclusters

And what you do there is instead of a human data or instead of the model you're currently training, you take completions from a different, normally more powerful model. I think there's rumors that these big models that people are waiting for, these GPT-5s of the world, the CLOD-3 opuses of the world are used internally to do this distillation process.

Lex Fridman Podcast
#459 โ€“ DeepSeek, China, OpenAI, NVIDIA, xAI, TSMC, Stargate, and AI Megaclusters

And what you do there is instead of a human data or instead of the model you're currently training, you take completions from a different, normally more powerful model. I think there's rumors that these big models that people are waiting for, these GPT-5s of the world, the CLOD-3 opuses of the world are used internally to do this distillation process.

Lex Fridman Podcast
#459 โ€“ DeepSeek, China, OpenAI, NVIDIA, xAI, TSMC, Stargate, and AI Megaclusters

This is a long, at least in the academic side and research side, it's a long history because you're trying to interpret OpenAI's rule. OpenAI's terms of service say that you cannot build a competitor with outputs from their models. Terms of service are different than a license, which are essentially a contract between organizations.

Lex Fridman Podcast
#459 โ€“ DeepSeek, China, OpenAI, NVIDIA, xAI, TSMC, Stargate, and AI Megaclusters

This is a long, at least in the academic side and research side, it's a long history because you're trying to interpret OpenAI's rule. OpenAI's terms of service say that you cannot build a competitor with outputs from their models. Terms of service are different than a license, which are essentially a contract between organizations.

Lex Fridman Podcast
#459 โ€“ DeepSeek, China, OpenAI, NVIDIA, xAI, TSMC, Stargate, and AI Megaclusters

This is a long, at least in the academic side and research side, it's a long history because you're trying to interpret OpenAI's rule. OpenAI's terms of service say that you cannot build a competitor with outputs from their models. Terms of service are different than a license, which are essentially a contract between organizations.

Lex Fridman Podcast
#459 โ€“ DeepSeek, China, OpenAI, NVIDIA, xAI, TSMC, Stargate, and AI Megaclusters

So if you have a terms of service on OpenAI's account, if I violate it, OpenAI can cancel my account. This is very different than like a license that says how you could use a downstream artifact. So a lot of it hinges on a word that is very unclear in the AI space, which is what is a competitor.

Lex Fridman Podcast
#459 โ€“ DeepSeek, China, OpenAI, NVIDIA, xAI, TSMC, Stargate, and AI Megaclusters

So if you have a terms of service on OpenAI's account, if I violate it, OpenAI can cancel my account. This is very different than like a license that says how you could use a downstream artifact. So a lot of it hinges on a word that is very unclear in the AI space, which is what is a competitor.

Lex Fridman Podcast
#459 โ€“ DeepSeek, China, OpenAI, NVIDIA, xAI, TSMC, Stargate, and AI Megaclusters

So if you have a terms of service on OpenAI's account, if I violate it, OpenAI can cancel my account. This is very different than like a license that says how you could use a downstream artifact. So a lot of it hinges on a word that is very unclear in the AI space, which is what is a competitor.

Lex Fridman Podcast
#459 โ€“ DeepSeek, China, OpenAI, NVIDIA, xAI, TSMC, Stargate, and AI Megaclusters

There's also a clear loophole, which is that I generate data from open AI and then I upload it somewhere and then somebody else trains on it and the link has been broken. Like they're not under the same terms of service contract. There's a lot of hip hop. There's a lot of like to be discovered details that don't make a lot of sense.

Lex Fridman Podcast
#459 โ€“ DeepSeek, China, OpenAI, NVIDIA, xAI, TSMC, Stargate, and AI Megaclusters

There's also a clear loophole, which is that I generate data from open AI and then I upload it somewhere and then somebody else trains on it and the link has been broken. Like they're not under the same terms of service contract. There's a lot of hip hop. There's a lot of like to be discovered details that don't make a lot of sense.

Lex Fridman Podcast
#459 โ€“ DeepSeek, China, OpenAI, NVIDIA, xAI, TSMC, Stargate, and AI Megaclusters

There's also a clear loophole, which is that I generate data from open AI and then I upload it somewhere and then somebody else trains on it and the link has been broken. Like they're not under the same terms of service contract. There's a lot of hip hop. There's a lot of like to be discovered details that don't make a lot of sense.

Lex Fridman Podcast
#459 โ€“ DeepSeek, China, OpenAI, NVIDIA, xAI, TSMC, Stargate, and AI Megaclusters

We have to do this if we serve a demo. We do research and we use OpenAI APIs because it's useful and we want to understand post-training. And like our research models, they will say they're written by OpenAI unless we put in the system prop that we talked about that like, I am Tulu. I am a language model trained by the Allen Institute for AI.

Lex Fridman Podcast
#459 โ€“ DeepSeek, China, OpenAI, NVIDIA, xAI, TSMC, Stargate, and AI Megaclusters

We have to do this if we serve a demo. We do research and we use OpenAI APIs because it's useful and we want to understand post-training. And like our research models, they will say they're written by OpenAI unless we put in the system prop that we talked about that like, I am Tulu. I am a language model trained by the Allen Institute for AI.

Lex Fridman Podcast
#459 โ€“ DeepSeek, China, OpenAI, NVIDIA, xAI, TSMC, Stargate, and AI Megaclusters

We have to do this if we serve a demo. We do research and we use OpenAI APIs because it's useful and we want to understand post-training. And like our research models, they will say they're written by OpenAI unless we put in the system prop that we talked about that like, I am Tulu. I am a language model trained by the Allen Institute for AI.

Lex Fridman Podcast
#459 โ€“ DeepSeek, China, OpenAI, NVIDIA, xAI, TSMC, Stargate, and AI Megaclusters

And if you ask more people around industry, especially with post-training, it's a very doable task to make the model say who it is or to suppress the OpenAI thing. So in some levels, it might be that DeepSeq didn't care that it was saying that it was by OpenAI.

Lex Fridman Podcast
#459 โ€“ DeepSeek, China, OpenAI, NVIDIA, xAI, TSMC, Stargate, and AI Megaclusters

And if you ask more people around industry, especially with post-training, it's a very doable task to make the model say who it is or to suppress the OpenAI thing. So in some levels, it might be that DeepSeq didn't care that it was saying that it was by OpenAI.

Lex Fridman Podcast
#459 โ€“ DeepSeek, China, OpenAI, NVIDIA, xAI, TSMC, Stargate, and AI Megaclusters

And if you ask more people around industry, especially with post-training, it's a very doable task to make the model say who it is or to suppress the OpenAI thing. So in some levels, it might be that DeepSeq didn't care that it was saying that it was by OpenAI.

Lex Fridman Podcast
#459 โ€“ DeepSeek, China, OpenAI, NVIDIA, xAI, TSMC, Stargate, and AI Megaclusters

If you're going to upload model weights, it doesn't really matter because anyone that's serving it in an application and cares a lot about serving is going to, when serving it, if they're using it for a specific task, they're going to tailor it to that. And it doesn't matter that it's saying it's ChatGPT.