Dylan Patel

Speaker
3551 total appearances

Podcast Appearances

Lex Fridman Podcast
#459 – DeepSeek, China, OpenAI, NVIDIA, xAI, TSMC, Stargate, and AI Megaclusters

They have GPT-4o. They have OpenAI o1. And there's a lot of types of models. So we're going to break down what each of them are. There's a lot of technical specifics on training and go from high level to specific and kind of go through each of them.

Lex Fridman Podcast
#459 – DeepSeek, China, OpenAI, NVIDIA, xAI, TSMC, Stargate, and AI Megaclusters

Yeah, so this discussion has been going on for a long time in AI. It became more important since ChatGPT or more focal since ChatGPT at the end of 2022. Open weights is the accepted term for when model weights of a language model are available on the internet for people to download. Those weights can have different licenses, which is effectively the terms by which you can use the model.

Lex Fridman Podcast
#459 – DeepSeek, China, OpenAI, NVIDIA, xAI, TSMC, Stargate, and AI Megaclusters

There are licenses that come from history and open source software. There are licenses that are designed by companies specifically. All of Llama, DeepSeek, Qwen, Mistral, these popular names in open weight models, have some of their own licenses. It's complicated because not all the same models have the same terms. The big debate is on what makes a model open weight. Why are we saying this term?

Lex Fridman Podcast
#459 – DeepSeek, China, OpenAI, NVIDIA, xAI, TSMC, Stargate, and AI Megaclusters

It's kind of a mouthful. It sounds close to open source, but it's not the same. There's still a lot of debate on the definition and soul of open source AI. Open source software has a rich history on freedom to modify, freedom to take on your own, freedom from any restrictions on how you would use the software, and what that means for AI is still being defined.

Lex Fridman Podcast
#459 – DeepSeek, China, OpenAI, NVIDIA, xAI, TSMC, Stargate, and AI Megaclusters

So for what I do, I work at the Allen Institute for AI. We're a nonprofit. We want to make AI open for everybody. And we try to lead on what we think is truly open source. There's not full agreement in the community. But for us, that means releasing the training data, releasing the training code, and then also having open weights like this. And we'll get into the details of the models and

Lex Fridman Podcast
#459 – DeepSeek, China, OpenAI, NVIDIA, xAI, TSMC, Stargate, and AI Megaclusters

Again and again, as we try to get deeper into how the models were trained, we will say things like the data processing, data filtering, data quality is the number one determinant of the model quality. And then a lot of the training code is the determinant on how long it takes to train and how fast your experimentation is.

Lex Fridman Podcast
#459 – DeepSeek, China, OpenAI, NVIDIA, xAI, TSMC, Stargate, and AI Megaclusters

So without fully open source models where you have access to this data, it is... hard to know, or it's harder to replicate. So we'll get into cost numbers for DeepSeek-V3 on mostly GPU hours and how much you could pay to rent those yourselves. But without the data, the replication cost is going to be far, far higher. And same goes for the code.
