Dylan Patel
Podcast Appearances
They have GPT-4o. They have OpenAI o1. And there are a lot of types of models. So we're going to break down what each of them is. There are a lot of technical specifics on training, so we'll go from high level to specific and go through each of them.
Yeah, so this discussion has been going on for a long time in AI. It became more important, or more focal, after ChatGPT at the end of 2022. Open weights is the accepted term for when the model weights of a language model are available on the internet for people to download. Those weights can have different licenses, which are effectively the terms by which you can use the model.
There are licenses that come from the history of open source software. There are licenses that are designed by companies specifically. All of Llama, DeepSeek, Qwen, Mistral, these popular names in open-weight models, have some of their own licenses. It's complicated because not all models have the same terms. The big debate is on what makes a model open weight. Why are we saying this term?
It's kind of a mouthful. It sounds close to open source, but it's not the same. There's still a lot of debate on the definition and soul of open source AI. Open source software has a rich history of freedom to modify, freedom to take on your own, freedom from any restrictions on how you would use the software, and what that means for AI is still being defined.
So for what I do, I work at the Allen Institute for AI. We're a nonprofit. We want to make AI open for everybody. And we try to lead on what we think is truly open source. There's not full agreement in the community. But for us, that means releasing the training data, releasing the training code, and then also having open weights like this. And we'll get into the details of the models.
Again and again, as we try to get deeper into how the models were trained, we will say things like: data processing, data filtering, and data quality are the number one determinant of model quality. And then a lot of the training code determines how long it takes to train and how fast your experimentation is.
So without fully open source models where you have access to this data, it's harder to know, or harder to replicate. So we'll get into cost numbers for DeepSeek-V3, mostly in GPU hours, and how much you could pay to rent those yourself. But without the data, the replication cost is going to be far, far higher. And the same goes for the code.
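The GPU-hour framing reduces to simple arithmetic: rental cost is roughly GPU hours times the hourly rental price. Here is a minimal Python sketch of that calculation. The function name `rental_cost_usd` is hypothetical, and the example inputs (about 2.788 million H800 GPU hours at an assumed $2 per GPU hour) come from DeepSeek's own V3 technical report rather than from this conversation, so treat them as illustrative assumptions. That figure covers only the final training run, not data work, earlier experiments, or staff.

```python
def rental_cost_usd(gpu_hours: float, usd_per_gpu_hour: float) -> float:
    """Estimate what it would cost to rent GPUs for a training run."""
    if gpu_hours < 0 or usd_per_gpu_hour < 0:
        raise ValueError("gpu_hours and usd_per_gpu_hour must be non-negative")
    return gpu_hours * usd_per_gpu_hour


# Assumed inputs: figures as stated in the DeepSeek-V3 technical report,
# not numbers given in this transcript.
REPORTED_GPU_HOURS = 2_788_000   # H800 GPU hours for the final training run
ASSUMED_USD_PER_GPU_HOUR = 2.0   # assumed rental price per GPU hour

if __name__ == "__main__":
    cost = rental_cost_usd(REPORTED_GPU_HOURS, ASSUMED_USD_PER_GPU_HOUR)
    print(f"Estimated rental cost: ${cost:,.0f}")  # Estimated rental cost: $5,576,000
```

Swapping in a different hourly rate shows how sensitive the headline number is to the assumed rental price. It still says nothing about the cost of rebuilding the data and code, which is the point being made here.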