Jonathan Ross
Podcast Appearances
So what you do is you train a model, you use it to generate data, and then you train a model and you use it to generate data and you keep getting better and better and better. So you can sort of beat the scaling law problem.
One quick hack to get past all of that in the stepping up is if there's a really good model already right here, just have it generate the data and you go right up to where it is. And that's what they did. It is true that they spent about six million or whatever it was on the training. They spent a lot more distilling or scraping the OpenAI model.
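As a rough illustration of "have the good model generate the data", here is a toy sketch of distillation. Everything in it is an assumption made for illustration: `teacher` is a hypothetical stand-in for an existing strong model (here just a fixed linear function), and the "student" is a least-squares line fit rather than a neural network. It only shows the shape of the idea, training on another model's outputs instead of on human-labelled data, and is not a description of what any lab actually ran.

```python
import random


def teacher(x: float) -> float:
    """Hypothetical stand-in for an existing strong model."""
    return 3.0 * x + 2.0


def generate_dataset(n: int, seed: int = 0) -> list:
    """Query the teacher on random inputs and record its outputs as labels."""
    rng = random.Random(seed)
    xs = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    return [(x, teacher(x)) for x in xs]


def fit_student(data: list) -> tuple:
    """Fit a student y = w * x + b to the teacher-labelled data (least squares)."""
    n = len(data)
    mean_x = sum(x for x, _ in data) / n
    mean_y = sum(y for _, y in data) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in data)
    var = sum((x - mean_x) ** 2 for x, _ in data)
    w = cov / var
    b = mean_y - w * mean_x
    return w, b


# The student never sees "ground truth", only what the teacher produced.
data = generate_dataset(200)
w, b = fit_student(data)
print(f"student learned w={w:.3f}, b={b:.3f}")
```

In this toy setting the student recovers the teacher's behaviour almost exactly, which is the "go right up to where it is" effect described above; a student trained this way cannot be expected to exceed the teacher.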
Correct. And all that said, they did a lot of really innovative things. That's what makes it so complicated, because on the one hand, they kind of just scraped the OpenAI model. On the other hand, they came up with some unique reinforcement learning techniques that are so similar. What did they do that was so impressive?
No, they came up with innovative stuff. But actually, the best way to describe it: have you ever taken a test where you got an answer right, and your professor marked it wrong? And then you go back to the professor and you have to argue with them and everything. And it's a pain, right?
Well, if there is only one answer, and it's a very simple answer, and you say, write that answer in this box, then there is no arguing. You either get it right or not, right? So what they did was, rather than having human beings check the output and say yes or no or whatever, what they did was they said, here's the box. There's literally some code to say here's a box.
I'll put the answer here and then check it. And if it's correct, we have the answer. If not, we don't. No need to involve a human. Completely automated. Can OpenAI not just do distillation on DeepSeek's model then? They don't need to because they're actually better still. They're a little bit better. They could, but why would they?
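The "write the answer in this box and check it" idea described above can be sketched as a small automated checker. This is a minimal sketch under stated assumptions, not the actual training code: the `\boxed{...}` answer format, the function names `extract_boxed` and `reward`, and the exact-string comparison are all hypothetical choices made for illustration. The point is only that a program, not a human grader, decides whether the answer counts.

```python
import re
from typing import Optional

# Assumed answer format: the model is told to put its final answer in \boxed{...}.
BOX_PATTERN = re.compile(r"\\boxed\{([^{}]*)\}")


def extract_boxed(output: str) -> Optional[str]:
    """Return the contents of the last \\boxed{...} in the output, or None."""
    matches = BOX_PATTERN.findall(output)
    return matches[-1].strip() if matches else None


def reward(output: str, expected: str) -> float:
    """Return 1.0 if the boxed answer matches the expected answer exactly, else 0.0."""
    answer = extract_boxed(output)
    if answer is None:
        return 0.0  # no box, no credit: nothing to argue about
    return 1.0 if answer == expected.strip() else 0.0


print(reward(r"6 times 7 is \boxed{42}", "42"))  # correct answer in the box
print(reward("I think the answer is 42", "42"))  # right number, but no box
print(reward(r"So the result is \boxed{41}", "42"))  # boxed but wrong
```

A score like this can be used as the reward signal in reinforcement learning. Exact matching only works for questions with a single, simple answer, which is the condition given in the explanation above.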
Or is that questionable doubt there? I don't think you have to disbelieve it because of the quality delta. However, why would they try and smuggle in GPUs when all they'd have to do is log into any cloud provider and rent GPUs? This is like the biggest gaping hole in the whole way that export control is done.