Jonathan Ross
Podcast Appearances
And then it picks some small number, I'm forgetting which, maybe it's like eight of those or 16 of them, whatever it is. And so it only needs to do the compute for that. That means that you're getting to skip most of it, right? Sort of like your brain, like not every neuron in your brain fires when I say something to you about the stock market, right?
Like the neurons about, you know, playing football, right? Those don't kick off, right? That's the intuition there. Previously, it was famously reported that OpenAI's GPT-4 started off with something like 16 experts and they got it down to eight. I forget the numbers, but it started off larger and they shrunk it a little and they were smaller or whatever.
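To make the routing idea concrete, here is a minimal sketch of top-k mixture-of-experts routing in Python. The expert count, the top-k value, and the sizes are illustrative assumptions, not the actual GPT-4 or DeepSeek configurations.

```python
# Toy sketch of mixture-of-experts routing (illustrative only; the expert
# counts below are assumptions, not real model configs).
import numpy as np

rng = np.random.default_rng(0)

num_experts = 16   # total experts in the layer (assumed)
top_k = 2          # experts actually run per token (assumed)
d_model = 8        # toy hidden size

# Each "expert" is just a small weight matrix here.
experts = [rng.standard_normal((d_model, d_model)) for _ in range(num_experts)]
router = rng.standard_normal((d_model, num_experts))  # gating weights

def moe_forward(x):
    """Route a single token vector x through only its top-k experts."""
    logits = x @ router                      # score every expert
    chosen = np.argsort(logits)[-top_k:]     # keep the k highest-scoring experts
    weights = np.exp(logits[chosen])
    weights /= weights.sum()                 # softmax over the chosen experts
    # Only top_k of num_experts matrices are multiplied; the rest are skipped,
    # which is where the compute savings come from.
    return sum(w * (x @ experts[i]) for w, i in zip(weights, chosen))

token = rng.standard_normal(d_model)
out = moe_forward(token)
print(out.shape)  # (8,)
```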
And then what's happened with the DeepSeek model is they've gone the opposite way. They've gone to a very large number of experts. The more parameters you have, it's like having more neurons. It's easier to retain the information that comes in. And so by having more parameters, they're able to, on a smaller amount of data, get good.
However, because it's sparse, because it's a mixture of experts, they're not doing as much computation. And part of the cleverness was figuring out how they could have so many experts so it could be so sparse so they could skip so many of the parameters.
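A rough back-of-the-envelope sketch of the "many experts, few activated" point: total parameters grow with the number of experts, while per-token compute tracks only the activated ones. The numbers below are assumptions for illustration, not DeepSeek's real figures.

```python
# Why "more experts, fewer activated" keeps compute low while total
# parameters grow. All numbers are illustrative assumptions.

def moe_params(num_experts, active_experts, params_per_expert, shared_params):
    total = shared_params + num_experts * params_per_expert
    active = shared_params + active_experts * params_per_expert
    return total, active

# Coarse MoE: a few large experts, a couple used per token.
total_a, active_a = moe_params(num_experts=8, active_experts=2,
                               params_per_expert=10e9, shared_params=20e9)

# Fine-grained MoE: many small experts, only a handful routed per token.
total_b, active_b = moe_params(num_experts=256, active_experts=8,
                               params_per_expert=1e9, shared_params=20e9)

print(f"A: {total_a/1e9:.0f}B total, {active_a/1e9:.0f}B active per token")
print(f"B: {total_b/1e9:.0f}B total, {active_b/1e9:.0f}B active per token")
# B holds far more parameters (capacity to "retain" information) while the
# per-token compute, which tracks active parameters, stays modest.
```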
Their new 70B outperformed their 405. What was surprising to me, I thought they retrained it from scratch. It turns out, you read the paper and they talk about how they just fine-tuned. So they used a relatively small amount of data to make it much better. Again, this goes to the quality of the data. They have higher-quality data. They took their old model, trained it, and it got much better.
But that 70B, that new 70B outperforms their previous 405B. What you're going to see now is, now that everyone has seen this DeepSeek architecture, they're going to go, great, I have hundreds of thousands of GPUs. I'm now going to use a lot of them to create a lot of synthetic data. And then I'm going to train the bejesus out of this model.
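Structurally, the recipe being described is: use a large model (and lots of GPUs) to generate synthetic data, then fine-tune an existing smaller checkpoint on it. The sketch below is only an outline of that loop; `big_model`, `small_model`, and their methods are hypothetical placeholders, not a real API.

```python
# Outline of the "generate synthetic data, then fine-tune" loop described
# above. big_model, small_model, and their methods are hypothetical
# placeholders, not a real library API.

def synthesize_dataset(big_model, prompts):
    """Use a large (expensive) model to produce high-quality training pairs."""
    return [(p, big_model.generate(p)) for p in prompts]

def fine_tune(small_model, dataset, epochs=1):
    """Fine-tune an existing checkpoint on the pairs (no training from scratch)."""
    for _ in range(epochs):
        for prompt, target in dataset:
            small_model.training_step(prompt, target)
    return small_model

# The idea from the transcript: a relatively small amount of high-quality
# (here, synthetic) data applied to an existing checkpoint can lift a
# 70B-class model past an older, much larger one.
# synthetic = synthesize_dataset(big_model, prompts)
# improved_70b = fine_tune(existing_70b, synthetic)
```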
Because the other thing is, while it sort of asymptotes, the question is, on this curve, where do you stop? It depends on how many people you have doing inference. You can either make the model bigger, which makes it more expensive to run, and then you train it on less. Or you make it smaller, and it's cheaper to run, but you have to train it more. So DeepSeek didn't have a lot of users until recently.
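The train-more-versus-run-cheaper trade-off can be put into a tiny cost model: total cost is training cost plus per-query cost times query volume, so the break-even point shifts with how much inference you serve. The dollar figures below are made-up assumptions purely for illustration.

```python
# Back-of-the-envelope sketch of the trade-off: a bigger model can reach a
# given quality with less training but costs more per query; a smaller model
# costs more to train but is cheaper to serve. All numbers are made up.

def total_cost(train_cost, cost_per_query, num_queries):
    return train_cost + cost_per_query * num_queries

options = {
    # name: (training cost in $, inference cost in $/query)  -- assumptions
    "bigger model, trained less": (20e6, 0.004),
    "smaller model, trained more": (60e6, 0.001),
}

for num_queries in (1e9, 100e9):
    print(f"\nqueries served: {num_queries:.0e}")
    for name, (train, per_q) in options.items():
        print(f"  {name}: ${total_cost(train, per_q, num_queries)/1e6:,.0f}M")

# With few users, the cheaper-to-train (bigger) model wins; at large inference
# volume, the extra training spent on a smaller, cheaper-to-run model pays off.
```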