Jonathan Ross

👤 Person
408 total appearances

Podcast Appearances

The Twenty Minute VC (20VC): Venture Capital | Startup Funding | The Pitch
20VC: Deepseek Special: Is Deepseek a Weapon of the CCP | How Should OpenAI and the US Government Respond | Why $500BN for Stargate is Not Enough | The Future of Inference, NVIDIA and Foundation Models with Jonathan Ross @ Groq

Today, there's this wonderful business selling mainframes with a pretty juicy margin because no one seems to want to enter that business. Training is a niche market with very high margins. And when I say niche, it's still going to be worth hundreds of billions a year. But inference is the larger market. And...

The Twenty Minute VC (20VC): Venture Capital | Startup Funding | The Pitch
20VC: Deepseek Special: Is Deepseek a Weapon of the CCP | How Should OpenAI and the US Government Respond | Why $500BN for Stargate is Not Enough | The Future of Inference, NVIDIA and Foundation Models with Jonathan Ross @ Groq

I don't know that NVIDIA will ever see it this way, but I do think that those of us focusing on inference and building stuff specifically for that are probably the best thing that's ever happened for NVIDIA stock because we'll take on the low margin, high volume inference so that NVIDIA can keep its margins nice and high.

The Twenty Minute VC (20VC): Venture Capital | Startup Funding | The Pitch
20VC: Deepseek Special: Is Deepseek a Weapon of the CCP | How Should OpenAI and the US Government Respond | Why $500BN for Stargate is Not Enough | The Future of Inference, NVIDIA and Foundation Models with Jonathan Ross @ Groq

No. We raised some money in late 2024, and in that fundraise we still had to explain to people why inference was going to be a larger business than training. Remember, this was our thesis when we started eight years ago. So for me, I struggle to see why people think that training is going to be bigger. It just doesn't make sense.

The Twenty Minute VC (20VC): Venture Capital | Startup Funding | The Pitch
20VC: Deepseek Special: Is Deepseek a Weapon of the CCP | How Should OpenAI and the US Government Respond | Why $500BN for Stargate is Not Enough | The Future of Inference, NVIDIA and Foundation Models with Jonathan Ross @ Groq

Training is where you create the model. Inference is where you use the model. You want to become a heart surgeon, you spend years training, and then you spend more years practicing. Practicing is inference.
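
To make that distinction concrete, here is a minimal PyTorch-style sketch (a toy model and toy data, purely illustrative and not specific to Groq or any real deployment): the training loop is run once, up front, to produce the weights; inference then just runs the forward pass, over and over, for every request.

```python
import torch
import torch.nn as nn

model = nn.Linear(16, 2)                         # stand-in for a real network
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
loss_fn = nn.CrossEntropyLoss()

# --- Training: compute a loss and update the weights (done once, up front) ---
x = torch.randn(32, 16)                          # toy inputs
y = torch.randint(0, 2, (32,))                   # toy labels
for _ in range(100):
    optimizer.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()                              # gradients: the costly part of training
    optimizer.step()

# --- Inference: just run the trained model for each incoming request ---
with torch.no_grad():                            # no gradients, no weight updates
    prediction = model(torch.randn(1, 16)).argmax(dim=-1)
print(prediction.item())
```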

The Twenty Minute VC (20VC): Venture Capital | Startup Funding | The Pitch
20VC: Deepseek Special: Is Deepseek a Weapon of the CCP | How Should OpenAI and the US Government Respond | Why $500BN for Stargate is Not Enough | The Future of Inference, NVIDIA and Foundation Models with Jonathan Ross @ Groq

What you're going to see is everyone else starting to use this MoE approach. Now, there's another thing that happens here.

The Twenty Minute VC (20VC): Venture Capital | Startup Funding | The Pitch
20VC: Deepseek Special: Is Deepseek a Weapon of the CCP | How Should OpenAI and the US Government Respond | Why $500BN for Stargate is Not Enough | The Future of Inference, NVIDIA and Foundation Models with Jonathan Ross @ Groq

Yeah, so MoE stands for mixture of experts. When you use Llama 70 billion, you actually use every single parameter in that model. When you use Mixtral 8x7B, you use two of the roughly 8B experts, so it's much smaller. And effectively, while it doesn't correlate exactly, it correlates very closely: the number of parameters you use effectively tells you how much compute you're performing.
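
A rough sketch of that arithmetic, using the nameplate figures implied by the quote (8 experts of about 7B each for "Mixtral 8x7B", 2 routed per token); the real model also shares attention layers across experts, so exact counts differ:

```python
# Per-token compute roughly tracks the parameters you activate, not total size.
# Figures are rough "nameplate" numbers, not exact model specs.

def moe_params(num_experts: int, params_per_expert_b: float,
               experts_per_token: int) -> tuple[float, float]:
    """Return (total, active-per-token) expert parameters, in billions."""
    total = num_experts * params_per_expert_b
    active = experts_per_token * params_per_expert_b
    return total, active

# Dense model: every parameter participates in every token.
dense_total_b = dense_active_b = 70.0            # "Llama 70 billion"

# MoE model: only the routed experts participate for a given token.
mix_total_b, mix_active_b = moe_params(num_experts=8, params_per_expert_b=7.0,
                                       experts_per_token=2)

print(f"Llama 70B    : {dense_active_b:.0f}B active / {dense_total_b:.0f}B total")
print(f"Mixtral 8x7B : {mix_active_b:.0f}B active / ~{mix_total_b:.0f}B total")
# If compute per token ~ active parameters, the MoE model does roughly
# 14/70, i.e. about a fifth, of the dense 70B model's work per token.
```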

The Twenty Minute VC (20VC): Venture Capital | Startup Funding | The Pitch
20VC: Deepseek Special: Is Deepseek a Weapon of the CCP | How Should OpenAI and the US Government Respond | Why $500BN for Stargate is Not Enough | The Future of Inference, NVIDIA and Foundation Models with Jonathan Ross @ Groq

Now, let's take the R1 model. I believe it's about 671 billion parameters, versus 70 billion for Llama. And there's a 405 billion dense model as well, right? But let's focus on 70 versus 671. I believe there are 256 experts, each of which is somewhere around 2 billion parameters.
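
The same back-of-the-envelope applied to the figures quoted here. One number is not in the quote: how many experts fire per token. Eight routed experts (with roughly 37B parameters active in total once shared layers are counted) is the publicly reported DeepSeek figure, and it is used below only as an assumption to make the comparison concrete.

```python
# Back-of-the-envelope with the quoted figures: 671B total, 256 experts.
r1_total_b = 671.0
num_experts = 256
params_per_expert_b = r1_total_b / num_experts   # ~2.6B, close to the "around 2B" quoted

# Assumption (not in the quote): 8 routed experts fire per token, per DeepSeek's
# reported configuration. Shared layers and attention add more, so the reported
# active count (~37B) is higher than this routed-experts-only estimate.
experts_per_token = 8
r1_routed_active_b = experts_per_token * params_per_expert_b

llama_dense_b = 70.0
print(f"R1 (MoE)  : ~{r1_routed_active_b:.0f}B of {r1_total_b:.0f}B parameters per token (routed experts only)")
print(f"Llama 70B : {llama_dense_b:.0f}B of {llama_dense_b:.0f}B parameters per token")
# Roughly 10x the parameters on paper, yet the parameters touched per token are
# in the same ballpark as, or below, the dense 70B model's.
```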
