Jonathan Ross

Speaker
442 total appearances

Podcast Appearances

The Twenty Minute VC (20VC): Venture Capital | Startup Funding | The Pitch
20VC: Deepseek Special: Is Deepseek a Weapon of the CCP | How Should OpenAI and the US Government Respond | Why $500BN for Stargate is Not Enough | The Future of Inference, NVIDIA and Foundation Models with Jonathan Ross @ Groq

No problem. But before we start, can I just say one thing? I think you have the most amazing, unique go-to-market that I've ever seen in my life for a podcast. I've never seen this before. I think your strategy is you're literally interviewing every single audience member, forcing them to watch videos and get addicted to you.

Well, my background, so I started the Google TPU, the AI chip that Google uses, and in 2016 started an AI chip startup called Groq with a Q, not with a K, that builds AI accelerator chips, which we call LPUs.

Yes, it's Sputnik. It is Sputnik 2.0. Even more so, you know that story about how NASA spent a million dollars designing a pen that could write in space and the Russians brought a pencil. That just happened again. So it's a huge deal. Why is it such a huge deal? So up until recently, the Chinese models have been behind sort of Western models.

And I say Western, including like Mistral as well and some other companies. And it was largely focused on how much compute you could get. Most people actually don't realize this. Most companies have access to roughly the same amount of data. They buy them from the same data providers and then just churn through that data with a GPU and they produce a model and then they deploy it.

And they'll have some of their own data and that'll make them subtly better at one thing or another. But they're largely all the same. More GPUs, the better the model because you can train on more tokens. It's the scaling law. This model was supposedly trained on a smaller number of GPUs and a much, much tighter budget.

I think the way that it's been put is less than the salary of many of the executives at Meta, and that's not true. There's an element of marketing involved in the DeepSeek release. It is true that they trained the model on approximately $6 million for the GPUs, right? They claim 2,000 GPUs for, I think it was 60 days, which by the way, also don't forget, was about the same amount of GPU time, 4,000 GPUs for 30 days, as the original, I believe, Llama 70. Now more recently, Meta has been training on more GPUs, but Meta hasn't been using as much good data as DeepSeek because DeepSeek was doing reinforcement learning using OpenAI.
