📉 Why are users walking away from one of the cheapest and smartest AI models out there? It's not a bug, it's a strategy.

Just 150 days ago, DeepSeek R1 made waves. It matched OpenAI-level reasoning and launched with jaw-droppingly low pricing: just $0.55 per million input tokens and $2.19 per million output tokens, undercutting the market leader by over 90%. OpenAI had to slash the price of its flagship reasoning model by 80% in response. It looked like DeepSeek had won.

🤯 But then something strange happened: while usage of DeepSeek's models exploded on third-party platforms like OpenRouter (a 20x increase!), traffic to DeepSeek's own apps and APIs declined. Why are people avoiding the original, cheapest option?

This episode dives deep into the hidden dynamics of AI economics, what we call "tokconomics". It's not just about the price per million tokens. It's about the trade-offs model providers make between:

⚙️ Latency (time to first token)
⚙️ Interactivity (tokens per second)
⚙️ Context window (the model's memory span)

💡 In this episode, you'll learn:
— Why DeepSeek intentionally chose slower performance despite powerful models
— How batching saves compute but worsens user experience
— Why Anthropic (Claude) faces similar compute constraints, and how it's solving them
— What "intelligence per token" means, and how Claude delivers better answers in fewer words
— How apps like Cursor, Replit, and Perplexity are built on token-based economics
— Why tokens are becoming the new currency of AI infrastructure

🎯 If you're building with AI, investing in the space, or just trying to understand what's under the hood, this episode is for you.

🤔 Do you notice how fast or verbose your favorite AI is? Ever compared models side by side?
Let us know in the comments! 👇 Hit play now to decode the new economics of the AI future.

Key Takeaways:
— DeepSeek R1 broke new ground in pricing but sacrificed UX with high latency
— Users are flocking to third-party hosts that serve the same model with better performance
— AI companies make strategic trade-offs between revenue, speed, and long-term AGI goals
— "Intelligence per token" is emerging as a new north star for model performance

SEO Tags:
Niche: #tokconomics, #DeepSeekR1, #AGIstrategy, #AIlatency
Popular: #artificialintelligence, #GPT, #Anthropic, #Claude, #OpenAI
Long-tail: #whyDeepSeekislosingusers, #AIhighlatencyissues, #choosingthebestAImodel
Trending: #tokens, #AIeconomics, #AGIrace

Read more: https://semianalysis.com/2025/07/03/deepseek-debrief-128-days-later/
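To make the "undercut by over 90%" claim concrete, here is a minimal back-of-the-envelope sketch of per-request token pricing. The DeepSeek R1 prices come from the description above; the competitor prices are illustrative assumptions, not quotes from any provider.

```python
# "Tokconomics" back-of-the-envelope: what one request costs at
# per-million-token prices. Competitor prices below are assumed
# for illustration only.

def request_cost(input_tokens, output_tokens, price_in, price_out):
    """Cost in dollars; price_in/price_out are per million tokens."""
    return (input_tokens * price_in + output_tokens * price_out) / 1_000_000

# A typical request: 10k input tokens, 2k output tokens.
deepseek = request_cost(10_000, 2_000, price_in=0.55, price_out=2.19)
competitor = request_cost(10_000, 2_000, price_in=10.0, price_out=40.0)  # assumed

print(f"DeepSeek R1: ${deepseek:.4f} per request")
print(f"Competitor:  ${competitor:.4f} per request")
print(f"Undercut:    {1 - deepseek / competitor:.0%}")
```

Under these assumed competitor prices the gap works out to roughly 94%, which matches the "over 90%" framing; the exact figure depends entirely on which rival model and tier you compare against.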