This reviews a document dated January 27, 2025, in which Daniel and Michael of Unsloth detail their work quantizing DeepSeek-R1's 671B-parameter model, shrinking it by 80% to 131GB while keeping it functional. They achieve this dynamic quantization by selectively applying higher bit rates to crucial layers and lower bit rates to the less sensitive MoE layers, in contrast with naive quantization methods that render the model unusable. The post explains how to run the quantized versions, covering hardware requirements, performance benchmarks, and chat-template considerations, and it offers a guide to local execution on various systems, including specific instructions for GPU and Apple devices and for use with Ollama and Open WebUI.

Source: https://unsloth.ai/blog/deepseekr1-dynamic
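The headline numbers can be sanity-checked with back-of-the-envelope arithmetic. A minimal sketch, assuming (not stated in this summary) that the unquantized DeepSeek-R1 checkpoint is stored in FP8, i.e. roughly one byte per parameter:

```python
# Sanity-check the headline figures from the Unsloth post.
# Assumption (not in the summary): the original checkpoint is FP8,
# so it occupies roughly 1 byte per parameter.

params = 671e9           # 671B parameters
fp8_bytes = params * 1   # ~671 GB at 8 bits per weight
quant_bytes = 131e9      # the 131GB dynamic quant

reduction = 1 - quant_bytes / fp8_bytes
avg_bits = quant_bytes * 8 / params

print(f"size reduction: {reduction:.0%}")          # ~80%, matching the claim
print(f"average bits per weight: {avg_bits:.2f}")  # ~1.56 bits
```

The ~1.56 average bits per weight is consistent with a scheme that keeps a few crucial layers at higher precision while pushing the bulk of the MoE weights down to very low bit widths.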