AI: post transformers

DeepSeek-R1 Dynamic 1.58-bit Quantization: A Performance Analysis

08 Aug 2025

Description

This episode reviews a document dated January 27, 2025, from Daniel and Michael at Unsloth that details their work on quantizing DeepSeek-R1, a 671B-parameter model, reducing its size by 80% to 131GB while keeping it functional. They achieved this with dynamic quantization: higher bit widths are applied selectively to crucial layers and lower bit widths to the less sensitive MoE layers, in contrast with naive quantization, which applies the same low bit width everywhere and renders the model unusable. The document explains how to run the quantized versions, covering hardware requirements, performance benchmarks, and chat template considerations. It also offers a guide for local execution on various systems, with specific instructions for GPU and Apple devices, and outlines the use of Ollama and Open WebUI.

Source: https://unsloth.ai/blog/deepseekr1-dynamic
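The figures in the description can be sanity-checked with a little arithmetic. The sketch below is illustrative only: it assumes decimal gigabytes (1 GB = 10^9 bytes), treats the 80% and 131GB figures as exact, and the function names are hypothetical helpers, not part of any Unsloth tooling. The last function shows why mixing bit widths raises the average only slightly when most weights sit in the low-bit layers; the fraction and bit widths passed to it are placeholders, not values from the source.

```python
GB = 1e9  # assumption: decimal gigabytes, 1 GB = 10**9 bytes


def average_bits_per_weight(size_gb: float, n_params: float) -> float:
    """Average storage cost per parameter, in bits, for a model file."""
    return size_gb * GB * 8 / n_params


def implied_original_size_gb(quantized_gb: float, reduction: float) -> float:
    """Size before quantization implied by a fractional size reduction."""
    return quantized_gb / (1.0 - reduction)


def mixed_average_bits(fraction_low: float, bits_low: float, bits_high: float) -> float:
    """Weighted average bit width when a fraction of the weights uses
    bits_low and the remainder uses bits_high."""
    return fraction_low * bits_low + (1.0 - fraction_low) * bits_high


if __name__ == "__main__":
    # Figures taken from the description: 131 GB, 671B parameters, 80% smaller.
    print(round(average_bits_per_weight(131, 671e9), 2))
    print(round(implied_original_size_gb(131, 0.80)))
    # Hypothetical split: 90% of weights at 1.58 bits, 10% at 4 bits.
    print(mixed_average_bits(0.90, 1.58, 4.0))
```

Under these assumptions, 131GB spread over 671B parameters works out to about 1.56 bits per weight on average, and an 80% reduction to 131GB implies a starting size of roughly 655GB.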
