
AI Podcast

CCQ: The Compression Giant - A Two-Bit Revolution for Large Language Models

14 Jul 2025

Description

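This episode takes an in-depth look at a breakthrough technique called CCQ (Convolutional Code Quantization). As large language models (LLMs) face growing deployment costs and barriers, CCQ proposes an innovative scheme for extremely low-bit quantization. We discuss how CCQ combines convolutional codes, hybrid encoding, and code clusters to compress models to between 2.0 and 2.75 bits with almost no loss in model accuracy. We also explore how its distinctive lookup-table-free, bit-shift decoding design resolves the inference-speed bottleneck of traditional vector quantization and makes it possible to deploy a very large model (such as ERNIE 4.5, 文心4.5) on a single GPU. Tune in to learn about a technique that could reshape how large models are deployed.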
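
To give a rough sense of what 2.0 to 2.75 bits means for deployment, here is a minimal back-of-the-envelope sketch. It assumes the figures are average bits per weight, counts weight storage only (no codebooks, scales, activations, or KV cache), and uses a hypothetical parameter count chosen purely for illustration; it is not the size of any model named in the episode, and the function name is made up for this example.

```python
def weight_memory_gib(num_params: float, bits_per_weight: float) -> float:
    """Approximate weight storage in GiB.

    Counts only the quantized weights themselves; any per-group scales,
    codebooks, activations, and KV cache are ignored (an assumption).
    """
    total_bytes = num_params * bits_per_weight / 8
    return total_bytes / 2**30


# Hypothetical parameter count, for illustration only.
HYPOTHETICAL_PARAMS = 100e9

for bits in (16, 2.75, 2.0):
    size = weight_memory_gib(HYPOTHETICAL_PARAMS, bits)
    print(f"{bits:>5} bits/weight -> {size:.1f} GiB of weights")
```

Under these assumptions, going from 16 bits to 2 bits per weight shrinks weight storage by a factor of eight, which is the kind of reduction that can bring a model within the memory of a single GPU.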
