AI Podcast

AI Radio FM - A Deep Dive into the Muon Optimizer

23 Feb 2025

Description

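This episode takes an in-depth look at applying the Muon optimizer to large-scale language model training. The Moonshot AI team describes how adding weight decay and adjusting the per-parameter update scale allowed them to scale Muon to training Moonlight, a 3B/16B-parameter MoE model. Experiments show that Muon is roughly 2x more computationally efficient than AdamW. The episode also covers Muon's distributed implementation and its performance in the pre-training and supervised fine-tuning stages.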
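
For listeners who want a concrete picture of the two modifications mentioned above (weight decay and update-scale adjustment), here is a minimal NumPy sketch of a single Muon-style step for one 2-D weight matrix. It is an illustration under stated assumptions, not the Moonshot AI implementation: the function names `newton_schulz` and `muon_step` are hypothetical, the Newton-Schulz coefficients are those commonly seen in public Muon reference code, the `0.2 * sqrt(max(rows, cols))` scale factor is assumed as the update-scale adjustment, and the hyperparameter defaults are placeholders.

```python
import numpy as np

def newton_schulz(G, steps=5, eps=1e-7):
    """Approximately orthogonalize G with a quintic Newton-Schulz iteration."""
    # Assumption: coefficients as commonly used in public Muon reference code.
    a, b, c = 3.4445, -4.7750, 2.0315
    # Frobenius normalization keeps the spectral norm at or below 1.
    X = G / (np.linalg.norm(G) + eps)
    transposed = X.shape[0] > X.shape[1]
    if transposed:
        X = X.T  # iterate on the wide orientation so X @ X.T is the smaller product
    for _ in range(steps):
        A = X @ X.T
        B = b * A + c * (A @ A)
        X = a * X + B @ X
    return X.T if transposed else X

def muon_step(W, grad, momentum_buf, lr=0.02, mu=0.95, weight_decay=0.1):
    """One hypothetical Muon-style update for a 2-D weight matrix W."""
    # Momentum accumulation with a Nesterov-style lookahead.
    momentum_buf = mu * momentum_buf + grad
    update = newton_schulz(mu * momentum_buf + grad)
    # Assumed update-scale adjustment: scale by matrix shape so the update
    # magnitude is comparable across differently shaped parameters.
    scale = 0.2 * np.sqrt(max(W.shape))
    # Decoupled weight decay, applied alongside the orthogonalized update.
    W = W - lr * (scale * update + weight_decay * W)
    return W, momentum_buf

rng = np.random.default_rng(0)
W = rng.standard_normal((4, 8))
grad = rng.standard_normal((4, 8))
W_new, buf = muon_step(W, grad, np.zeros_like(W))
```

In this sketch the weight decay is decoupled from the gradient, in the same spirit as AdamW, and only matrix-shaped parameters are handled; how embeddings, biases, or distributed sharding are treated is outside its scope.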
