
AI Podcast

A Hybrid Tensor-Expert-Data Parallelism Approach to Optimize Mixture-of-Experts Training

04 Jan 2025

Description

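An in-depth look at DeepSpeed-TED, a novel three-dimensional hybrid parallel framework for training Mixture-of-Experts models with large base models. We discuss memory optimizations, communication optimizations, and performance comparisons with existing methods.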
