Poster

Lancet: Accelerating Mixture-of-Experts Training by Overlapping Weight Gradient Computation and All-to-All Communication

Chenyu Jiang ⋅ Ye Tian ⋅ Zhen Jia ⋅ Chuan Wu ⋅ Yida Wang ⋅ Shuai Zheng
2024 Poster
Abstract
