Poster

Lancet: Accelerating Mixture-of-Experts Training by Overlapping Weight Gradient Computation and All-to-All Communication

Chenyu Jiang · Ye Tian · Zhen Jia · Chuan Wu · Yida Wang · Shuai Zheng
2024 Poster