Shortcut-connected Expert Parallelism for Accelerating Mixture of Experts
Weilin Cai ⋅ Le Qin ⋅ Junwei Cui ⋅ Jiayi Huang