

Poster

SiDA: Sparsity-Inspired Data-Aware Serving for Efficient and Scalable Large Mixture-of-Experts Models

Zhixu Du ⋅ Shiyu Li ⋅ Yuhao Wu ⋅ Xiangyu Jiang ⋅ Jingwei Sun ⋅ Qilin Zheng ⋅ Yongkai Wu ⋅ Ang Li ⋅ Hai Li ⋅ Yiran Chen
2024 Poster

