On the Diminishing Returns of Expert Load Balancing in MoE LLM Serving
Hanfei Yu ⋅ Jinru Duan ⋅ Jiabin Luo ⋅ Hao Wang