

Poster 8

MTraining: Distributed Dynamic Sparse Attention for Efficient Ultra-Long Context Training

Wenxuan Li ⋅ Chengruidong Zhang ⋅ Huiqiang Jiang ⋅ Yucheng Li ⋅ Yuqing Yang ⋅ Lili Qiu

Abstract
