Poster

MTraining: Distributed Dynamic Sparse Attention for Efficient Ultra-Long Context Training

Wenxuan Li · Chengruidong Zhang · Huiqiang Jiang · Yucheng Li · · Lili Qiu

Abstract
