Poster 12

Efficient Long-Context Language Model Training by Core Attention Disaggregation

Yonghao Zhuang ⋅ Junda Chen ⋅ Bo Pang ⋅ Yi Gu ⋅ Yibo Zhu ⋅ Yimin Jiang ⋅ Ion Stoica ⋅ Hao Zhang ⋅ Eric Xing

Abstract
