Oral

Efficient Long-Context Language Model Training by Core Attention Disaggregation

Yonghao Zhuang ⋅ Junda Chen ⋅ ⋅ Yi Gu ⋅ Yibo Zhu ⋅ Yimin Jiang ⋅ Ion Stoica ⋅ Hao Zhang ⋅ Eric Xing

Abstract
