Poster

SOLA: Optimizing SLO Attainment for Large Language Model Serving with State-Aware Scheduling

Ke Hong ⋅ Xiuhong Li ⋅ Lufang Chen ⋅ Qiuli Mao ⋅ Guohao Dai ⋅ Xuefei Ning ⋅ Shengen Yan ⋅ Yun Liang ⋅ Yu Wang

Abstract
