Poster

LServe: Efficient Long-sequence LLM Serving with Unified Sparse Attention

Shang Yang · Junxian Guo · Haotian Tang · Qinghao Hu · Guangxuan Xiao · Jiaming Tang · Yujun Lin · Zhijian Liu · Yao Lu · Song Han

Abstract
