Skip to yearly menu bar Skip to main content


Poster

SampleAttention: Near-Lossless Acceleration of Long Context LLM Inference with Adaptive Structured Sparse Attention

Qianchao Zhu · Jiangfei Duan · Chang Chen · Siran Liu · Xiuhong Li · Guanyu Feng · Xin Lv · Xiao Chuanfu · Dahua Lin · Chao Yang

Abstract

Video

Chat is not available.