Skip to yearly menu bar Skip to main content


Poster

FastTree: Optimizing Attention Kernel and Runtime for Tree-Structured LLM Inference

Zaifeng Pan · Yitong Ding · Yue Guan · Zheng Wang · Zhongkai Yu · Xulong Tang · Yida Wang · Yufei Ding

Abstract

Video

Chat is not available.