Poster

BatchLLM: Optimizing Large Batched LLM Inference with Global Prefix Sharing and Throughput-oriented Token Batching

Zhen Zheng · Xin Ji · Taosong Fang · Fanghao Zhou · Chuanjie Liu

Abstract
