

Oral Thu, May 21, 2026 • 1:15 PM – 1:30 PM PDT

BatchLLM: Optimizing Large Batched LLM Inference with Global Prefix Sharing and Throughput-oriented Token Batching

Zhen Zheng ⋅ Xin Ji ⋅ Taosong Fang ⋅ Fanghao Zhou ⋅ Chuanjie Liu ⋅ Gang Peng

Abstract
