Oral Thu, May 21, 2026 • 9:45 AM – 10:00 AM PDT

BatchLLM: Optimizing Large Batched LLM Inference with Global Prefix Sharing and Throughput-oriented Token Batching

Zhen Zheng ⋅ Xin Ji ⋅ Taosong Fang ⋅ Fanghao Zhou ⋅ Chuanjie Liu ⋅ Gang Peng

Abstract
