

Oral Thu, May 21, 2026 • 1:45 PM – 2:00 PM PDT

Optimizing Deployment Configurations for LLM Inference

Sungmin Cho ⋅ Jaewon Lee ⋅ Chunqiang Tang ⋅ Yejin Lee ⋅ Geonhwa Jeong ⋅ Scott Batura ⋅ Sijia Chen ⋅ Bradley Davis ⋅ Summer Deng ⋅ Emad El-Haraty ⋅ Lu Fang ⋅ Lu Fang ⋅ Joshua Fromm ⋅ Liangpeng Guo ⋅ Jianyu Huang ⋅ Aya Ibrahim ⋅ Hongyi Jia ⋅ Changkyu Kim ⋅ Xiaozhu Meng ⋅ Vlad Tiberiu Mihailescu ⋅ Maxim Naumov ⋅ Michal Ostrowski ⋅ Sarunya Pumma ⋅ Jeremy Francis Reizenstein ⋅ Rajasi Saha ⋅ Zhan Shu ⋅ Ruan Silva ⋅ Jon Swenson ⋅ Chris Thi ⋅ Yunfan Wang ⋅ Pengchao Wang ⋅ Wenchen Wang ⋅ Bram Wasti ⋅ Jingyi Yang ⋅ Jing Zhang ⋅ Yi Zhen

Abstract
