Poster 18

Optimizing Deployment Configurations for LLM Inference

Sungmin Cho ⋅ Jaewon Lee ⋅ Chunqiang Tang ⋅ Yejin Lee ⋅ Geonhwa Jeong ⋅ Anca Agape ⋅ Scott Batura ⋅ Vincent Boivin ⋅ Stephen Chen ⋅ Renfei Chen ⋅ Sijia Chen ⋅ Yan Cui ⋅ Bradley Davis ⋅ Summer Deng ⋅ Nick Egebo ⋅ Emad El-Haraty ⋅ Sebastien Estienne ⋅ Lu Fang ⋅ Lu Fang ⋅ Joshua Fromm ⋅ Raj Ganapathy ⋅ Vedanuj Goswami ⋅ Liangpeng Guo ⋅ Ye Hu ⋅ Chenheli Hua ⋅ Jianyu Huang ⋅ Aya Ibrahim ⋅ Niranjan Jagannath ⋅ Hongyi Jia ⋅ Changkyu Kim ⋅ Shikai Li ⋅ Brandon Liu ⋅ Jiawen Liu ⋅ Ajit Mathews ⋅ Xiaozhu Meng ⋅ Vlad Tiberiu Mihailescu ⋅ Amit Nagpal ⋅ Maxim Naumov ⋅ Michal Ostrowski ⋅ Jialin Ouyang ⋅ Jason Park ⋅ Sarunya Pumma ⋅ Ye Qi ⋅ Zixi Qi ⋅ Jeremy Francis Reizenstein ⋅ Rajasi Saha ⋅ Nandhini Santhanam ⋅ Zhan Shu ⋅ Ruan Silva ⋅ Grigory Sizov ⋅ Jon Swenson ⋅ Brandon Taylor ⋅ Chris Thi ⋅ Adolfo Victoria ⋅ Yunfan Wang ⋅ Pengchao Wang ⋅ Wenchen Wang ⋅ Xiaodong Wang ⋅ Bram Wasti ⋅ Wei Xu ⋅ Qirui Yang ⋅ Jingyi Yang ⋅ Hector Yuen ⋅ Zhengyuan Zhang ⋅ Jing Zhang ⋅ Yi Zhen ⋅ Yanjun Zhou

Abstract
