Registration Desk
Mon May 13 07:00 AM -- 05:00 PM (PDT) @ Mission City Lobby
Registration Check-in Desk
Break
Mon May 13 08:30 AM -- 09:00 AM (PDT) @ Mission City B1-B3 & MR1-MR3
Breakfast (for Young Professionals Only)
Opening Remarks (Young Professional Symposium)
Mon May 13 09:00 AM -- 09:05 AM (PDT) @ Mission B4 & B5
Opening Remarks
Invited Talk
Mon May 13 09:05 AM -- 10:05 AM (PDT)
GenAI Efficiency is About More than Models
Kurt Keutzer
Talk
Mon May 13 10:10 AM -- 10:30 AM (PDT) @ Mission B4 & B5
Scaling Intelligence
Azalia Mirhoseini
Talk
Mon May 13 10:30 AM -- 10:50 AM (PDT) @ Mission B4 & B8
Towards Fast and Affordable Serving Systems for Large Language Models
Talk
Mon May 13 10:50 AM -- 11:10 AM (PDT) @ Mission B4 & B7
Memory-Efficient LLM Training
Jiawei Zhao
Break
Mon May 13 11:10 AM -- 11:30 AM (PDT) @ Mission City B1-B3 & MR1-MR3
Coffee Break
Talk
Mon May 13 11:30 AM -- 11:50 AM (PDT) @ Mission B4 & B9
Fairness in LLM Serving
Ying Sheng
Talk
Mon May 13 11:50 AM -- 12:10 PM (PDT) @ Mission B4 & B6
Hardware-aware algorithms for sequence modeling
Talk
Mon May 13 12:10 PM -- 12:30 PM (PDT) @ Mission B4 & B5
The Unreasonable Power of Synthetics for Efficient Machine Learning
Dan Fu
Break
Mon May 13 12:30 PM -- 02:00 PM (PDT)
Lunch Break
Lightning Talks
Mon May 13 02:00 PM -- 03:00 PM (PDT) @ Mission B4 & B5
Sponsor Lightning Talks
Break
Mon May 13 03:00 PM -- 03:30 PM (PDT) @ Mission City B1-B3 & MR1-MR3
Coffee Break
Round Table Discussion
Mon May 13 03:30 PM -- 04:15 PM (PDT) @ Mission City B1-B3 & MR1-MR3
Round Table
Panel
Mon May 13 04:15 PM -- 05:00 PM (PDT) @ Mission B4 & B5
Discussion Panel
Beidi Chen · Kurt Keutzer · Zhihao Jia · Haifeng Jin · Hui Guan · Tri Dao
Session
Mon May 13 05:00 PM -- 06:30 PM (PDT) @ Mission City B1-B3 & MR1-MR3
Student Poster Session
Registration Desk
Tue May 14 07:00 AM -- 05:00 PM (PDT) @ Mission City Lobby
Registration Check-in Desk
Break
Tue May 14 08:30 AM -- 09:00 AM (PDT) @ Mission City B1-B3 & MR1-MR11
Coffee Break
Opening Remarks (Main Conference)
Tue May 14 08:45 AM -- 09:00 AM (PDT) @ Mission B4 & B5
Opening Remarks
Poster
Tue May 14 09:00 AM -- 09:20 AM (PDT) @ Poster Position Number 15
AWQ: Activation-aware Weight Quantization for On-Device LLM Compression and Acceleration
Ji Lin · Jiaming Tang · Haotian Tang · Shang Yang · Wei-Ming Chen · Wei-Chen Wang · Guangxuan Xiao · Xingyu Dang · Chuang Gan · Song Han
Session
Tue May 14 09:00 AM -- 10:00 AM (PDT) @ Mission B4 & B15
Quantization and Compression 1
Poster
Tue May 14 09:20 AM -- 09:40 AM (PDT) @ Poster Position Number 31
QMoE: Sub-1-Bit Compression of Trillion Parameter Models
Elias Frantar · Dan Alistarh
Poster
Tue May 14 09:40 AM -- 10:00 AM (PDT) @ Poster Position Number 13
Atom: Low-Bit Quantization for Efficient and Accurate LLM Serving
Yilong Zhao · Chien-Yu Lin · Kan Zhu · Zihao Ye · Lequn Chen · Size Zheng · Luis Ceze · Arvind Krishnamurthy · Tianqi Chen · Baris Kasikci
Break
Tue May 14 10:00 AM -- 10:30 AM (PDT) @ Mission City B1-B3 & MR1-MR10
Coffee Break
Invited Talk
Tue May 14 10:30 AM -- 11:30 AM (PDT)
Possible Impossibilities and Impossible Possibilities
Yejin Choi
Break
Tue May 14 11:30 AM -- 01:00 PM (PDT)
Lunch Break
Poster
Tue May 14 01:30 PM -- 01:50 PM (PDT) @ Poster Position Number 26
Q-Hitter: A Better Token Oracle for Efficient LLM Inference via Sparse-Quantized KV Cache
Zhenyu Zhang · Shiwei Liu · Runjin Chen · Bhavya Kailkhura · Beidi Chen · Atlas Wang
Session
Tue May 14 01:30 PM -- 03:00 PM (PDT) @ Mission B4 & B14
Large Language Models 1
Poster
Tue May 14 01:50 PM -- 02:10 PM (PDT) @ Poster Position Number 11
Fine-Tuning Language Models Using Formal Methods Feedback: A Use Case in Autonomous Systems
Yunhao Yang · Neel P. Bhatt · Tyler Ingebrand · William Ward · Steven Carr · Atlas Wang · Ufuk Topcu
Poster
Tue May 14 02:20 PM -- 02:40 PM (PDT) @ Poster Position Number 34
Punica: Multi-Tenant LoRA Serving
Lequn Chen · Zihao Ye · Yongji Wu · Danyang Zhuo · Luis Ceze · Arvind Krishnamurthy
Poster
Tue May 14 02:40 PM -- 03:00 PM (PDT) @ Poster Position Number 9
SLoRA: Scalable Serving of Thousands of LoRA Adapters
Ying Sheng · Shiyi Cao · Dacheng Li · Coleman Hooper · Nicholas Lee · Shuo Yang · Christopher Chou · Banghua Zhu · Lianmin Zheng · Kurt Keutzer · Joseph Gonzalez · Ion Stoica
Break
Tue May 14 03:00 PM -- 03:30 PM (PDT) @ Mission City B1-B3 & MR1-MR9
Coffee Break
Poster
Tue May 14 03:30 PM -- 03:50 PM (PDT) @ Poster Position Number 32
DiffusionPipe: Training Large Diffusion Models with Efficient Pipelines
Ye Tian · Zhen Jia · Ziyue Luo · Yida Wang · Chuan Wu
Session
Tue May 14 03:30 PM -- 04:30 PM (PDT) @ Mission B4 & B13
Parallel and Distributed 1
Poster
Tue May 14 03:50 PM -- 04:10 PM (PDT) @ Poster Position Number 12
Distributed Matrix-Based Sampling for Graph Neural Network Training
Alok Tripathy · Katherine Yelick · Aydin Buluc
Poster
Tue May 14 04:10 PM -- 04:30 PM (PDT) @ Poster Position Number 21
L-GreCo: Layerwise-adaptive Gradient Compression For Efficient Data-parallel Deep Learning
Ilia Markov · Kaveh Alim · Elias Frantar · Dan Alistarh
Poster
Tue May 14 04:30 PM -- 04:50 PM (PDT) @ Poster Position Number 20
Accelerating ReLU for MPC-Based Private Inference with a Communication-Efficient Sign Estimation
Kiwan Maeng · G. Edward Suh
Session
Tue May 14 04:30 PM -- 05:30 PM (PDT) @ Mission B4 & B12
Privacy and Security
Poster
Tue May 14 04:50 PM -- 05:10 PM (PDT) @ Poster Position Number 24
Accurate Low-Degree Polynomial Approximation of Non-Polynomial Operators for Fast Private Inference in Homomorphic Encryption
Jingtian Dang · Jianming Tong · Anupam Golder · Cong "Callie" Hao · Arijit Raychowdhury · Tushar Krishna
Poster
Tue May 14 05:10 PM -- 05:30 PM (PDT) @ Poster Position Number 27
Proteus: Preserving Model Confidentiality during Graph Optimizations
Yubo Gao · Maryam Haghifam · Christina Giannoula · Renbo Tu · Gennady Pekhimenko · Nandita Vijaykumar
Reception
Tue May 14 05:30 PM -- 08:00 PM (PDT) @ Mission City Ballroom
Reception & Poster Session
Registration Desk
Wed May 15 07:00 AM -- 05:00 PM (PDT) @ Mission City Lobby
Registration Check-in Desk
Break
Wed May 15 08:30 AM -- 09:00 AM (PDT) @ Mission City B1-B3 & MR1-MR8
Coffee Break
Poster
Wed May 15 09:00 AM -- 09:20 AM (PDT) @ Poster Position Number 33
FlashDecoding++: Faster Large Language Model Inference with Asynchronization, Flat GEMM Optimization, and Heuristics
Ke Hong · Guohao Dai · Jiaming Xu · Qiuli Mao · Xiuhong Li · Jun Liu · Kangdi Chen · Yuhan Dong · Yu Wang
Session
Wed May 15 09:00 AM -- 10:00 AM (PDT) @ Mission B4 & B11
LLM 2
Poster
Wed May 15 09:20 AM -- 09:40 AM (PDT) @ Poster Position Number 25
Prompt Cache: Modular Attention Reuse for Low-Latency Inference
In Gim · Guojun Chen · Seung-seob Lee · Nikhil Sarda · Anurag Khandelwal · Lin Zhong
Poster
Wed May 15 09:40 AM -- 10:00 AM (PDT) @ Poster Position Number 22
Keyformer: KV Cache reduction through key tokens selection for Efficient Generative Inference
Muhammad Adnan · Akhil Arunkumar · Gaurav Jain · Prashant Nair · Ilya Soloveychik · Purushotham Kamath
Break
Wed May 15 10:00 AM -- 10:30 AM (PDT) @ Mission City B1-B3 & MR1-MR7
Coffee Break
Invited Talk
Wed May 15 10:30 AM -- 11:30 AM (PDT)
AI Robustness and Security in the Age of LLMs
J. Zico Kolter
Break
Wed May 15 11:30 AM -- 01:00 PM (PDT)
Lunch Break
Poster
Wed May 15 01:30 PM -- 01:50 PM (PDT) @ Poster Position Number 8
JIT-Q: Just-in-time Quantization with Processing-In-Memory for Efficient ML Training
Mohamed Ibrahim · Shaizeen Aga · Ada Li · Suchita Pati · Mahzabeen Islam
Session
Wed May 15 01:30 PM -- 03:00 PM (PDT) @ Mission B4 & B10
Quantization and Compression 2
Poster
Wed May 15 01:50 PM -- 02:10 PM (PDT) @ Poster Position Number 23
Torch2Chip: An End-to-end Customizable Deep Neural Network Compression and Deployment Toolkit for Prototype Hardware Accelerator Design
Jian Meng · Yuan Liao · Anupreetham Anupreetham · Ahmed Hasssan · Shixing Yu · Han-sok Suh · Xiaofeng Hu · Jae-sun Seo
Poster
Wed May 15 02:20 PM -- 02:40 PM (PDT) @ Poster Position Number 18
Schrodinger's FP: Training Neural Networks with Dynamic Floating-Point Containers
Milos Nikolic · Enrique Torres Sanchez · Jiahui Wang · Ali Hadi Zadeh · Mostafa Mahmoud · Ameer Abdelhadi · Kareem Ibrahim · Andreas Moshovos
Poster
Wed May 15 02:40 PM -- 03:00 PM (PDT) @ Poster Position Number 28
Efficient Post-training Quantization with FP8 Formats
Haihao Shen · Naveen Mellempudi · Xin He · Qun Gao · Chang Wang · Mengni Wang
Break
Wed May 15 03:00 PM -- 03:30 PM (PDT) @ Mission City B1-B3 & MR1-MR6
Coffee Break
Poster
Wed May 15 03:30 PM -- 03:50 PM (PDT) @ Poster Position Number 7
FedTrans: Efficient Federated Learning via Multi-Model Transformation
Yuxuan Zhu · Jiachen Liu · Mosharaf Chowdhury · Fan Lai
Session
Wed May 15 03:30 PM -- 04:30 PM (PDT) @ Mission B4 & B9
Federated Learning
Poster
Wed May 15 03:50 PM -- 04:10 PM (PDT) @ Poster Position Number 10
HeteroSwitch: Characterizing and Taming System-Induced Data Heterogeneity in Federated Learning
Gyudong Kim · Mehdi Ghasemi · Soroush Heidari · Seungryong Kim · Young Geun Kim · Sarma Vrudhula · Carole-Jean Wu
Poster
Wed May 15 04:10 PM -- 04:30 PM (PDT) @ Poster Position Number 3
LIFL: A Lightweight, Event-driven Serverless Platform for Federated Learning
Shixiong Qi · K. K. Ramakrishnan · Myungjin Lee
Poster
Wed May 15 04:30 PM -- 04:50 PM (PDT) @ Poster Position Number 19
Lancet: Accelerating Mixture-of-Experts Training by Overlapping Weight Gradient Computation and All-to-All Communication
Chenyu Jiang · Ye Tian · Zhen Jia · Chuan Wu · Yida Wang · Shuai Zheng
Session
Wed May 15 04:30 PM -- 05:30 PM (PDT) @ Mission B4 & B8
Parallel and Distributed 2
Poster
Wed May 15 04:50 PM -- 05:10 PM (PDT) @ Poster Position Number 36
Disaggregated Multi-Tower: Topology-aware Modeling Technique for Efficient Large Scale Recommendation
Liang Luo · Buyun Zhang · Michael Tsang · Yinbin Ma · Ching-Hsiang Chu · Yuxin Chen · Shen Li · Yuchen Hao · Yanli Zhao · Guna Lakshminarayanan · Ellie Wen · Jongsoo Park · Dheevatsa Mudigere · Maxim Naumov
Poster
Wed May 15 05:10 PM -- 05:30 PM (PDT) @ Poster Position Number 37
HeteGen: Efficient Heterogeneous Parallel Inference for Large Language Models on Resource-Constrained Devices
Xuanlei Zhao · Bin Jia · Haotian Zhou · Ziming Liu · Shenggan Cheng · Yang You
Registration Desk
Thu May 16 07:00 AM -- 05:00 PM (PDT) @ Mission City Lobby
Registration Check-in Desk
Break
Thu May 16 08:30 AM -- 09:00 AM (PDT) @ Mission City B1-B3 & MR1-MR5
Coffee Break
Poster
Thu May 16 09:00 AM -- 09:20 AM (PDT) @ Poster Position Number 14
vMCU: Coordinated Memory Management and Kernel Optimization for DNN Inference on MCUs
Size Zheng · Renze Chen · Meng Li · Zihao Ye · Luis Ceze · Yun Liang
Session
Thu May 16 09:00 AM -- 10:00 AM (PDT) @ Mission B4 & B7
Performance and Memory
Poster
Thu May 16 09:20 AM -- 09:40 AM (PDT) @ Poster Position Number 30
SiDA: Sparsity-Inspired Data-Aware Serving for Efficient and Scalable Large Mixture-of-Experts Models
Zhixu Du · Shiyu Li · Yuhao Wu · Xiangyu Jiang · Jingwei Sun · Qilin Zheng · Yongkai Wu · Ang Li · Hai Li · Yiran Chen
Poster
Thu May 16 09:40 AM -- 10:00 AM (PDT) @ Poster Position Number 6
ACROBAT: Optimizing Auto-batching of Dynamic Deep Learning at Compile Time
Pratik Fegade · Tianqi Chen · Phillip Gibbons · Todd Mowry
Break
Thu May 16 10:00 AM -- 10:30 AM (PDT) @ Mission City B1-B3 & MR1-MR4
Coffee Break
Invited Talk
Thu May 16 10:30 AM -- 11:30 AM (PDT)
Exciting Directions in Systems for Machine Learning
Jeff Dean
Break
Thu May 16 11:30 AM -- 01:00 PM (PDT)
Lunch Break
Poster
Thu May 16 01:30 PM -- 01:50 PM (PDT) @ Poster Position Number 2
CloudEval-YAML: A Practical Benchmark for Cloud Configuration Generation
Yifei Xu · Yuning Chen · Xumiao Zhang · Xianshang Lin · Pan Hu · Yunfei Ma · Songwu Lu · Wan Du · Zhuoqing Mao · Ennan Zhai · Dennis Cai
Session
Thu May 16 01:30 PM -- 03:00 PM (PDT) @ Mission B4 & B6
Measurement and Analysis
Poster
Thu May 16 01:50 PM -- 02:10 PM (PDT) @ Poster Position Number 35
Does Compressing Activations Help Model Parallel Training?
Song Bian · Dacheng Li · Hongyi Wang · Eric Xing · Shivaram Venkataraman
Poster
Thu May 16 02:20 PM -- 02:40 PM (PDT) @ Poster Position Number 17
COMET: Neural Cost Model Explanation Framework
Isha Chaudhary · Alex Renda · Charith Mendis · Gagandeep Singh
Poster
Thu May 16 02:40 PM -- 03:00 PM (PDT) @ Poster Position Number 1
Vidur: A Large-Scale Simulation Framework for LLM Inference
Amey Agrawal · Nitin Kedia · Jayashree Mohan · Ashish Panwar · Nipun Kwatra · Bhargav Gulavani · Ramachandran Ramjee · Alexey Tumanov
Break
Thu May 16 03:00 PM -- 03:30 PM (PDT) @ Mission City B1-B3 & MR1-MR3
Coffee Break
Poster
Thu May 16 03:30 PM -- 03:50 PM (PDT) @ Poster Position Number 29
On Latency Predictors for Neural Architecture Search
Yash Akhauri · Mohamed Abdelfattah
Session
Thu May 16 03:30 PM -- 04:50 PM (PDT) @ Mission B4 & B5
ML for Systems
Poster
Thu May 16 03:50 PM -- 04:10 PM (PDT) @ Poster Position Number 4
FLASH: Fast Model Adaptation in ML-Centric Cloud Platforms
Haoran Qiu · Weichao Mao · Archit Patke · Shengkun Cui · Chen Wang · Hubertus Franke · Zbigniew Kalbarczyk · Tamer Basar · Ravi Iyer
Poster
Thu May 16 04:10 PM -- 04:30 PM (PDT) @ Poster Position Number 16
VQPy: An Object-Oriented Approach to Modern Video Analytics
Shan Yu · Zhenting Zhu · Yu Chen · Hanchen Xu · Pengzhan Zhao · Yang Wang · Arthi Padmanabhan · Hugo Latapie · Harry Xu
Poster
Thu May 16 04:30 PM -- 04:50 PM (PDT) @ Poster Position Number 5
UniDM: A Unified Framework for Data Manipulation with Large Language Models
Yichen Qian · Yongyi He · Rong Zhu · Jintao Huang · Zhijian Ma · Haibin Wang · Yaohua Wang · Xiuyu Sun · Defu Lian · Bolin Ding · Jingren Zhou
Closing Remarks
Thu May 16 05:00 PM -- 05:15 PM (PDT) @ Mission B4 & B5
Closing Remarks