Registration Desk
Mon May 13 07:00 AM -- 05:00 PM (PDT) @ Mission City Lobby
Registration Check-in Desk
Break
Mon May 13 08:30 AM -- 09:00 AM (PDT) @ Mission City B1-B3 & MR1-MR3
Breakfast (for Young Professionals Only)
Opening Remarks (Young Professional Symposium)
Mon May 13 09:00 AM -- 09:05 AM (PDT) @ Mission B4 & B5
Opening Remarks
Invited Talk
Mon May 13 09:05 AM -- 10:05 AM (PDT)
GenAI Efficiency is About More than Models
[Slides]
Talk
Mon May 13 10:10 AM -- 10:30 AM (PDT) @ Mission B4 & B5
Scaling Intelligence
Talk
Mon May 13 10:30 AM -- 10:50 AM (PDT) @ Mission B4 & B8
Towards Fast and Affordable Serving Systems for Large Language Models
Talk
Mon May 13 10:50 AM -- 11:10 AM (PDT) @ Mission B4 & B7
Memory-Efficient LLM Training
[Slides]
Break
Mon May 13 11:10 AM -- 11:30 AM (PDT) @ Mission City B1-B3 & MR1-MR3
Coffee Break
Talk
Mon May 13 11:30 AM -- 11:50 AM (PDT) @ Mission B4 & B9
Fairness in LLM Serving
Talk
Mon May 13 11:50 AM -- 12:10 PM (PDT) @ Mission B4 & B6
Hardware-aware algorithms for sequence modeling
Talk
Mon May 13 12:10 PM -- 12:30 PM (PDT) @ Mission B4 & B5
The Unreasonable Power of Synthetics for Efficient Machine Learning
Break
Mon May 13 12:30 PM -- 02:00 PM (PDT)
Lunch Break
Lightning Talks
Mon May 13 02:00 PM -- 03:00 PM (PDT) @ Mission B4 & B5
Sponsor Lightning Talks
[Slides]
Break
Mon May 13 03:00 PM -- 03:30 PM (PDT) @ Mission City B1-B3 & MR1-MR3
Coffee Break
Round Table Discussion
Mon May 13 03:30 PM -- 04:15 PM (PDT) @ Mission City B1-B3 & MR1-MR3
Round Table
Panel
Mon May 13 04:15 PM -- 05:00 PM (PDT) @ Mission B4 & B5
Discussion Panel
Session
Mon May 13 05:00 PM -- 06:30 PM (PDT) @ Mission City B1-B3 & MR1-MR3
Student Poster Session
Registration Desk
Tue May 14 07:00 AM -- 05:00 PM (PDT) @ Mission City Lobby
Registration Check-in Desk
Break
Tue May 14 08:30 AM -- 09:00 AM (PDT) @ Mission City B1-B3 & MR1-MR11
Coffee Break
Opening Remarks (Main Conference)
Tue May 14 08:45 AM -- 09:00 AM (PDT) @ Mission B4 & B5
Opening Remarks
Poster
Tue May 14 09:00 AM -- 09:20 AM (PDT) @ Poster Position Number 15
AWQ: Activation-aware Weight Quantization for On-Device LLM Compression and Acceleration
Session
Tue May 14 09:00 AM -- 10:00 AM (PDT) @ Mission B4 & B15
Quantization and Compression 1
Poster
Tue May 14 09:20 AM -- 09:40 AM (PDT) @ Poster Position Number 31
QMoE: Sub-1-Bit Compression of Trillion Parameter Models
Poster
Tue May 14 09:40 AM -- 10:00 AM (PDT) @ Poster Position Number 13
Atom: Low-Bit Quantization for Efficient and Accurate LLM Serving
[Slides]
Break
Tue May 14 10:00 AM -- 10:30 AM (PDT) @ Mission City B1-B3 & MR1-MR10
Coffee Break
Invited Talk
Tue May 14 10:30 AM -- 11:30 AM (PDT)
Possible Impossibilities and Impossible Possibilities
Break
Tue May 14 11:30 AM -- 01:00 PM (PDT)
Lunch Break
Poster
Tue May 14 01:30 PM -- 01:50 PM (PDT) @ Poster Position Number 26
Q-Hitter: A Better Token Oracle for Efficient LLM Inference via Sparse-Quantized KV Cache
Session
Tue May 14 01:30 PM -- 03:00 PM (PDT) @ Mission B4 & B14
Large Language Models 1
Poster
Tue May 14 01:50 PM -- 02:10 PM (PDT) @ Poster Position Number 11
Fine-Tuning Language Models Using Formal Methods Feedback: A Use Case in Autonomous Systems
[Slides]
Poster
Tue May 14 02:20 PM -- 02:40 PM (PDT) @ Poster Position Number 34
Punica: Multi-Tenant LoRA Serving
[Slides]
Poster
Tue May 14 02:40 PM -- 03:00 PM (PDT) @ Poster Position Number 9
SLoRA: Scalable Serving of Thousands of LoRA Adapters
Break
Tue May 14 03:00 PM -- 03:30 PM (PDT) @ Mission City B1-B3 & MR1-MR9
Coffee Break
Poster
Tue May 14 03:30 PM -- 03:50 PM (PDT) @ Poster Position Number 32
DiffusionPipe: Training Large Diffusion Models with Efficient Pipelines
[Slides]
Session
Tue May 14 03:30 PM -- 04:30 PM (PDT) @ Mission B4 & B13
Parallel and Distributed 1
Poster
Tue May 14 03:50 PM -- 04:10 PM (PDT) @ Poster Position Number 12
Distributed Matrix-Based Sampling for Graph Neural Network Training
Poster
Tue May 14 04:10 PM -- 04:30 PM (PDT) @ Poster Position Number 21
L-GreCo: Layerwise-adaptive Gradient Compression For Efficient Data-parallel Deep Learning
Poster
Tue May 14 04:30 PM -- 04:50 PM (PDT) @ Poster Position Number 20
Accelerating ReLU for MPC-Based Private Inference with a Communication-Efficient Sign Estimation
Session
Tue May 14 04:30 PM -- 05:30 PM (PDT) @ Mission B4 & B12
Privacy and Security
Poster
Tue May 14 04:50 PM -- 05:10 PM (PDT) @ Poster Position Number 24
Accurate Low-Degree Polynomial Approximation of Non-Polynomial Operators for Fast Private Inference in Homomorphic Encryption
Poster
Tue May 14 05:10 PM -- 05:30 PM (PDT) @ Poster Position Number 27
Proteus: Preserving Model Confidentiality during Graph Optimizations
Reception
Tue May 14 05:30 PM -- 08:00 PM (PDT) @ Mission City Ballroom
Reception & Poster Session
Registration Desk
Wed May 15 07:00 AM -- 05:00 PM (PDT) @ Mission City Lobby
Registration Check-in Desk
Break
Wed May 15 08:30 AM -- 09:00 AM (PDT) @ Mission City B1-B3 & MR1-MR8
Coffee Break
Poster
Wed May 15 09:00 AM -- 09:20 AM (PDT) @ Poster Position Number 33
FlashDecoding++: Faster Large Language Model Inference with Asynchronization, Flat GEMM Optimization, and Heuristics
In LLM 2
Session
Wed May 15 09:00 AM -- 10:00 AM (PDT) @ Mission B4 & B11
LLM 2
Poster
Wed May 15 09:20 AM -- 09:40 AM (PDT) @ Poster Position Number 25
Prompt Cache: Modular Attention Reuse for Low-Latency Inference
In LLM 2
Break
Wed May 15 10:00 AM -- 10:30 AM (PDT) @ Mission City B1-B3 & MR1-MR7
Coffee Break
Invited Talk
Wed May 15 10:30 AM -- 11:30 AM (PDT)
AI Robustness and Security in the Age of LLMs
Break
Wed May 15 11:30 AM -- 01:00 PM (PDT)
Lunch Break
Poster
Wed May 15 01:30 PM -- 01:50 PM (PDT) @ Poster Position Number 8
JIT-Q: Just-in-time Quantization with Processing-In-Memory for Efficient ML Training
[Slides]
Session
Wed May 15 01:30 PM -- 03:00 PM (PDT) @ Mission B4 & B10
Quantization and Compression 2
Poster
Wed May 15 01:50 PM -- 02:10 PM (PDT) @ Poster Position Number 23
Torch2Chip: An End-to-end Customizable Deep Neural Network Compression and Deployment Toolkit for Prototype Hardware Accelerator Design
Poster
Wed May 15 02:20 PM -- 02:40 PM (PDT) @ Poster Position Number 18
Schrodinger's FP: Training Neural Networks with Dynamic Floating-Point Containers
[Slides]
Poster
Wed May 15 02:40 PM -- 03:00 PM (PDT) @ Poster Position Number 28
Efficient Post-training Quantization with FP8 Formats
Break
Wed May 15 03:00 PM -- 03:30 PM (PDT) @ Mission City B1-B3 & MR1-MR6
Coffee Break
Poster
Wed May 15 03:30 PM -- 03:50 PM (PDT) @ Poster Position Number 7
FedTrans: Efficient Federated Learning via Multi-Model Transformation
Session
Wed May 15 03:30 PM -- 04:30 PM (PDT) @ Mission B4 & B9
Federated Learning
Poster
Wed May 15 03:50 PM -- 04:10 PM (PDT) @ Poster Position Number 10
HeteroSwitch: Characterizing and Taming System-Induced Data Heterogeneity in Federated Learning
Poster
Wed May 15 04:10 PM -- 04:30 PM (PDT) @ Poster Position Number 3
LIFL: A Lightweight, Event-driven Serverless Platform for Federated Learning
[Slides]
Poster
Wed May 15 04:30 PM -- 04:50 PM (PDT) @ Poster Position Number 19
Lancet: Accelerating Mixture-of-Experts Training by Overlapping Weight Gradient Computation and All-to-All Communication
[Slides]
Session
Wed May 15 04:30 PM -- 05:30 PM (PDT) @ Mission B4 & B8
Parallel and Distributed 2
Poster
Wed May 15 04:50 PM -- 05:10 PM (PDT) @ Poster Position Number 36
Disaggregated Multi-Tower: Topology-aware Modeling Technique for Efficient Large Scale Recommendation
Poster
Wed May 15 05:10 PM -- 05:30 PM (PDT) @ Poster Position Number 37
HeteGen: Efficient Heterogeneous Parallel Inference for Large Language Models on Resource-Constrained Devices
Registration Desk
Thu May 16 07:00 AM -- 05:00 PM (PDT) @ Mission City Lobby
Registration Check-in Desk
Break
Thu May 16 08:30 AM -- 09:00 AM (PDT) @ Mission City B1-B3 & MR1-MR5
Coffee Break
Poster
Thu May 16 09:00 AM -- 09:20 AM (PDT) @ Poster Position Number 14
vMCU: Coordinated Memory Management and Kernel Optimization for DNN Inference on MCUs
[Slides]
Session
Thu May 16 09:00 AM -- 10:00 AM (PDT) @ Mission B4 & B7
Performance and Memory
Poster
Thu May 16 09:20 AM -- 09:40 AM (PDT) @ Poster Position Number 30
SiDA: Sparsity-Inspired Data-Aware Serving for Efficient and Scalable Large Mixture-of-Experts Models
[Slides]
Poster
Thu May 16 09:40 AM -- 10:00 AM (PDT) @ Poster Position Number 6
ACROBAT: Optimizing Auto-batching of Dynamic Deep Learning at Compile Time
[Slides]
Break
Thu May 16 10:00 AM -- 10:30 AM (PDT) @ Mission City B1-B3 & MR1-MR4
Coffee Break
Invited Talk
Thu May 16 10:30 AM -- 11:30 AM (PDT)
Exciting Directions in Systems for Machine Learning
Break
Thu May 16 11:30 AM -- 01:00 PM (PDT)
Lunch Break
Poster
Thu May 16 01:30 PM -- 01:50 PM (PDT) @ Poster Position Number 2
CloudEval-YAML: A Practical Benchmark for Cloud Configuration Generation
Session
Thu May 16 01:30 PM -- 03:00 PM (PDT) @ Mission B4 & B6
Measurement and Analysis
Poster
Thu May 16 01:50 PM -- 02:10 PM (PDT) @ Poster Position Number 35
Does Compressing Activations Help Model Parallel Training?
Poster
Thu May 16 02:20 PM -- 02:40 PM (PDT) @ Poster Position Number 17
COMET: Neural Cost Model Explanation Framework
Poster
Thu May 16 02:40 PM -- 03:00 PM (PDT) @ Poster Position Number 1
Vidur: A Large-Scale Simulation Framework for LLM Inference
Break
Thu May 16 03:00 PM -- 03:30 PM (PDT) @ Mission City B1-B3 & MR1-MR3
Coffee Break
Poster
Thu May 16 03:30 PM -- 03:50 PM (PDT) @ Poster Position Number 29
On Latency Predictors for Neural Architecture Search
Session
Thu May 16 03:30 PM -- 04:50 PM (PDT) @ Mission B4 & B5
ML for Systems
Poster
Thu May 16 03:50 PM -- 04:10 PM (PDT) @ Poster Position Number 4
FLASH: Fast Model Adaptation in ML-Centric Cloud Platforms
[Slides]
Poster
Thu May 16 04:10 PM -- 04:30 PM (PDT) @ Poster Position Number 16
VQPy: An Object-Oriented Approach to Modern Video Analytics
[Slides]
Poster
Thu May 16 04:30 PM -- 04:50 PM (PDT) @ Poster Position Number 5
UniDM: A Unified Framework for Data Manipulation with Large Language Models
Closing Remarks
Thu May 16 05:00 PM -- 05:15 PM (PDT) @ Mission B4 & B5
Closing Remarks