

(4 events)
Poster
Tue May 14 01:30 PM -- 01:50 PM (PDT) @ Poster Position Number 26
Q-Hitter: A Better Token Oracle for Efficient LLM Inference via Sparse-Quantized KV Cache
Zhenyu Zhang · Shiwei Liu · Runjin Chen · Bhavya Kailkhura · Beidi Chen · Atlas Wang
Poster
Tue May 14 01:50 PM -- 02:10 PM (PDT) @ Poster Position Number 11
Fine-Tuning Language Models Using Formal Methods Feedback: A Use Case in Autonomous Systems
Yunhao Yang · Neel P. Bhatt · Tyler Ingebrand · William Ward · Steven Carr · Atlas Wang · Ufuk Topcu
Poster
Tue May 14 02:20 PM -- 02:40 PM (PDT) @ Poster Position Number 34
Punica: Multi-Tenant LoRA Serving
Lequn Chen · Zihao Ye · Yongji Wu · Danyang Zhuo · Luis Ceze · Arvind Krishnamurthy
Poster
Tue May 14 02:40 PM -- 03:00 PM (PDT) @ Poster Position Number 9
SLoRA: Scalable Serving of Thousands of LoRA Adapters
Ying Sheng · Shiyi Cao · Dacheng Li · Coleman Hooper · Nicholas Lee · Shuo Yang · Christopher Chou · Banghua Zhu · Lianmin Zheng · Kurt Keutzer · Joseph Gonzalez · Ion Stoica