(3 events)
Poster
Tue May 14 09:00 AM -- 09:20 AM (PDT) @ Poster Position Number 15
AWQ: Activation-aware Weight Quantization for On-Device LLM Compression and Acceleration
Ji Lin · Jiaming Tang · Haotian Tang · Shang Yang · Wei-Ming Chen · Wei-Chen Wang · Guangxuan Xiao · Xingyu Dang · Chuang Gan · Song Han
Poster
Tue May 14 09:20 AM -- 09:40 AM (PDT) @ Poster Position Number 31
QMoE: Sub-1-Bit Compression of Trillion Parameter Models
Elias Frantar · Dan Alistarh
Poster
Tue May 14 09:40 AM -- 10:00 AM (PDT) @ Poster Position Number 13
Atom: Low-Bit Quantization for Efficient and Accurate LLM Serving
Yilong Zhao · Chien-Yu Lin · Kan Zhu · Zihao Ye · Lequn Chen · Size Zheng · Luis Ceze · Arvind Krishnamurthy · Tianqi Chen · Baris Kasikci