Poster

HeteGen: Efficient Heterogeneous Parallel Inference for Large Language Models on Resource-Constrained Devices

ZHAO XUANLEI ⋅ Bin Jia ⋅ Haotian Zhou ⋅ Ziming Liu ⋅ Shenggan Cheng ⋅ Yang You
2024 Poster

Abstract

