Poster

FlexInfer: Flexible LLM Inference with CPU Computations

Seonjin Na ⋅ Geonhwa Jeong ⋅ Byung Hoon Ahn ⋅ Aaron Jezghani ⋅ Jeffrey Young ⋅ Christopher Hughes ⋅ Tushar Krishna ⋅ Hyesoon Kim
