Poster

FlexInfer: Flexible LLM Inference with CPU Computations

Seonjin Na · Geonhwa Jeong · Byung Hoon Ahn · Aaron Jezghani · Jeffrey Young · Christopher Hughes · Tushar Krishna · Hyesoon Kim

