Poster

TurboAttention: Efficient Attention Approximation for High Throughputs LLMs

Hao Kang ⋅ Srikant Bharadwaj ⋅ James Hensman ⋅ Tushar Krishna ⋅ Victor Ruehle ⋅ Saravan Rajmohan

Abstract
