Poster

SHIP: SRAM-Based Huge Inference Pipelines for Fast LLM Serving

Andrew Bitar ⋅ Baorui Zhou ⋅ Matthew Boyd ⋅ Charlie Wang ⋅ Eugene Sha ⋅ Alex Bowe ⋅ Santosh Raghavan ⋅ Kris Kang ⋅ Mohamed Eldafrawy ⋅ Andrew Paprotskyi ⋅ Arash Taheri-Dezfouli ⋅ Andrew Ling

Abstract
