

Poster

Reducing Activation Recomputation in Large Transformer Models

Vijay Anand Korthikanti ⋅ Jared Casper ⋅ Sangkug Lym ⋅ Lawrence McAfee ⋅ Michael Andersch ⋅ Mohammad Shoeybi ⋅ Bryan Catanzaro


