Poster

Reducing Activation Recomputation in Large Transformer Models

Vijay Anand Korthikanti · Jared Casper · Sangkug Lym · Lawrence McAfee · Michael Andersch · Mohammad Shoeybi · Bryan Catanzaro
