The memory capacity of embedding tables in deep learning recommendation models (DLRMs) is growing dramatically across the industry, from tens of GBs to TBs. Given this rapid growth, novel solutions are urgently needed to enable continued DLRM innovation without an exponential increase in infrastructure capacity demands. In this paper, we demonstrate the promising potential of Tensor Train decomposition for DLRMs (TT-Rec), an important yet under-investigated context. We design and implement optimized kernels (TT-EmbeddingBag) to evaluate the proposed TT-Rec design; TT-EmbeddingBag is 3x faster than the state-of-the-art TT implementation. We further optimize TT-Rec's performance with batched matrix multiplication and caching strategies for embedding-vector lookup operations. In addition, we analyze, both mathematically and empirically, the effect of the weight-initialization distribution on DLRM accuracy, and propose initializing the tensor cores of TT-Rec from a sampled Gaussian distribution. We evaluate TT-Rec across three important design-space dimensions---memory capacity, accuracy, and timing performance---by training MLPerf-DLRM on Criteo's Kaggle and Terabyte data sets. TT-Rec compresses the model size by 4x to 221x on Kaggle, with a corresponding 0.03% to 0.3% loss of accuracy. On Terabyte, our approach achieves a 112x model-size reduction with no accuracy loss and no training-time overhead compared to the uncompressed baseline.
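The core idea of the abstract, replacing a huge embedding table with a chain of small Tensor Train (TT) cores and reconstructing rows on the fly, can be sketched as follows. The shapes, ranks, and function name here are illustrative assumptions for a minimal sketch, not the paper's actual TT-EmbeddingBag configuration:

```python
import numpy as np

# Hypothetical factorization of an embedding table:
# N = 200 * 220 * 250 = 11M rows, D = 4 * 4 * 8 = 128 embedding dims.
row_shapes = (200, 220, 250)   # n1, n2, n3: factorize the number of rows
col_shapes = (4, 4, 8)         # d1, d2, d3: factorize the embedding dim
ranks = (1, 16, 16, 1)         # TT ranks r0..r3 (boundary ranks are 1)

rng = np.random.default_rng(0)
# Each TT core G_k has shape (r_{k-1}, n_k, d_k, r_k); the Gaussian init
# loosely mirrors the sampled-Gaussian initialization the paper proposes.
cores = [
    rng.normal(0.0, 0.1, (ranks[k], row_shapes[k], col_shapes[k], ranks[k + 1]))
    for k in range(3)
]

# Parameter counts: the TT cores are orders of magnitude smaller than
# the dense N x D table they replace.
full_params = int(np.prod(row_shapes)) * int(np.prod(col_shapes))
tt_params = sum(c.size for c in cores)

def tt_lookup(i):
    """Reconstruct embedding row i on the fly from the TT cores."""
    # Mixed-radix decomposition of the row index: i -> (i1, i2, i3).
    idx = []
    for n in reversed(row_shapes):
        idx.append(i % n)
        i //= n
    idx.reverse()
    # A chain of small matrix products instead of one big table fetch.
    out = cores[0][:, idx[0]].reshape(col_shapes[0], ranks[1])  # (d1, r1)
    for k in (1, 2):
        sl = cores[k][:, idx[k]]                  # (r_{k-1}, d_k, r_k)
        out = np.tensordot(out, sl, axes=(1, 0))  # contract over the rank
        out = out.reshape(-1, ranks[k + 1])
    return out.ravel()                            # a D-dimensional row

vec = tt_lookup(12345)  # materializes one 128-dim embedding row
```

Each lookup thus costs a few small matrix products rather than a fetch from an N x D table; TT-Rec's kernels batch these products across the pooled indices of an EmbeddingBag, which is where the batched-matmul and caching optimizations mentioned above apply.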
Author Information
Chunxing Yin (Georgia Institute of Technology)
Bilge Acun (Facebook AI Research)
Carole-Jean Wu (Facebook AI)
Xing Liu (Facebook)
Related Events (a corresponding poster, oral, or spotlight)
- 2021 Oral: TT-Rec: Tensor Train Compression for Deep Learning Recommendation Models »
  Tue. Apr 6 through Wed. Apr 7
More from the Same Authors
- 2022 Poster: Sustainable AI: Environmental Implications, Challenges and Opportunities »
  Carole-Jean Wu · Ramya Raghavendra · Udit Gupta · Bilge Acun · Newsha Ardalani · Kiwan Maeng · Gloria Chang · Fiona Aga · Jinshi Huang · Charles Bai · Michael Gschwind · Anurag Gupta · Myle Ott · Anastasia Melnikov · Salvatore Candido · David Brooks · Geeta Chauhan · Benjamin Lee · Hsien-Hsin Lee · Bugra Akyildiz · Maximilian Balandat · Joe Spisak · Ravi Jain · Mike Rabbat · Kim Hazelwood
- 2022 Break: Closing Remarks »
  Diana Marculescu · Yuejie Chi · Carole-Jean Wu
- 2022 Oral: Sustainable AI: Environmental Implications, Challenges and Opportunities »
  Carole-Jean Wu · Ramya Raghavendra · Udit Gupta · Bilge Acun · Newsha Ardalani · Kiwan Maeng · Gloria Chang · Fiona Aga · Jinshi Huang · Charles Bai · Michael Gschwind · Anurag Gupta · Myle Ott · Anastasia Melnikov · Salvatore Candido · David Brooks · Geeta Chauhan · Benjamin Lee · Hsien-Hsin Lee · Bugra Akyildiz · Maximilian Balandat · Joe Spisak · Ravi Jain · Mike Rabbat · Kim Hazelwood
- 2022 Break: Opening Remarks »
  Diana Marculescu · Yuejie Chi · Carole-Jean Wu
- 2021: Panel Session - Lizy John (UT Austin), David Kaeli (Northeastern University), Tushar Krishna (Georgia Tech), Peter Mattson (Google), Brian Van Essen (LLNL), Venkatram Vishwanath (ANL), Carole-Jean Wu (Facebook) »
  Tom St John · Lizy John · Tushar Krishna · Peter Mattson · Venkatram Vishwanath · Carole-Jean Wu · David Kaeli · Brian Van Essen
- 2021: Closing session »
  Udit Gupta · Carole-Jean Wu
- 2021: "Designing and Optimizing AI Systems for Deep Learning Recommendation and Beyond" - Carole-Jean Wu (Facebook) »
  Carole-Jean Wu
- 2021 Workshop: Personalized Recommendation Systems and Algorithms »
  Udit Gupta · Carole-Jean Wu · Gu-Yeon Wei · David Brooks
- 2021: Welcome to the 3rd PeRSonAl workshop »
  Udit Gupta · Carole-Jean Wu
- 2021 Poster: Understanding and Improving Failure Tolerant Training for Deep Learning Recommendation with Partial Recovery »
  Kiwan Maeng · Shivam Bharuka · Isabel Gao · Mark Jeffrey · Vikram Saraph · Bor-Yiing Su · Caroline Trippel · Jiyan Yang · Mike Rabbat · Brandon Lucia · Carole-Jean Wu
- 2021 Oral: Understanding and Improving Failure Tolerant Training for Deep Learning Recommendation with Partial Recovery »
  Kiwan Maeng · Shivam Bharuka · Isabel Gao · Mark Jeffrey · Vikram Saraph · Bor-Yiing Su · Caroline Trippel · Jiyan Yang · Mike Rabbat · Brandon Lucia · Carole-Jean Wu
- 2020 Oral: MLPerf Training Benchmark »
  Peter Mattson · Christine Cheng · Gregory Diamos · Cody Coleman · Paulius Micikevicius · David Patterson · Hanlin Tang · Gu-Yeon Wei · Peter Bailis · Victor Bittorf · David Brooks · Dehao Chen · Debo Dutta · Udit Gupta · Kim Hazelwood · Andy Hock · Xinyuan Huang · Daniel Kang · David Kanter · Naveen Kumar · Jeffery Liao · Deepak Narayanan · Tayo Oguntebi · Gennady Pekhimenko · Lillian Pentecost · Vijay Janapa Reddi · Taylor Robie · Tom St John · Carole-Jean Wu · Lingjie Xu · Cliff Young · Matei Zaharia
- 2020 Poster: MLPerf Training Benchmark »
  Peter Mattson · Christine Cheng · Gregory Diamos · Cody Coleman · Paulius Micikevicius · David Patterson · Hanlin Tang · Gu-Yeon Wei · Peter Bailis · Victor Bittorf · David Brooks · Dehao Chen · Debo Dutta · Udit Gupta · Kim Hazelwood · Andy Hock · Xinyuan Huang · Daniel Kang · David Kanter · Naveen Kumar · Jeffery Liao · Deepak Narayanan · Tayo Oguntebi · Gennady Pekhimenko · Lillian Pentecost · Vijay Janapa Reddi · Taylor Robie · Tom St John · Carole-Jean Wu · Lingjie Xu · Cliff Young · Matei Zaharia