

Contributed Talk 12 in Workshop: Personalized Recommendation Systems and Algorithms

Training with Multi-Layer Embeddings for Model Reduction

Benjamin Ghaemmaghami · Zihao Deng


Abstract:

Modern recommendation systems rely on real-valued embeddings of categorical features. Increasing the dimension of the embedding vectors improves model accuracy but comes at a high cost in model size. We introduce a multi-layer embedding training (MLET) scheme that trains embeddings through a sequence of linear layers to achieve a superior trade-off between model accuracy and size. Our approach rests on the ability of factorized linear layers to produce embeddings superior to those of a single linear layer. Drawing on recent results on the dynamics of backpropagation in linear neural networks, we attribute the superior performance of multi-layer embeddings to their tendency toward lower effective rank. We show that substantial advantages arise in the regime where the hidden-layer width is much larger than the final embedding dimension. Crucially, at the conclusion of training we convert the two-layer solution into a single-layer one, so the inference-time model size is unaffected by MLET. We prototype MLET across seven open-source recommendation models and show that it allows a reduction in embedding dimension of up to 16x (5.8x on average) across the models, correspondingly shrinking the inference-time memory footprint while preserving model accuracy.
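
To make the scheme concrete, the following is a minimal sketch of the two-layer factorization and the post-training collapse, written in PyTorch as an assumed setting; the class and method names (MLETEmbedding, inner, proj, collapse) are hypothetical illustrations, not code from the paper.

```python
import torch
import torch.nn as nn

class MLETEmbedding(nn.Module):
    """Two-layer (factorized) embedding for MLET-style training.

    Lookups go through a wide inner table of width k, followed by a linear
    projection down to the final embedding dimension d, with k >> d -- the
    regime where the abstract reports the largest gains.
    """

    def __init__(self, num_categories: int, d: int, k: int):
        super().__init__()
        assert k >= d, "MLET targets a hidden width k larger than d"
        self.inner = nn.Embedding(num_categories, k)  # V x k table (trained)
        self.proj = nn.Linear(k, d, bias=False)       # k -> d projection (trained)

    def forward(self, idx: torch.Tensor) -> torch.Tensor:
        # The effective embedding is the product of the two linear maps.
        return self.proj(self.inner(idx))

    def collapse(self) -> nn.Embedding:
        # After training, fold the two layers into a single V x d table so
        # the inference-time model size matches a single-layer embedding.
        with torch.no_grad():
            table = self.proj(self.inner.weight)      # shape (V, d)
        return nn.Embedding.from_pretrained(table, freeze=False)
```

Under these assumptions, the table would be trained in factorized form and deployed collapsed:

```python
emb = MLETEmbedding(num_categories=100_000, d=16, k=256)
# ... train emb inside a recommendation model as usual ...
deployed = emb.collapse()  # plain V x 16 embedding for inference
```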