Communication-Efficient Distributed Inference for Transformer Models via Vector Quantized Context
Xiao Liu ⋅ Lijun Zhang ⋅ Deepak Ganesan ⋅ Hui Guan