Convolutional Neural Networks (ConvNets) enable computers to excel at vision learning tasks such as image classification and object detection. Real-time inference on live data is becoming increasingly important. From a systems perspective, it requires fast inference on each single incoming data item (e.g., one image). The two mainstream distributed model serving paradigms, data parallelism and model parallelism, are not necessarily desirable here: data parallelism cannot further split a single input data item, and model parallelism introduces huge communication overhead. To achieve low-latency inference on live data, we propose sensAI, a novel and generic approach that decouples a CNN model into disconnected subnets, each responsible for predicting certain class(es). We call this new model distribution paradigm class parallelism. Experimental results show that sensAI achieves up to 18x faster inference on a single input data item with no or negligible accuracy loss on the CIFAR-10, CIFAR-100, and ImageNet-1K datasets.
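The idea of class parallelism can be illustrated with a minimal Python sketch. It is not the authors' implementation: the toy linear "subnets", the helper names `make_toy_subnet` and `class_parallel_predict`, and the merge rule (take the highest-scoring class across all subnets) are assumptions made for illustration, since the abstract does not say how subnet outputs are combined. Real subnets would be CNNs derived from the original model.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List, Sequence

# A "subnet" here is any callable that maps one input to scores for the
# classes it owns. These are hypothetical stand-ins, not real ConvNets.
Subnet = Callable[[Sequence[float]], Dict[str, float]]


def make_toy_subnet(class_weights: Dict[str, List[float]]) -> Subnet:
    """Hypothetical stand-in: one linear scorer per owned class."""
    def subnet(x: Sequence[float]) -> Dict[str, float]:
        return {
            label: sum(w * v for w, v in zip(weights, x))
            for label, weights in class_weights.items()
        }
    return subnet


def class_parallel_predict(subnets: List[Subnet], x: Sequence[float]) -> str:
    """Run every subnet on the same input concurrently, then merge scores.

    Assumption: the prediction is the class with the highest score across
    all subnets. The subnets share no state, so they never communicate
    during the forward pass; only the small score dicts are gathered.
    """
    with ThreadPoolExecutor(max_workers=len(subnets)) as pool:
        partial_scores = list(pool.map(lambda net: net(x), subnets))
    merged: Dict[str, float] = {}
    for scores in partial_scores:
        merged.update(scores)  # class groups are disjoint, so keys never collide
    return max(merged, key=merged.get)


# Two subnets, each responsible for a disjoint group of classes.
subnets = [
    make_toy_subnet({"cat": [1.0, 0.0], "dog": [0.0, 1.0]}),
    make_toy_subnet({"car": [-1.0, 0.0], "ship": [0.0, -1.0]}),
]
prediction = class_parallel_predict(subnets, [0.2, 0.9])
```

Unlike data parallelism, every worker sees the same single input; unlike model parallelism, no intermediate activations cross between workers. In a real deployment each subnet would presumably run on its own device rather than in a thread.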
Author Information
Guanhua Wang (UC Berkeley)
I am a Ph.D. student in the AMPLab / RISELab, at UC Berkeley, advised by Prof. Ion Stoica.
Zhuang Liu (UC Berkeley)
Brandon Hsieh (University of California, Berkeley)
Siyuan Zhuang (UC Berkeley)
Joseph Gonzalez (UC Berkeley)
Trevor Darrell (UC Berkeley)
Ion Stoica (UC Berkeley)
Related Events (a corresponding poster, oral, or spotlight)
- 2021 Poster: sensAI: ConvNets Decomposition via Class Parallelism for Fast Inference on Live Data
08 Apr 12:00 AM Room
More from the Same Authors
- 2023 Poster: On Optimizing the Communication of Model Parallelism
  Yonghao Zhuang · · Lianmin Zheng · Zhuohan Li · Eric Xing · Qirong Ho · Joseph Gonzalez · Ion Stoica · Hao Zhang
- 2021 Poster: Wavelet: Efficient DNN Training with Tick-Tock Scheduling
  Guanhua Wang · Kehan Wang · Kenan Jiang · XIANGJUN LI · Ion Stoica
- 2021 Oral: Wavelet: Efficient DNN Training with Tick-Tock Scheduling
  Guanhua Wang · Kehan Wang · Kenan Jiang · XIANGJUN LI · Ion Stoica
- 2020 Oral: Blink: Fast and Generic Collectives for Distributed ML
  Guanhua Wang · Shivaram Venkataraman · Amar Phanishayee · Nikhil Devanur · Jorgen Thelin · Ion Stoica
- 2020 Poster: Blink: Fast and Generic Collectives for Distributed ML
  Guanhua Wang · Shivaram Venkataraman · Amar Phanishayee · Nikhil Devanur · Jorgen Thelin · Ion Stoica
- 2020 Poster: Breaking the Memory Wall with Optimal Tensor Rematerialization
  Paras Jain · Ajay Jain · Aniruddha Nrusimha · Amir Gholami · Pieter Abbeel · Joseph Gonzalez · Kurt Keutzer · Ion Stoica
- 2020 Oral: Breaking the Memory Wall with Optimal Tensor Rematerialization
  Paras Jain · Ajay Jain · Aniruddha Nrusimha · Amir Gholami · Pieter Abbeel · Joseph Gonzalez · Kurt Keutzer · Ion Stoica