Wavelet: Efficient DNN Training with Tick-Tock Scheduling
Guanhua Wang · Kehan Wang · Kenan Jiang · Xiangjun Li · Ion Stoica

Thu Apr 08 04:00 PM -- 04:20 PM (PDT)

DNNs have revolutionized a wide range of applications, such as image classification, speech recognition, and robotics control. As DNN models become more computationally expensive to train, parallel execution across multiple accelerators (e.g., GPUs) is widely adopted. However, system efficiency becomes a major issue when scaling out: even as compute power grows, GPUs remain under-utilized, mainly due to limited local memory size. To address this memory bound, we present Wavelet, an efficient and generic approach that fully utilizes the available on-device memory across all GPUs involved in a distributed training job. Wavelet achieves near-optimal on-device memory usage by adopting a simple scheduling scheme called Tick-Tock, which interleaves waves of peak memory usage among the accelerators. Evaluations on a variety of DNN models and tasks show that Wavelet trains models up to 6.7x faster than commonly used parallelism techniques.
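The core intuition behind Tick-Tock interleaving can be illustrated with a minimal simulation (a hypothetical sketch, not the authors' implementation): a training wave's memory footprint ramps up during the forward pass as activations accumulate and ramps down during the backward pass as they are freed, so shifting a second wave by half a cycle lets two waves share one device without doubling peak memory. The `wave_memory` profile and `period`/`peak` parameters below are illustrative assumptions.

```python
def wave_memory(step, period=8, peak=10):
    """Triangular memory profile for one training wave: memory rises for
    half a period (forward pass accumulates activations) and falls for the
    other half (backward pass frees them)."""
    phase = step % period
    half = period // 2
    if phase < half:
        return peak * phase / half         # forward: activations accumulate
    return peak * (period - phase) / half  # backward: activations are freed


def combined_peak(offset, steps=32):
    """Peak memory on one device when a second wave runs shifted by
    `offset` steps relative to the first."""
    return max(wave_memory(t) + wave_memory(t + offset) for t in range(steps))


naive_peak = combined_peak(offset=0)     # both waves peak simultaneously: 2x a single wave
ticktock_peak = combined_peak(offset=4)  # half-period shift: peaks interleave, staying at 1x
```

In this toy model the in-phase schedule peaks at twice a single wave's footprint, while the half-period shift keeps the combined footprint flat at one wave's peak, which is the effect Tick-Tock exploits to fill otherwise idle on-device memory.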

Author Information

Guanhua Wang (UC Berkeley)

I am a Ph.D. student in the AMPLab / RISELab, at UC Berkeley, advised by Prof. Ion Stoica.

Kehan Wang (University of California, Berkeley)
Kenan Jiang (University of California, Berkeley)
Ion Stoica (UC Berkeley)
