To accelerate CNN inference, existing deep learning frameworks focus on optimizing intra-operator parallelization. However, given the rapid advances in high-performance hardware, a single operator can no longer fully utilize the available parallelism, resulting in a large gap between peak and achieved performance. This gap is more severe at smaller batch sizes. In this work, we extensively study the parallelism between operators and propose the Inter-Operator Scheduler (IOS), which automatically schedules the parallel execution of multiple operators through a novel dynamic programming algorithm. IOS consistently outperforms state-of-the-art libraries (e.g., TensorRT) by 1.1x to 1.5x on modern CNN benchmarks. The code to reproduce each experiment is available at: https://github.com/mit-han-lab/inter-operator-scheduler.
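As a rough illustration of scheduling operators with dynamic programming, the sketch below partitions a tiny operator graph into stages whose operators run concurrently, and picks the partition with the lowest total latency. It is a toy under stated assumptions, not the IOS implementation: the graph (`DEPS`), the latencies (`LATENCY`), and the cost model (`stage_latency`) are all hypothetical, and the search enumerates subsets exhaustively, so it is only practical for very small graphs.

```python
from functools import lru_cache
from itertools import combinations

# Hypothetical toy graph: operator -> operators it directly depends on.
DEPS = {
    "conv_a": set(),
    "conv_b": set(),
    "conv_c": {"conv_a"},
    "concat": {"conv_b", "conv_c"},
}

# Hypothetical latency (ms) of each operator when it runs alone.
LATENCY = {"conv_a": 1.0, "conv_b": 1.0, "conv_c": 1.0, "concat": 0.5}


def stage_latency(stage):
    """Assumed cost model: concurrent operators overlap, but not perfectly."""
    times = [LATENCY[op] for op in stage]
    return max(times) + 0.2 * (sum(times) - max(times))


def best_schedule(ops):
    """Return (total latency, stages) minimizing latency under the toy model."""

    @lru_cache(maxsize=None)
    def solve(remaining):
        # `remaining` is the set of operators still to be placed in a stage.
        if not remaining:
            return 0.0, ()
        best = (float("inf"), ())
        members = sorted(remaining)
        for size in range(1, len(members) + 1):
            for combo in combinations(members, size):
                stage = frozenset(combo)
                # The stage runs last among `remaining`, so no remaining
                # operator (including stage members) may depend on it.
                if any(DEPS[op] & stage for op in remaining):
                    continue
                cost, stages = solve(remaining - stage)
                total = cost + stage_latency(stage)
                if total < best[0]:
                    best = (total, stages + (tuple(combo),))
        return best

    return solve(frozenset(ops))


cost, stages = best_schedule(DEPS)
print(f"total latency: {cost:.2f} ms")
for index, stage in enumerate(stages):
    print(f"stage {index}: {', '.join(stage)}")
```

Under these made-up numbers, the schedule found is faster than running the four operators one after another, because two independent convolutions share a stage. A real scheduler would need measured stage latencies rather than the fixed overlap factor assumed here.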
Author Information
Yaoyao Ding (University of Toronto)
Ligeng Zhu (MIT)
Zhihao Jia (Facebook)
Gennady Pekhimenko (University of Toronto)
Song Han (MIT)
Related Events (a corresponding poster, oral, or spotlight)
- 2021 Poster: IOS: Inter-Operator Scheduler for CNN Acceleration
  08 Apr 12:00 AM Room Virtual
More from the Same Authors
- 2022 Poster: TorchSparse: Efficient Point Cloud Inference Engine
  Haotian Tang · Zhijian Liu · Xiuyu Li · Yujun Lin · Song Han
- 2022 Poster: DietCode: Automatic Optimization for Dynamic Tensor Programs
  Bojian Zheng · Ziheng Jiang · Cody Hao Yu · Haichen Shen · Joshua Fromm · Yizhi Liu · Yida Wang · Luis Ceze · Tianqi Chen · Gennady Pekhimenko
- 2023 Poster: Hotline Profiler: Automatic Annotation and A Multi-Scale Timeline for Visualizing Time-Use in DNN Training
  Daniel Snider · Fanny Chevalier · Gennady Pekhimenko
- 2022 Symposium: Chips & Compilers
  Yida Wang · Gennady Pekhimenko
- 2022 Oral: TorchSparse: Efficient Point Cloud Inference Engine
  Haotian Tang · Zhijian Liu · Xiuyu Li · Yujun Lin · Song Han
- 2022 Oral: DietCode: Automatic Optimization for Dynamic Tensor Programs
  Bojian Zheng · Ziheng Jiang · Cody Hao Yu · Haichen Shen · Joshua Fromm · Yizhi Liu · Yida Wang · Luis Ceze · Tianqi Chen · Gennady Pekhimenko
- 2021: Industry/Academia Panel
  Zachary C Lipton · Udit Gupta · Lillian Pentecost · Shagun Sodhani · Abhishek Gupta · Mayoore Jaiswal · Michael Carbin · Devi Parikh · Gennady Pekhimenko
- 2021: "Machine Learning Tools: Skyline and RL-Scope" - Gennady Pekhimenko and James Gleeson (University of Toronto)
  Gennady Pekhimenko
- 2021 Poster: Horizontally Fused Training Array: An Effective Hardware Utilization Squeezer for Training Novel Deep Learning Models
  Shang Wang · Peiming Yang · Yuxuan Zheng · Xin Li · Gennady Pekhimenko
- 2021 Poster: Boveda: Building an On-Chip Deep Learning Memory Hierarchy Brick by Brick
  Isak Edo Vivancos · Sayeh Sharify · Daniel Ly-Ma · Ameer Abdelhadi · Ciaran Bannon · Milos Nikolic · Mostafa Mahmoud · Alberto Delmas Lascorz · Gennady Pekhimenko · Andreas Moshovos
- 2021 Oral: Horizontally Fused Training Array: An Effective Hardware Utilization Squeezer for Training Novel Deep Learning Models
  Shang Wang · Peiming Yang · Yuxuan Zheng · Xin Li · Gennady Pekhimenko
- 2021 Oral: Boveda: Building an On-Chip Deep Learning Memory Hierarchy Brick by Brick
  Isak Edo Vivancos · Sayeh Sharify · Daniel Ly-Ma · Ameer Abdelhadi · Ciaran Bannon · Milos Nikolic · Mostafa Mahmoud · Alberto Delmas Lascorz · Gennady Pekhimenko · Andreas Moshovos
- 2021 Poster: RL-Scope: Cross-stack Profiling for Deep Reinforcement Learning Workloads
  James Gleeson · Srivatsan Krishnan · Moshe Gabel · Vijay Janapa Reddi · Eyal de Lara · Gennady Pekhimenko
- 2021 Oral: RL-Scope: Cross-stack Profiling for Deep Reinforcement Learning Workloads
  James Gleeson · Srivatsan Krishnan · Moshe Gabel · Vijay Janapa Reddi · Eyal de Lara · Gennady Pekhimenko
- 2020 Oral: MLPerf Training Benchmark
  Peter Mattson · Christine Cheng · Gregory Diamos · Cody Coleman · Paulius Micikevicius · David Patterson · Hanlin Tang · Gu-Yeon Wei · Peter Bailis · Victor Bittorf · David Brooks · Dehao Chen · Debo Dutta · Udit Gupta · Kim Hazelwood · Andy Hock · Xinyuan Huang · Daniel Kang · David Kanter · Naveen Kumar · Jeffery Liao · Deepak Narayanan · Tayo Oguntebi · Gennady Pekhimenko · Lillian Pentecost · Vijay Janapa Reddi · Taylor Robie · Tom St John · Carole-Jean Wu · Lingjie Xu · Cliff Young · Matei Zaharia
- 2020 Oral: Improving the Accuracy, Scalability, and Performance of Graph Neural Networks with Roc
  Zhihao Jia · Sina Lin · Mingyu Gao · Matei Zaharia · Alex Aiken
- 2020 Poster: Improving the Accuracy, Scalability, and Performance of Graph Neural Networks with Roc
  Zhihao Jia · Sina Lin · Mingyu Gao · Matei Zaharia · Alex Aiken
- 2020 Poster: MLPerf Training Benchmark
  Peter Mattson · Christine Cheng · Gregory Diamos · Cody Coleman · Paulius Micikevicius · David Patterson · Hanlin Tang · Gu-Yeon Wei · Peter Bailis · Victor Bittorf · David Brooks · Dehao Chen · Debo Dutta · Udit Gupta · Kim Hazelwood · Andy Hock · Xinyuan Huang · Daniel Kang · David Kanter · Naveen Kumar · Jeffery Liao · Deepak Narayanan · Tayo Oguntebi · Gennady Pekhimenko · Lillian Pentecost · Vijay Janapa Reddi · Taylor Robie · Tom St John · Carole-Jean Wu · Lingjie Xu · Cliff Young · Matei Zaharia
- 2020 Poster: BPPSA: Scaling Back-propagation by Parallel Scan Algorithm
  Shang Wang · Yifan Bai · Gennady Pekhimenko
- 2020 Demonstration: Skyline: Interactive In-editor Performance Visualizations and Debugging for DNN Training
  Geoffrey Yu · Tovi Grossman · Gennady Pekhimenko
- 2020 Oral: BPPSA: Scaling Back-propagation by Parallel Scan Algorithm
  Shang Wang · Yifan Bai · Gennady Pekhimenko