Remarks
Tue 8:00 Opening Remarks
Alex Dimakis, Ion Stoica, Alexander Smola
Invited Talk
Tue 8:20 Directions for Deep Learning Hardware
William Dally
Oral
Tue 11:10 Cortex: A Compiler for Recursive Deep Learning Models
Pratik Fegade, Tianqi Chen, Phillip Gibbons, Todd Mowry
Oral
Tue 11:30 A Deep Learning Based Cost Model for Automatic Code Optimization
Riyadh Baghdadi, Massinissa Merouani, Mohamed-Hicham Leghettas, Kamel Abdous, Taha Arbaoui, Karima Benatchba, Saman Amarasinghe
Oral
Tue 13:30 Pufferfish: Communication-efficient Models At No Extra Cost
Hongyi Wang, Saurabh Agarwal, Dimitrios Papailiopoulos
Oral
Tue 13:50 In-network Aggregation for Shared Machine Learning Clusters
Nadeen Gebara, Manya Ghobadi, Paolo Costa
Oral
Tue 14:10 Data Movement Is All You Need: A Case Study on Optimizing Transformers
Andrei Ivanov, Nikoli Dryden, Tal Ben-Nun, Shigang Li, Torsten Hoefler
Oral
Tue 14:30 Learning on Distributed Traces for Data Center Storage Systems
Giulio Zhou, Martin Maas
Oral
Tue 16:40 TT-Rec: Tensor Train Compression for Deep Learning Recommendation Models
Chunxing Yin, Bilge Acun, Carole-Jean Wu, Xing Liu
Poster
Tue 17:00 TT-Rec: Tensor Train Compression for Deep Learning Recommendation Models
Chunxing Yin, Bilge Acun, Carole-Jean Wu, Xing Liu
Poster
Tue 17:00 Cortex: A Compiler for Recursive Deep Learning Models
Pratik Fegade, Tianqi Chen, Phillip Gibbons, Todd Mowry
Poster
Tue 17:00 Learning on Distributed Traces for Data Center Storage Systems
Giulio Zhou, Martin Maas
Poster
Tue 17:00 In-network Aggregation for Shared Machine Learning Clusters
Nadeen Gebara, Manya Ghobadi, Paolo Costa
Poster
Tue 17:00 Pufferfish: Communication-efficient Models At No Extra Cost
Hongyi Wang, Saurabh Agarwal, Dimitrios Papailiopoulos
Poster
Tue 17:00 Data Movement Is All You Need: A Case Study on Optimizing Transformers
Andrei Ivanov, Nikoli Dryden, Tal Ben-Nun, Shigang Li, Torsten Hoefler
Poster
Tue 17:00 A Deep Learning Based Cost Model for Automatic Code Optimization
Riyadh Baghdadi, Massinissa Merouani, Mohamed-Hicham Leghettas, Kamel Abdous, Taha Arbaoui, Karima Benatchba, Saman Amarasinghe
Oral
Wed 9:50 Don't Forget to Sign the Gradients!
Omid Aramoon, Pin-Yu Chen, Gang Qu
Oral
Wed 11:30 A Learned Performance Model for Tensor Processing Units
Sam Kaufman, Mangpo Phothilimthana, Yanqi Zhou, Charith Mendis, Sudip Roy, Amit Sabne, Mike Burrows
Oral
Wed 13:30 IOS: Inter-Operator Scheduler for CNN Acceleration
Yaoyao Ding, Ligeng Zhu, Zhihao Jia, Gennady Pekhimenko, Song Han
Oral
Wed 13:50 Value Learning for Throughput Optimization of Deep Learning Workloads
Benoit Steiner, Chris Cummins, Horace He, Hugh Leather
Oral
Wed 15:20 Nimble: Efficiently Compiling Dynamic Neural Networks for Model Inference
Haichen Shen, Jared Roesch, Zhi Chen, Wei Chen, Yong Wu, Mu Li, Vin Sharma, Zachary Tatlock, Yida Wang
Oral
Wed 16:00 VS-Quant: Per-vector Scaled Quantization for Accurate Low-Precision Neural Network Inference
Steve Dai, Rangha Venkatesan, Mark Ren, Brian Zimmer, William Dally, Brucek Khailany
Poster
Wed 17:00 Value Learning for Throughput Optimization of Deep Learning Workloads
Benoit Steiner, Chris Cummins, Horace He, Hugh Leather
Poster
Wed 17:00 IOS: Inter-Operator Scheduler for CNN Acceleration
Yaoyao Ding, Ligeng Zhu, Zhihao Jia, Gennady Pekhimenko, Song Han
Poster
Wed 17:00 A Learned Performance Model for Tensor Processing Units
Sam Kaufman, Mangpo Phothilimthana, Yanqi Zhou, Charith Mendis, Sudip Roy, Amit Sabne, Mike Burrows
Poster
Wed 17:00 Nimble: Efficiently Compiling Dynamic Neural Networks for Model Inference
Haichen Shen, Jared Roesch, Zhi Chen, Wei Chen, Yong Wu, Mu Li, Vin Sharma, Zachary Tatlock, Yida Wang
Poster
Wed 17:00 Don't Forget to Sign the Gradients!
Omid Aramoon, Pin-Yu Chen, Gang Qu
Poster
Wed 17:00 VS-Quant: Per-vector Scaled Quantization for Accurate Low-Precision Neural Network Inference
Steve Dai, Rangha Venkatesan, Mark Ren, Brian Zimmer, William Dally, Brucek Khailany
Oral
Thu 9:10 Boveda: Building an On-Chip Deep Learning Memory Hierarchy Brick by Brick
Isak Edo Vivancos, Sayeh Sharify, Daniel Ly-Ma, Ameer Abdelhadi, Ciaran Bannon, Milos Nikolic, Mostafa Mahmoud, Alberto Delmas Lascorz, Gennady Pekhimenko, Andreas Moshovos
Oral
Thu 9:30 Horizontally Fused Training Array: An Effective Hardware Utilization Squeezer for Training Novel Deep Learning Models
Shang Wang, Peiming Yang, Yuxuan Zheng, Xin Li, Gennady Pekhimenko
Oral
Thu 10:10 Accelerating SLIDE Deep Learning on Modern CPUs: Vectorization, Quantizations, Memory Optimizations, and More
Shabnam Daghaghi, Nicholas Meisburger, Mengnan Zhao, Anshumali Shrivastava
Oral
Thu 11:30 Lost in Pruning: The Effects of Pruning Neural Networks beyond Test Accuracy
Lucas Liebenwein, Cenk Baykal, Brandon Carter, David Gifford, Daniela Rus
Oral
Thu 11:50 Equality Saturation for Tensor Graph Superoptimization
Yichen Yang, Mangpo Phothilimthana, Yisu Wang, Max Willsey, Sudip Roy, Jacques Pienaar
Oral
Thu 13:30 Swift for TensorFlow: A portable, flexible platform for deep learning
Brennan Saeta, Denys Shabalin
Oral
Thu 15:20 Towards Scalable Distributed Training of Deep Learning on Public Cloud Clusters
Shaohuai Shi, Xianhao Zhou, Shutao Song, Xingyao Wang, Zilin Zhu, Xue Huang, Xinan Jiang, Feihu Zhou, Zhenyu Guo, Liqiang Xie, Rui Lan, Xianbin Ouyang, Yan Zhang, Jieqian Wei, Jing Gong, Weiliang Lin, Ping Gao, Peng Meng, Xiaomin Xu, Chenyang Guo, Bo Yang, Zhibo Chen, Yongjian Wu, Xiaowen Chu
Oral
Thu 15:40 Understanding and Improving Failure Tolerant Training for Deep Learning Recommendation with Partial Recovery
Kiwan Maeng, Shivam Bharuka, Isabel Gao, Mark Jeffrey, Vikram Saraph, Bor-Yiing Su, Caroline Trippel, Jiyan Yang, Mike Rabbat, Brandon Lucia, Carole-Jean Wu
Poster
Thu 17:00 Swift for TensorFlow: A portable, flexible platform for deep learning
Brennan Saeta, Denys Shabalin
Poster
Thu 17:00 Lost in Pruning: The Effects of Pruning Neural Networks beyond Test Accuracy
Lucas Liebenwein, Cenk Baykal, Brandon Carter, David Gifford, Daniela Rus
Poster
Thu 17:00 Boveda: Building an On-Chip Deep Learning Memory Hierarchy Brick by Brick
Isak Edo Vivancos, Sayeh Sharify, Daniel Ly-Ma, Ameer Abdelhadi, Ciaran Bannon, Milos Nikolic, Mostafa Mahmoud, Alberto Delmas Lascorz, Gennady Pekhimenko, Andreas Moshovos
Poster
Thu 17:00 Accelerating SLIDE Deep Learning on Modern CPUs: Vectorization, Quantizations, Memory Optimizations, and More
Shabnam Daghaghi, Nicholas Meisburger, Mengnan Zhao, Anshumali Shrivastava
Poster
Thu 17:00 Towards Scalable Distributed Training of Deep Learning on Public Cloud Clusters
Shaohuai Shi, Xianhao Zhou, Shutao Song, Xingyao Wang, Zilin Zhu, Xue Huang, Xinan Jiang, Feihu Zhou, Zhenyu Guo, Liqiang Xie, Rui Lan, Xianbin Ouyang, Yan Zhang, Jieqian Wei, Jing Gong, Weiliang Lin, Ping Gao, Peng Meng, Xiaomin Xu, Chenyang Guo, Bo Yang, Zhibo Chen, Yongjian Wu, Xiaowen Chu
Poster
Thu 17:00 Equality Saturation for Tensor Graph Superoptimization
Yichen Yang, Mangpo Phothilimthana, Yisu Wang, Max Willsey, Sudip Roy, Jacques Pienaar
Poster
Thu 17:00 Horizontally Fused Training Array: An Effective Hardware Utilization Squeezer for Training Novel Deep Learning Models
Shang Wang, Peiming Yang, Yuxuan Zheng, Xin Li, Gennady Pekhimenko
Poster
Thu 17:00 Understanding and Improving Failure Tolerant Training for Deep Learning Recommendation with Partial Recovery
Kiwan Maeng, Shivam Bharuka, Isabel Gao, Mark Jeffrey, Vikram Saraph, Bor-Yiing Su, Caroline Trippel, Jiyan Yang, Mike Rabbat, Brandon Lucia, Carole-Jean Wu
Workshop
Fri 6:15 Personalized Recommendation Systems and Algorithms
Udit Gupta, Carole-Jean Wu, Gu-Yeon Wei, David Brooks
Workshop
Fri 7:15 Putting AI on Diet: TinyML and Efficient Deep Learning (Song Han, MIT)
Workshop
Fri 8:00 Timothy Chou, "Pediatric Cloud: A moon shot project to connect all 1,000,000 healthcare machines in all the children’s hospitals in the world and enable AI/ML to change children’s healthcare"
Tim Chou
Workshop
Fri 8:15 Optimizing Deep Learning Recommender Systems Training on CPU Cluster Architectures
Dhiraj Kalamkar
Workshop
Fri 8:30 Efficient ML on the Edge with Apache TVM (Thierry Moreau, OctoML)
Thierry Moreau
Workshop
Fri 8:30 Main-Memory Acceleration for Bandwidth-Bound Deep Learning Inference
Benjamin Cho, Mattan Erez
Workshop
Fri 9:00 "Designing and Optimizing AI Systems for Deep Learning Recommendation and Beyond" - Carole-Jean Wu (Facebook)
Carole-Jean Wu
Workshop
Fri 9:45 Towards Real-Time 3D Object Detection with Pruning Search on Edge Devices (Pu Zhao, Northeastern University)
Pu Zhao
Workshop
Fri 10:30 Poster session
Workshop
Fri 11:00 Revisiting Recommender Systems on the GPU
Even Oldridge
Workshop
Fri 12:30 Pushing the Limits of Recommender Training Speed: An MLPerf Experience
Tayo Oguntebi
Workshop
Fri 12:30 Keynote Talk: GNNs for Charged Particle Reconstruction at the Large Hadron Collider by Savannah Thais (Princeton)
Workshop
Fri 13:30 Applying Maximal Coding Rate Reduction to Text classification
Yuxin Liang
Workshop
Fri 13:45 Scalability, Latency, Flexibility: The Case for Similarity Search as a Service
Amir Sadoughi
Workshop
Fri 13:45 Deploying Deep Learning Applications on FPGA: Experiences and Learnings
Ashwin Krishnan, Shagun Sodhani
Workshop
Fri 14:00 Capacity-Driven Scale-Out Neural Recommendation: Enabling the Growing Scale of Recommendation
Mike Lui
Workshop
Fri 14:10 Keynote Talk: Efficient GNNs: How Can Graphs Go From Last To Fast? by Nicholas Lane (Cambridge)