MLSys 2020 Accepted Papers (34)

If you are an author on a paper here and your institution is missing, you should immediately update your CMT profile and the corresponding profile at

Please read this guide on how to design for color blindness for tips to make your poster friendlier to the color-blind.

MNN: A Universal and Efficient Inference Engine
Xiaotang Jiang (Alibaba) · Huan Wang (Northeastern University) · Yiliu Chen (Alibaba Group) · Ziqi Wu (Alibaba Group) · Lichuan Wang (Alibaba Group) · Bin Zou (Alibaba) · Yafeng Yang (Alibaba Group) · Zongyang Cui (Alibaba Group) · Yu Cai (Alibaba Group) · Tianhang Yu (Alibaba Group) · Chengfei Lyu (Alibaba Group) · Zhihua Wu (Alibaba)

Searching for Winograd-aware Quantized Networks
Javier Fernandez-Marques (University of Oxford) · Paul Whatmough (Arm ML Research Lab) · Andrew Mundy (Arm ML Research Lab) · Matthew Mattina (Arm ML Research Lab)

A Systematic Methodology for Analysis of Deep Learning Hardware and Software Platforms
Yu Wang (Harvard University) · Gu-Yeon Wei (Harvard University) · David Brooks (Harvard University)

Ordering Chaos: Memory-Aware Scheduling of Irregularly Wired Neural Networks for Edge Devices
Byung Hoon Ahn (UC San Diego) · Jinwon Lee (Qualcomm AI Research) · Jamie Menjay Lin (Qualcomm AI Research) · Hsin-Pai Cheng (Duke University) · Jilei Hou (Qualcomm AI Research) · Hadi Esmaeilzadeh (University of California, San Diego)

Sense & Sensitivities: The Path to General-Purpose Algorithmic Differentiation
Mike Innes (Julia Computing)

AutoPhase: Juggling HLS Phase Orderings in Random Forests with Deep Reinforcement Learning
Ameer Haj-Ali (UC Berkeley) · Qijing (Jenny) Huang (Berkeley) · John Xiang (UC Berkeley) · William Moses (MIT) · Krste Asanovic (UC Berkeley) · John Wawrzynek (UC Berkeley) · Ion Stoica (UC Berkeley)

PLink: Discovering and Exploiting Locality for Accelerated Distributed Training on the public Cloud
Liang Luo (University of Washington) · Peter West (University of Washington) · Jacob Nelson (Microsoft Research) · Arvind Krishnamurthy (University of Washington) · Luis Ceze (University of Washington and OctoML)

Fine-Grained GPU Sharing Primitives for Deep Learning Applications
Peifeng Yu (University of Michigan) · Mosharaf Chowdhury (University of Michigan, Ann Arbor)

Trained Quantization Thresholds for Accurate and Efficient Fixed-Point Inference of Deep Neural Networks
Sambhav R. Jain (Xilinx / Stanford) · Albert Gural (Stanford University) · Michael Wu (Xilinx, Inc.) · Chris Dick (Xilinx, Inc.)

What is the State of Neural Network Pruning?
Davis Blalock (MIT) · Jose Javier Gonzalez Ortiz (MIT) · Jonathan Frankle (MIT) · John Guttag (MIT)

Willump: A Statistically-Aware End-to-end Optimizer for Machine Learning Inference
Peter Kraft (Stanford University) · Daniel Kang (Stanford University) · Deepak Narayanan (Stanford) · Shoumik Palkar (Stanford) · Peter Bailis (Stanford University) · Matei Zaharia (Stanford and Databricks)

PoET-BiN: Power Efficient Tiny Binary Neurons
Sivakumar Chidambaram (Polytechnique Montreal) · Pierre Langlois (Polytechnique Montreal) · Jean-Pierre David (Polytechnique Montreal)

Blink: Fast and Generic Collectives for Distributed ML
Guanhua Wang (UC Berkeley) · Shivaram Venkataraman (University of Wisconsin, Madison) · Amar Phanishayee (Microsoft Research) · Nikhil Devanur (Microsoft) · Jorgen Thelin (Microsoft Research) · Ion Stoica (UC Berkeley)

Improving the Accuracy, Scalability, and Performance of Graph Neural Networks with Roc
Zhihao Jia (Stanford University) · Sina Lin (Microsoft) · Mingyu Gao (Tsinghua University) · Matei Zaharia (Stanford and Databricks) · Alex Aiken (Stanford University)

MotherNets: Rapid Deep Ensemble Learning
Abdul Wasay (Harvard University) · Brian Hentschel (Harvard University) · Yuze Liao (Harvard University) · Sanyuan Chen (Harvard) · Stratos Idreos (Harvard)

SkyNet: a Hardware-Efficient Method for Object Detection and Tracking on Embedded Systems
Xiaofan Zhang (University of Illinois at Urbana and Champaign) · Haoming Lu (University of Illinois at Urbana and Champaign) · Cong Hao (University of Illinois at Urbana-Champaign) · Jiachen Li (UIUC) · Bowen Cheng (UIUC) · Yuhong Li (University of Illinois at Urbana and Champaign) · Kyle Rupnow (Inspirit IoT, Inc.) · Jinjun Xiong (IBM Thomas J. Watson Research Center) · Thomas Huang (UIUC) · Honghui Shi (IBM | UIUC | Oregon) · Wen-Mei Hwu (University of Illinois at Urbana-Champaign) · Deming Chen (University of Illinois at Urbana-Champaign)

A System for Massively Parallel Hyperparameter Tuning
Liam Li (Carnegie Mellon University) · Kevin Jamieson (U Washington) · Afshin Rostamizadeh (Google Research) · Ekaterina Gonina (Google) · Jonathan Ben-tzur (Determined AI) · Moritz Hardt (UC Berkeley) · Benjamin Recht (UC Berkeley) · Ameet Talwalkar (CMU)

FLEET: Flexible Efficient Ensemble Training for Heterogeneous Deep Neural Networks
Hui Guan (North Carolina State University) · Laxmikant Kishor Mokadam (North Carolina State University) · Xipeng Shen (North Carolina State University) · Seung-Hwan Lim (Oak Ridge National Laboratory) · Robert Patton (Rensselaer Polytechnic Institute, Oak Ridge National Laboratory)

Understanding the Downstream Instability of Word Embeddings
Megan Leszczynski (Stanford University) · Avner May (Stanford University) · Jian Zhang (Stanford University) · Sen Wu (Stanford University) · Christopher Aberger (SambaNova Systems and Stanford University) · Christopher Re (Stanford University)

SLIDE: Training Deep Neural Networks with Large Outputs on a CPU faster than a V100-GPU
Beidi Chen (Rice University) · Tharun Medini (Rice University) · James Farwell (Intel Corporation) · Sameh Gobriel · Charlie Tai (Intel Corporation) · Anshumali Shrivastava (Rice University)

Attention-based Learning for Missing Data Imputation in HoloClean
Richard Wu (University of Waterloo) · Aoqian Zhang (University of Waterloo) · Ihab Ilyas (U. of Waterloo) · Theodoros Rekatsinas (University of Wisconsin-Madison)

Memory-Driven Mixed Low Precision Quantization for Enabling Deep Network Inference on Microcontrollers
Manuele Rusci (Università di Bologna) · Alessandro Capotondi (Università di Modena e Reggio Emilia) · Luca Benini (ETHZ)

MLPerf Training Benchmark
Peter Mattson (Google) · Christine Cheng (Intel) · Gregory Diamos (Baidu) · Cody Coleman (Stanford) · Paulius Micikevicius (NVIDIA) · David Patterson (Google) · Hanlin Tang (Intel Corporation) · Gu-Yeon Wei · Peter Bailis (Stanford University) · Victor Bittorf (Google) · David Brooks (Harvard University) · Dehao Chen (Google) · Debo Dutta (Cisco Systems, Inc.) · Udit Gupta (Harvard University) · Kim Hazelwood (Facebook AI) · Andy Hock (Cerebras Systems) · Xinyuan Huang (Cisco Systems, Inc.) · Daniel Kang (Stanford University) · David Kanter (RWI) · Naveen Kumar (Google) · Jeffery Liao (Synopsys) · Deepak Narayanan (Stanford) · Tayo Oguntebi (Google LLC) · Gennady Pekhimenko (University of Toronto) · Lillian Pentecost (Harvard University) · Vijay Janapa Reddi (Harvard University) · Taylor Robie (Google) · Tom St John (Tesla) · Carole-Jean Wu (Facebook AI) · Lingjie Xu (Alibaba) · Cliff Young · Matei Zaharia (Stanford and Databricks)

Privacy-Preserving Bandits
Mohammad Malekzadeh (Queen Mary University of London) · Dimitrios Athanasakis (Brave Software) · Hamed Haddadi (Brave Software) · Ben Livshits (Brave Software)

OPTIMUS: OPTImized matrix MUltiplication Structure for Transformer neural network accelerator
Junki Park (POSTECH) · Hyunsung Yoon (POSTECH) · Daehyun Ahn (POSTECH) · Jungwook Choi (Hanyang University) · Jae-Joon Kim (POSTECH)

Riptide: Fast End-to-End Binarized Neural Networks
Joshua Fromm (University of Washington) · Meghan Cowan (University of Washington) · Matthai Philipose (Microsoft Research) · Luis Ceze (University of Washington and OctoML) · Shwetak Patel (University of Washington)

Automatically batching control-intensive programs for modern accelerators
Alexey Radul (Google) · Brian Patton (Google Inc.) · Dougal Maclaurin (Google Inc.) · Matthew Hoffman (Google) · Rif A. Saurous (Google)

Resource Elasticity in Distributed Deep Learning
Andrew Or (Princeton University) · Haoyu Zhang (Google AI) · Michael Freedman (Princeton University)

Distributed Hierarchical GPU Parameter Server for Massive Scale Deep Learning Ads Systems
Weijie Zhao (Baidu Research) · Deping Xie (Baidu) · Ronglai Jia (Baidu) · Yulei Qian (Baidu) · Ruiquan Ding (Baidu) · Mingming Sun (Baidu Research) · Ping Li (Baidu Research)

Federated Optimization in Heterogeneous Networks
Tian Li (Carnegie Mellon University) · Anit Kumar Sahu (Bosch Center for Artificial Intelligence) · Manzil Zaheer (Google) · Maziar Sanjabi (USC) · Ameet Talwalkar (CMU) · Virginia Smith (Carnegie Mellon University)

BPPSA: Scaling Back-propagation by Parallel Scan Algorithm
Shang Wang (University of Toronto) · Yifan Bai (University of California, Berkeley) · Gennady Pekhimenko (University of Toronto)

Predictive Precompute with Recurrent Neural Networks
Hanson Wang (Facebook) · Zehui Wang (Facebook) · Yuanyuan Ma (Facebook)

Model Assertions for Monitoring and Improving ML Models
Daniel Kang (Stanford University) · Deepti Raghavan (Stanford University) · Peter Bailis (Stanford University) · Matei Zaharia (Stanford and Databricks)

Breaking the Memory Wall with Optimal Tensor Rematerialization
Paras Jain (UC Berkeley) · Ajay Jain (UC Berkeley) · Aniruddha Nrusimha (UC Berkeley) · Amir Gholami (UC Berkeley) · Pieter Abbeel (UC Berkeley) · Joseph Gonzalez (UC Berkeley) · Kurt Keutzer (EECS, UC Berkeley) · Ion Stoica (UC Berkeley)