MLSys 2021 Tuesday 04/6

Timezone: US/Pacific

Remarks

Opening Remarks

Alex Dimakis ⋅ Ion Stoica ⋅ Alexander Smola

8:00 AM - 8:15 AM

After the opening remarks, please join William Dally's invited talk, Directions for Deep Learning Hardware.

Invited Talk

Directions for Deep Learning Hardware

William Dally

8:20 AM - 9:10 AM

Deep learning has been enabled by powerful hardware, and its progress is gated by improvements in hardware performance. This talk will review the current state of deep learning hardware and explore a number of directions to continue performance scaling in the absence of Moore’s Law. Topics discussed will include number representation, sparsity, memory organization, optimized circuits, and analog computation.

Speaker Bio

Bill is Chief Scientist and Senior Vice President of Research at NVIDIA Corporation and an Adjunct Professor and former chair of Computer Science at Stanford University. Bill is currently working on developing hardware and software to accelerate demanding applications including machine learning, bioinformatics, and logical inference. He has a history of designing innovative and efficient experimental computing systems. While at Bell Labs, Bill contributed to the BELLMAC32 microprocessor and designed the MARS hardware accelerator. At Caltech, he designed the MOSSIM Simulation Engine and the Torus Routing Chip, which pioneered wormhole routing and virtual-channel flow control. At the Massachusetts Institute of Technology, his group built the J-Machine and the M-Machine, experimental parallel computer systems that pioneered the separation of mechanisms from programming models and demonstrated very low overhead synchronization and communication mechanisms. At Stanford University, his group developed the Imagine processor, which introduced the concepts of stream processing and partitioned register organizations; the Merrimac supercomputer, which led to GPU computing; and the ELM low-power processor. Bill is a Member of the National Academy of Engineering, a Fellow of the IEEE, a Fellow of the ACM, and a Fellow of the American Academy of Arts and Sciences. He has received the ACM Eckert-Mauchly Award, the IEEE Seymour Cray Award, the ACM Maurice Wilkes Award, the IEEE-CS Charles Babbage Award, the IPSJ FUNAI Achievement Award, the Caltech Distinguished Alumni Award, and the Stanford Tau-Beta-Pi Teaching Award. He currently leads projects on computer architecture, network architecture, circuit design, and programming systems. He has published over 250 papers in these areas, holds over 160 issued patents, and is an author of the textbooks Digital Design: A Systems Approach, Digital Systems Engineering, and Principles and Practices of Interconnection Networks.

Oral

Session 1: Search and Devices

9:30 AM - 10:50 AM

4 Events in this session

ModularNAS: Towards Modularized and Reusable Neural Architecture Search

Yunfeng Lin ⋅ Guilin Li ⋅ Xing Zhang ⋅ Weinan Zhang ⋅ Bo Chen ⋅ Ruiming Tang ⋅ Zhenguo Li ⋅ Jiashi Feng ⋅ Yong Yu

Fluid: Resource-aware Hyperparameter Tuning Engine

Peifeng Yu ⋅ Jiachen Liu ⋅ Mosharaf Chowdhury

MicroNets: Neural Network Architectures for Deploying TinyML Applications on Commodity Microcontrollers

Colby Banbury ⋅ Chuteng Zhou ⋅ Igor Fedorov ⋅ Ramon Matas ⋅ Urmish Thakker ⋅ Dibakar Gope ⋅ Vijay Janapa Reddi ⋅ Matthew Mattina ⋅ Paul Whatmough

Characterizing and Taming Model Instability Across Edge Devices

Eyal Cidon ⋅ Evgenya Pergament ⋅ Zain Asgar ⋅ Asaf Cidon ⋅ Sachin Katti

Oral

Session 2: Compilers

11:10 AM - 12:30 PM

4 Events in this session

Cortex: A Compiler for Recursive Deep Learning Models

Pratik Fegade ⋅ Tianqi Chen ⋅ Phillip Gibbons ⋅ Todd Mowry

A Deep Learning Based Cost Model for Automatic Code Optimization

Riyadh Baghdadi ⋅ Massinissa Merouani ⋅ Mohamed-Hicham LEGHETTAS ⋅ Kamel Abdous ⋅ Taha Arbaoui ⋅ Karima BENATCHBA ⋅ Saman Amarasinghe

Learning Fitness Functions for Machine Programming

Shantanu Mandal ⋅ Todd Anderson ⋅ Javier Turek ⋅ Justin Gottschlich ⋅ Shengtian Zhou ⋅ Abdullah Muzahid

CODE: Compiler-based Neuron-aware Ensemble training

Ettore M. G. Trainiti ⋅ Thanapon Noraset ⋅ David Demeter ⋅ Doug Downey ⋅ Simone Campanoni

Oral

Session 3: Communication and Storage

1:30 PM - 2:50 PM

4 Events in this session

Pufferfish: Communication-efficient Models At No Extra Cost

Hongyi Wang ⋅ Saurabh Agarwal ⋅ Dimitris Papailiopoulos

In-network Aggregation for Shared Machine Learning Clusters

Nadeen Gebara ⋅ Manya Ghobadi ⋅ Paolo Costa

Data Movement Is All You Need: A Case Study on Optimizing Transformers

Andrei Ivanov ⋅ Nikoli Dryden ⋅ Tal Ben-Nun ⋅ Shigang Li ⋅ Torsten Hoefler

Learning on Distributed Traces for Data Center Storage Systems

Giulio Zhou ⋅ Martin Maas

Oral

Session 4: Training (I)

3:20 PM - 5:00 PM

5 Events in this session

TensorFlow Lite Micro: Embedded Machine Learning for TinyML Systems

Robert David ⋅ Jared Duke ⋅ Advait Jain ⋅ Vijay Janapa Reddi ⋅ Nat Jeffries ⋅ Jian Li ⋅ Nick Kreeger ⋅ Ian Nappier ⋅ Meghna Natraj ⋅ Tiezhen Wang ⋅ Pete Warden ⋅ Rocky Rhodes

Scaling Distributed Training with Adaptive Summation

Saeed Maleki ⋅ Madan Musuvathi ⋅ Todd Mytkowicz ⋅ Olli Saarikivi ⋅ Tianju Xu ⋅ Vadim Eksarevskiy ⋅ Jaliya Ekanayake ⋅ Emad Barsoum

PipeMare: Asynchronous Pipeline Parallel DNN Training

Bowen Yang ⋅ Jian Zhang ⋅ Jonathan Li ⋅ Christopher Re ⋅ Christopher Aberger ⋅ Christopher De Sa

Exploring the Limits of Concurrency in ML Training on Google TPUs

Sameer Kumar ⋅ Yu Wang ⋅ Cliff Young ⋅ James Bradbury ⋅ Naveen Kumar ⋅ Dehao Chen ⋅ Andy Swing

TT-Rec: Tensor Train Compression for Deep Learning Recommendation Models

Chunxing Yin ⋅ Bilge Acun ⋅ Carole-Jean Wu ⋅ Xing Liu

Poster

Poster Session 1

5:00 PM -

17 Events in this session

Cortex: A Compiler for Recursive Deep Learning Models

Pratik Fegade ⋅ Tianqi Chen ⋅ Phillip Gibbons ⋅ Todd Mowry

In-network Aggregation for Shared Machine Learning Clusters

Nadeen Gebara ⋅ Manya Ghobadi ⋅ Paolo Costa

MicroNets: Neural Network Architectures for Deploying TinyML Applications on Commodity Microcontrollers

Colby Banbury ⋅ Chuteng Zhou ⋅ Igor Fedorov ⋅ Ramon Matas ⋅ Urmish Thakker ⋅ Dibakar Gope ⋅ Vijay Janapa Reddi ⋅ Matthew Mattina ⋅ Paul Whatmough

TT-Rec: Tensor Train Compression for Deep Learning Recommendation Models

Chunxing Yin ⋅ Bilge Acun ⋅ Carole-Jean Wu ⋅ Xing Liu

A Deep Learning Based Cost Model for Automatic Code Optimization

Riyadh Baghdadi ⋅ Massinissa Merouani ⋅ Mohamed-Hicham LEGHETTAS ⋅ Kamel Abdous ⋅ Taha Arbaoui ⋅ Karima BENATCHBA ⋅ Saman Amarasinghe

Characterizing and Taming Model Instability Across Edge Devices

Eyal Cidon ⋅ Evgenya Pergament ⋅ Zain Asgar ⋅ Asaf Cidon ⋅ Sachin Katti

CODE: Compiler-based Neuron-aware Ensemble training

Ettore M. G. Trainiti ⋅ Thanapon Noraset ⋅ David Demeter ⋅ Doug Downey ⋅ Simone Campanoni

Data Movement Is All You Need: A Case Study on Optimizing Transformers

Andrei Ivanov ⋅ Nikoli Dryden ⋅ Tal Ben-Nun ⋅ Shigang Li ⋅ Torsten Hoefler

Exploring the Limits of Concurrency in ML Training on Google TPUs

Sameer Kumar ⋅ Yu Wang ⋅ Cliff Young ⋅ James Bradbury ⋅ Naveen Kumar ⋅ Dehao Chen ⋅ Andy Swing

Fluid: Resource-aware Hyperparameter Tuning Engine

Peifeng Yu ⋅ Jiachen Liu ⋅ Mosharaf Chowdhury

Learning Fitness Functions for Machine Programming

Shantanu Mandal ⋅ Todd Anderson ⋅ Javier Turek ⋅ Justin Gottschlich ⋅ Shengtian Zhou ⋅ Abdullah Muzahid

Learning on Distributed Traces for Data Center Storage Systems

Giulio Zhou ⋅ Martin Maas

ModularNAS: Towards Modularized and Reusable Neural Architecture Search

Yunfeng Lin ⋅ Guilin Li ⋅ Xing Zhang ⋅ Weinan Zhang ⋅ Bo Chen ⋅ Ruiming Tang ⋅ Zhenguo Li ⋅ Jiashi Feng ⋅ Yong Yu

PipeMare: Asynchronous Pipeline Parallel DNN Training

Bowen Yang ⋅ Jian Zhang ⋅ Jonathan Li ⋅ Christopher Re ⋅ Christopher Aberger ⋅ Christopher De Sa

Pufferfish: Communication-efficient Models At No Extra Cost

Hongyi Wang ⋅ Saurabh Agarwal ⋅ Dimitris Papailiopoulos

Scaling Distributed Training with Adaptive Summation

Saeed Maleki ⋅ Madan Musuvathi ⋅ Todd Mytkowicz ⋅ Olli Saarikivi ⋅ Tianju Xu ⋅ Vadim Eksarevskiy ⋅ Jaliya Ekanayake ⋅ Emad Barsoum

TensorFlow Lite Micro: Embedded Machine Learning for TinyML Systems

Robert David ⋅ Jared Duke ⋅ Advait Jain ⋅ Vijay Janapa Reddi ⋅ Nat Jeffries ⋅ Jian Li ⋅ Nick Kreeger ⋅ Ian Nappier ⋅ Meghna Natraj ⋅ Tiezhen Wang ⋅ Pete Warden ⋅ Rocky Rhodes
