Timezone: America/Los_Angeles
MON 5 APR
8:30 a.m.
TUE 6 APR
8 a.m.
8:20 a.m.
9:30 a.m.
Orals 9:30-10:50
[9:30] ModularNAS: Towards Modularized and Reusable Neural Architecture Search
[9:50] Fluid: Resource-aware Hyperparameter Tuning Engine
[10:10] MicroNets: Neural Network Architectures for Deploying TinyML Applications on Commodity Microcontrollers
[10:30] Characterizing and Taming Model Instability Across Edge Devices
(ends 10:50 AM)
11:10 a.m.
Orals 11:10-12:30
[11:10] Cortex: A Compiler for Recursive Deep Learning Models
[11:30] A Deep Learning Based Cost Model for Automatic Code Optimization
[11:50] Learning Fitness Functions for Machine Programming
[12:10] CODE: Compiler-based Neuron-aware Ensemble training
(ends 12:30 PM)
1:30 p.m.
Orals 1:30-2:50
[1:30] Pufferfish: Communication-efficient Models At No Extra Cost
[1:50] In-network Aggregation for Shared Machine Learning Clusters
[2:10] Data Movement Is All You Need: A Case Study on Optimizing Transformers
[2:30] Learning on Distributed Traces for Data Center Storage Systems
(ends 2:50 PM)
3:20 p.m.
Orals 3:20-5:00
[3:20] TensorFlow Lite Micro: Embedded Machine Learning for TinyML Systems
[3:40] Scaling Distributed Training with Adaptive Summation
[4:00] PipeMare: Asynchronous Pipeline Parallel DNN Training
[4:20] Exploring the Limits of Concurrency in ML Training on Google TPUs
[4:40] TT-Rec: Tensor Train Compression for Deep Learning Recommendation Models
(ends 5:00 PM)
5 p.m.
(ends 6:00 PM)
WED 7 APR
8 a.m.
9:10 a.m.
Orals 9:10-10:50
[9:10] An Efficient Statistical-based Gradient Compression Technique for Distributed Training Systems
[9:30] Adaptive Gradient Communication via Critical Learning Regime Identification
[9:50] Don't Forget to Sign the Gradients!
[10:10] Rethinking Floating Point Overheads for Mixed Precision DNN Accelerators
[10:30] Bit Error Robustness for Energy-Efficient DNN Accelerators
(ends 10:50 AM)
10:50 a.m.
Break - Visit the Sponsor Hall
11:10 a.m.
Orals 11:10-12:30
[11:10] RL-Scope: Cross-stack Profiling for Deep Reinforcement Learning Workloads
[11:30] A Learned Performance Model for Tensor Processing Units
[11:50] Accounting for Variance in Machine Learning Benchmarks
[12:10] Larq Compute Engine: Design, Benchmark and Deploy State-of-the-Art Binarized Neural Networks
(ends 12:30 PM)
12:30 p.m.
Lunch Break / Visit the Sponsor Hall
1:30 p.m.
Orals 1:30-2:50
[1:30] IOS: Inter-Operator Scheduler for CNN Acceleration
[1:50] Value Learning for Throughput Optimization of Deep Learning Workloads
[2:10] ByzShield: An Efficient and Robust System for Distributed Training
[2:30] FirePlace: Placing Firecracker Virtual Machines with Hindsight Imitation
(ends 2:50 PM)
2:50 p.m.
Break - Visit the Sponsor Hall
3:20 p.m.
Orals 3:20-5:00
[3:20] Nimble: Efficiently Compiling Dynamic Neural Networks for Model Inference
[3:40] MicroRec: Efficient Recommendation Inference by Hardware and Data Structure Solutions
[4:00] VS-Quant: Per-vector Scaled Quantization for Accurate Low-Precision Neural Network Inference
[4:20] Accelerate Inference of CNNs for Video Analysis While Preserving Exactness Exploiting Activation Sparsity
[4:40] sensAI: ConvNets Decomposition via Class Parallelism for Fast Inference on Live Data
(ends 5:00 PM)
5 p.m.
(ends 6:00 PM)
THU 8 APR
8 a.m.
Invited Talk: Kathy Yelick
(ends 8:50 AM)
8:50 a.m.
Break - Visit the Sponsor Hall
9:10 a.m.
Orals 9:10-10:50
[9:10] Boveda: Building an On-Chip Deep Learning Memory Hierarchy Brick by Brick
[9:30] Horizontally Fused Training Array: An Effective Hardware Utilization Squeezer for Training Novel Deep Learning Models
[9:50] A Distributed Graph-Theoretic Framework for Automatic Parallelization in Multi-core Systems
[10:10] Accelerating SLIDE Deep Learning on Modern CPUs: Vectorization, Quantizations, Memory Optimizations, and More
[10:30] Scaling Polyhedral Neural Network Verification on GPUs
(ends 10:50 AM)
10:50 a.m.
Break - Visit the Sponsor Hall
11:10 a.m.
Orals 11:10-12:30
[11:10] SUOD: Accelerating Large-Scale Unsupervised Heterogeneous Outlier Detection
[11:30] Lost in Pruning: The Effects of Pruning Neural Networks beyond Test Accuracy
[11:50] Equality Saturation for Tensor Graph Superoptimization
[12:10] Doping: A technique for Extreme Compression of LSTM Models using Sparse Structured Additive Matrices
(ends 12:30 PM)
12:30 p.m.
Lunch Break / Visit the Sponsor Hall
1:30 p.m.
Orals 1:30-2:50
[1:30] Swift for TensorFlow: A portable, flexible platform for deep learning
[1:50] Amazon SageMaker Debugger: A System for Real-Time Insights into Machine Learning Model Training
[2:10] FLAML: A Fast and Lightweight AutoML Library
[2:30] To Bridge Neural Network Design and Real-World Performance: A Behaviour Study for Neural Networks
(ends 2:50 PM)
2:50 p.m.
Break - Visit the Sponsor Hall
3:20 p.m.
Orals 3:20-4:40
[3:20] Towards Scalable Distributed Training of Deep Learning on Public Cloud Clusters
[3:40] Understanding and Improving Failure Tolerant Training for Deep Learning Recommendation with Partial Recovery
[4:00] Wavelet: Efficient DNN Training with Tick-Tock Scheduling
[4:20] Pipelined Backpropagation at Scale: Training Large Models without Batches
(ends 4:40 PM)
4:40 p.m.
FRI 9 APR
6:15 a.m.
7 a.m.
7:45 a.m.
8 a.m.
Workshop:
(ends 3:00 PM)