Timezone: America/Los_Angeles
SUN 28 AUG
4 p.m.
MON 29 AUG
7 a.m.
8 a.m.
Coffee Break
8:30 a.m.
Opening Remarks
8:45 a.m.
Orals 8:45-10:15
[8:45] Pathways: Asynchronous Distributed Dataflow for ML
[9:03] QuadraLib: A Performant Quadratic Neural Network Library for Architecture Optimization and Design Exploration
[9:21] Random Offset Block Embedding (ROBE) for compressed embedding tables in deep learning recommendation systems
[9:39] ML-EXray: Visibility into ML Deployment on the Edge
[9:57] GPU Semiring Primitives for Sparse Neighborhood Methods
(ends 10:15 AM)
10:15 a.m.
Break
10:30 a.m.
11:30 a.m.
Lunch Break (Box Lunch)
1 p.m.
Orals 1:00-2:12
[1:00] BNS-GCN: Efficient Full-Graph Training of Graph Convolutional Networks with Partition-Parallelism and Random Boundary Node Sampling
[1:18] Sequential Aggregation and Rematerialization: Distributed Full-batch Training of Graph Neural Networks on Large Graphs
[1:36] PAPAYA: Practical, Private, and Scalable Federated Learning
[1:54] LightSecAgg: a Lightweight and Versatile Design for Secure Aggregation in Federated Learning
(ends 2:15 PM)
Tutorial:
(ends 5:00 PM)
2:15 p.m.
Orals 2:15-3:27
[2:15] The CoRa Tensor Compiler: Compilation for Ragged Tensors with Minimal Padding
[2:33] Apollo: Automatic Partition-based Operator Fusion through Layer by Layer Optimization
[2:51] DietCode: Automatic Optimization for Dynamic Tensor Programs
[3:09] Bit-serial Weight Pools: Compression and Arbitrary Precision Execution of Neural Networks on Resource Constrained Processors
(ends 3:30 PM)
3:30 p.m.
Coffee Break
4 p.m.
Orals 4:00-5:30
[4:00] MLPerf Mobile Inference Benchmark: An Industry-Standard Open-Source Machine Learning Benchmark for On-Device AI
[4:18] TorchSparse: Efficient Point Cloud Inference Engine
[4:36] Hydrozoa: Dynamic Hybrid-Parallel DNN Training on Serverless Containers
[4:54] Plumber: Diagnosing and Removing Performance Bottlenecks in Machine Learning Data Pipelines
[5:12] Graphiler: Optimizing Graph Neural Networks with Message Passing Data Flow Graph
(ends 5:30 PM)
5:30 p.m.
TUE 30 AUG
6:30 a.m.
7:30 a.m.
Coffee Break
8 a.m.
8:45 a.m.
Orals 8:45-10:15
[8:45] QuClassi: A Hybrid Deep Neural Network Architecture based on Quantum State Fidelity
[9:03] VirtualFlow: Decoupling Deep Learning Models from the Underlying Hardware
[9:21] TAGLETS: A System for Automatic Semi-Supervised Learning with Auxiliary Data
[9:39] mmSampler: Efficient Frame Sampler for Multimodal Video Retrieval
[9:57] Sustainable AI: Environmental Implications, Challenges and Opportunities
(ends 10:15 AM)
10:15 a.m.
Break
10:30 a.m.
1 p.m.
Tutorial:
(ends 5:00 PM)
Orals 1:00-2:12
[1:00] Matchmaker: Data Drift Mitigation in Machine Learning for Large-Scale Systems
[1:18] A Transferable Approach for Partitioning Machine Learning Models on Multi-Chip-Modules
[1:36] SLA-Driven ML Inference Framework for Clouds with Heterogeneous Accelerators
[1:54] NURD: Negative-Unlabeled Learning for Online Datacenter Straggler Prediction
(ends 2:15 PM)
2:15 p.m.
Orals 2:15-3:27
[2:15] Collapsible Linear Blocks for Super-Efficient Super Resolution
[2:33] Towards the Co-design of Neural Networks and Accelerators
[2:51] On the Utility of Gradient Compression in Distributed Training Systems
[3:09] Efficient Strong Scaling Through Burst Parallel Training
(ends 3:30 PM)
3:30 p.m.
Coffee Break
4 p.m.
Orals 4:00-5:30
[4:00] Revelio: ML-Generated Debugging Queries for Finding Root Causes in Distributed Systems
[4:18] Randomness in Neural Network Training: Characterizing the Impact of Tooling
[4:36] Synthesizing Optimal Parallelism Placement and Reduction Strategies on Hierarchical Systems for Deep Learning
[4:54] dPRO: A Generic Performance Diagnosis and Optimization Toolkit for Expediting Distributed DNN Training
[5:12] A Tale of Two Models: Constructing Evasive Attacks on Edge Models
(ends 5:30 PM)
WED 31 AUG
6:30 a.m.
7:30 a.m.
Coffee Break
8 a.m.
8:45 a.m.
Orals 8:45-10:15
[8:45] REX: Revisiting Budgeted Training with an Improved Schedule
[9:03] SRIFTY: Swift and Thrifty Distributed Neural Network Training on the Cloud
[9:21] Accelerating Training and Inference of Graph Neural Networks with Fast Sampling and Pipelining
[9:39] Improving Model Training with Multi-fidelity Hyperparameter Evaluation
[9:57] Gyro Dropout: Maximizing Ensemble Effect in Neural Network Training
(ends 10:15 AM)
10:15 a.m.
Break
10:30 a.m.
11:30 a.m.
Lunch Break (Not Provided)
1 p.m.
Tutorial:
(ends 5:00 PM)
Orals 1:00-2:12
[1:00] Understanding GNN Computational Graph: A Coordinated Computation, IO, and Memory Perspective
[1:18] torch.fx: Practical Program Capture and Transformation for Deep Learning in Python
[1:36] FROTE: Feedback Rule-Driven Oversampling for Editing Models
[1:54] TyXe: Pyro-based Bayesian neural nets for Pytorch
(ends 2:15 PM)
Tutorial:
(ends 5:00 PM)
2:15 p.m.
Orals 2:15-4:03
[2:15] ULPPACK: Fast Sub-8-bit Matrix Multiply on Commodity SIMD Hardware
[2:33] AccMPEG: Optimizing Video Encoding for Accurate Video Analytics
[2:51] HALOS: Hashing Large Output Space for Cheap Inference
[3:09] Learning Compressed Embeddings for On-Device Inference
[3:27] Bolt: Bridging the Gap between Auto-tuners and Hardware-native Performance
[3:45] URSABench: A System for Comprehensive Benchmarking of Bayesian Deep Neural Network Models and Inference methods
(ends 4:03 PM)
5 p.m.
Closing Remarks
THU 1 SEP
7 a.m.
8 a.m.
10 a.m.
Coffee Break (AM)
3 p.m.
Coffee Break (PM)