Workshop
Resource-Constrained Machine Learning (ReCoML 2020)
Yaniv Ben Itzhak · Nina Narodytska · Christopher Aberger

Wed Mar 04 07:00 AM -- 03:30 PM (PST) @ Level 3 Room 8
Event URL: https://sites.google.com/view/recomlsys2020/home

* Please note that you must register for the conference in order to attend the workshop *

The workshop will cover broad aspects of ML over resource-constrained environments, such as Internet-of-Things (IoT) devices and edge computing. Resource-constrained ML is challenging for several reasons. First, current ML models usually have high resource requirements in terms of CPU, memory, and I/O. Naive solutions that reduce this resource consumption would result in significant degradation of ML performance. Therefore, new ML models and frameworks are required in order to achieve reasonable ML performance over resource-constrained environments. Second, resource-constrained environments, such as edge computing and IoT, are usually used for real-time applications. Hence, model serving is a critical issue: an ML model should respond quickly and accurately while running on limited resources. The workshop will specifically include the following topics: model/hardware architectures, model compression, interpretability, and use cases.

The organizers will select papers based on a combination of novelty, quality, interest, and impact.
Topics of interest include, but are not limited to:

- Compression of deep ML model architectures
- Quantized and low-precision neural networks
- Optimization of ML model architectures for resource-constrained environments
- Hardware accelerators for deep ML models
- Explainability of ML models in the context of resource-constrained environments
- ML deployments over resource-constrained environments, e.g., Internet-of-Things (IoT) devices and edge computing

Reviewing process: All submissions should include the authors' names and affiliations. Authors are allowed to post their papers on arXiv or other public forums.

Key dates related to the reviewing process are given below:
Paper submission deadline: January 15, 2020, AoE (Anywhere on Earth)
Decision notification: January 27, 2020

We invite research contributions in different formats:
Original research papers (up to 6 pages, not including references)
Position, opinion papers and extended abstracts (up to 4 pages, not including references)

Submission link: link

Dual submission policy: We will not accept any paper which, at the time of submission, is under review for another workshop or has already been published. This policy also applies to papers that overlap substantially in technical content with conference papers under review or previously published.
Proceedings: Accepted papers will be published in the form of online proceedings.
Submission format: To prepare your submission to ReCoML 2020, please use the LaTeX style files provided at SML2020style.tar.gz. Submitted papers should be in a 2-column format, and each reference must explicitly list all the authors of the cited paper.

Organizing Committee
Yaniv Ben-Itzhak, VMware Research, ybenitzhak (at) vmware (dot) com
Nina Narodytska, VMware Research, nnarodytska (at) vmware (dot) com
Christopher R. Aberger, Stanford and SambaNova Systems, christopher.aberger (at) sambanovasystems (dot) ai

Wed 8:25 a.m. - 8:30 a.m.
Welcome (Opening notes)
Wed 8:30 a.m. - 9:15 a.m.

With recent advances in machine learning, large enterprises incorporate machine learning models across a number of products. To facilitate the training of these models, enterprises use shared, multi-tenant clusters of machines equipped with accelerators such as GPUs. As in data analytics clusters, operators aim to achieve high resource utilization while providing resource isolation and fair sharing across users. In this talk, we will first present a characterization of machine learning workloads from a multi-tenant GPU cluster at Microsoft. We then present how various aspects of these workloads, such as gang scheduling and locality constraints, affect resource utilization and efficiency. Based on this analysis, we discuss research efforts to improve efficiency and utilization both for individual jobs and across the cluster.

Wed 9:15 a.m. - 9:30 a.m.
QuaRL: Quantized Reinforcement Learning (Presentation)
Wed 9:30 a.m. - 9:45 a.m.
Optimizing Sparse Matrix Operations for Deep Learning (Presentation)
Wed 9:45 a.m. - 10:00 a.m.
Energy-Aware DNN Graph Optimization (Presentation)
Wed 10:00 a.m. - 12:00 p.m.
Lunch (Break)
Wed 12:00 p.m. - 12:45 p.m.

Much of the recent advancement in machine learning has been driven by the capability of machine learning systems to process and learn from very large data sets using very complicated models. Continuing to scale data up in this way presents a computational challenge, as power, memory, and time are all factors that limit performance. One popular approach to addressing these issues is low-precision arithmetic, in which lower-precision numbers are used to improve these system metrics, although possibly at the cost of some accuracy. In this talk, I will discuss some recent methods from my lab that use low numerical precision for ML tasks while at the same time trying to understand their effects theoretically.
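For readers unfamiliar with low-precision arithmetic, the minimal NumPy sketch below (not part of the talk; all values and names are illustrative) shows the basic trade-off the abstract refers to: quantizing float32 weights to 8-bit integers cuts memory by roughly 4x while introducing a small, measurable quantization error.

# Minimal sketch (illustrative, not from the talk): symmetric int8
# quantization of a float32 weight matrix, showing the memory/accuracy
# trade-off of low-precision arithmetic.
import numpy as np

rng = np.random.default_rng(0)
weights = rng.standard_normal((256, 256)).astype(np.float32)

# Symmetric linear quantization: map [-max|w|, +max|w|] onto the int8 range.
scale = np.abs(weights).max() / 127.0
q_weights = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)

# Dequantize to estimate the error introduced by the 8-bit representation.
dequantized = q_weights.astype(np.float32) * scale
error = np.abs(weights - dequantized).mean()

print(f"memory: {weights.nbytes} B (fp32) vs {q_weights.nbytes} B (int8)")
print(f"mean absolute quantization error: {error:.6f}")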

Wed 12:45 p.m. - 1:00 p.m.
Efficient Memory Management for Deep Neural Net Inference (Presentation)
Wed 1:00 p.m. - 1:15 p.m.
Once for All: Train One Network and Specialize it for Efficient Deployment (Presentation)
Wed 1:15 p.m. - 1:30 p.m.
GReTA: Hardware Optimized Graph Processing For GNNs (Presentation)
Wed 1:30 p.m. - 2:00 p.m.
Afternoon Break (Break)
Wed 2:00 p.m. - 2:15 p.m.
Optimizing JPEG Quantization for Classification Networks (Presentation)
Wed 2:15 p.m. - 2:30 p.m.
Conditional Neural Architecture Search (Presentation)
Wed 2:30 p.m. - 2:45 p.m.
Transfer Learning with Fine-grained Sparse Networks: From Efficient Network Perspective (Presentation)
Wed 2:45 p.m. - 3:00 p.m.
Closing remarks (End)

Author Information

Yaniv Ben Itzhak (VMware)
Nina Narodytska (VMware)
Christopher Aberger (SambaNova Systems and Stanford University)
