

Systems for ML 2

Exhibit Hall A

Moderator: Tushar Krishna


Tue 30 Aug. 8:45 - 9:03 PDT

QuClassi: A Hybrid Deep Neural Network Architecture based on Quantum State Fidelity

Samuel A. Stein · Betis Baheri · Daniel Chen · Ying Mao · Qiang Guan · Ang Li · Shuai Xu · Caiwen Ding

In the past decade, remarkable progress has been achieved in deep learning related systems and applications. In the post Moore’s Law era, however, the limits of semiconductor fabrication technology, along with increasing data sizes, have slowed down the development of learning algorithms. In parallel, the rapid development of quantum computing has pushed it into a new era. Google illustrated quantum supremacy by completing a specific task (a random sampling problem) in 200 seconds, a task that remains impracticable for the largest classical computers. Due to the exponential potential of quantum computing, quantum-based learning is an area of interest, in hopes that certain systems might offer a quantum speedup. In this work, we propose a novel architecture, QuClassi, a quantum neural network for both binary and multi-class classification. Powered by a quantum differentiation function along with a hybrid quantum-classical design, QuClassi encodes the data with a reduced number of qubits and generates the quantum circuit, iteratively pushing it to the quantum platform to find the best states. We conduct extensive experiments on quantum simulators and IBM-Q’s quantum platform, and also evaluate performance on IonQ. The evaluation results demonstrate that QuClassi is able to outperform the state-of-the-art quantum-based solutions, TensorFlow Quantum and QuantumFlow, by up to 53.75% and 203.00% for binary and multi-class classification, respectively. Compared to traditional deep neural networks, QuClassi achieves comparable performance with 97.37% fewer parameters.
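The fidelity-based classification idea in this abstract can be illustrated with a small classical simulation. This is only a sketch under stated assumptions, not the QuClassi implementation: the one-qubit-per-feature RY encoding, the fixed (untrained) class states, and the names `encode`, `fidelity`, `swap_test_p0`, and `classify` are all hypothetical choices made for illustration. It shows how a data state could be assigned to the class whose state it overlaps with most, and how a SWAP test would expose that overlap as a measurement probability.

```python
import math

def encode(features):
    """Assumed encoding: each feature x in [0, 1] becomes one qubit
    RY(pi * x)|0> = [cos(pi * x / 2), sin(pi * x / 2)]."""
    return [(math.cos(math.pi * x / 2), math.sin(math.pi * x / 2))
            for x in features]

def fidelity(state_a, state_b):
    """Fidelity |<a|b>|^2 of two product states with real amplitudes:
    the product of the per-qubit squared overlaps."""
    result = 1.0
    for (a0, a1), (b0, b1) in zip(state_a, state_b):
        result *= (a0 * b0 + a1 * b1) ** 2
    return result

def swap_test_p0(fid):
    """Probability of measuring the ancilla in |0> in a SWAP test
    between two pure states with fidelity `fid`."""
    return 0.5 + 0.5 * fid

def classify(features, class_states):
    """Return the label whose class state has the highest fidelity
    with the encoded data, together with all fidelity scores."""
    data_state = encode(features)
    scores = {label: fidelity(data_state, state)
              for label, state in class_states.items()}
    return max(scores, key=scores.get), scores

# Hypothetical class states; in a trained model these would be learned.
class_states = {"a": encode([0.1, 0.2]), "b": encode([0.9, 0.8])}
label, scores = classify([0.2, 0.1], class_states)
print(label, scores)
```

On real hardware the fidelity is not read off directly; it would be estimated from repeated SWAP-test measurements, which is why the sketch separates `fidelity` from `swap_test_p0`.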

Tue 30 Aug. 9:03 - 9:21 PDT

VirtualFlow: Decoupling Deep Learning Models from the Underlying Hardware

Andrew Or · Haoyu Zhang · Michael Freedman

We propose VirtualFlow, a system leveraging a novel abstraction called virtual node processing to decouple the model from the hardware. In each step of training or inference, the batch of input data is split across virtual nodes instead of hardware accelerators (e.g., GPUs and TPUs). Mapping multiple virtual nodes to each accelerator and processing them sequentially effectively time slices the batch, thereby allowing users to reduce the memory requirements of their workloads and mimic large batch sizes on small clusters. Using this technique, VirtualFlow enables many new use cases, such as reproducing training results across different hardware, resource elasticity, and heterogeneous training. In our evaluation, our implementation of VirtualFlow for TensorFlow achieved strong convergence guarantees across different hardware with out-of-the-box hyperparameters, up to 48% lower job completion times with resource elasticity, and up to 42% higher throughput with heterogeneous training.
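The time-slicing idea described here, where a batch is split across virtual nodes that are processed sequentially on fewer accelerators, can be sketched as plain gradient accumulation. The following is a minimal illustration and not VirtualFlow's code: the 1-D linear model, the round-robin mapping of virtual nodes to accelerators, and the names `grad_mse`, `split`, and `virtual_node_step` are assumptions made for the example. It shows that, with equally sized virtual nodes, the averaged gradient matches the full-batch gradient regardless of how many accelerators are simulated.

```python
def grad_mse(w, xs, ys):
    """Gradient of the mean squared error for the 1-D model y = w * x."""
    return sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)

def split(items, num_parts):
    """Split a list into num_parts contiguous, equally sized parts."""
    if len(items) % num_parts != 0:
        raise ValueError("batch size must be divisible by the number of parts")
    size = len(items) // num_parts
    return [items[i * size:(i + 1) * size] for i in range(num_parts)]

def virtual_node_step(w, xs, ys, num_virtual_nodes, num_accelerators, lr):
    """One training step: the batch is split across virtual nodes, and each
    simulated accelerator processes its virtual nodes one after another."""
    x_parts = split(xs, num_virtual_nodes)
    y_parts = split(ys, num_virtual_nodes)
    grad_sum = 0.0
    for acc in range(num_accelerators):
        # Assumed mapping: virtual node vn runs on accelerator vn % num_accelerators.
        for vn in range(acc, num_virtual_nodes, num_accelerators):
            grad_sum += grad_mse(w, x_parts[vn], y_parts[vn])
    # Averaging equal-sized partial gradients recovers the full-batch gradient.
    return w - lr * grad_sum / num_virtual_nodes

xs = [float(i) for i in range(1, 9)]
ys = [2.0 * x for x in xs]
full_batch = 0.0 - 0.01 * grad_mse(0.0, xs, ys)
one_accelerator = virtual_node_step(0.0, xs, ys, 4, 1, 0.01)
two_accelerators = virtual_node_step(0.0, xs, ys, 4, 2, 0.01)
print(full_batch, one_accelerator, two_accelerators)
```

Because the update depends only on the number of virtual nodes and not on the number of accelerators, the same hyperparameters can in principle be reused when the hardware changes, which is the property the abstract highlights.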

Tue 30 Aug. 9:21 - 9:39 PDT

TAGLETS: A System for Automatic Semi-Supervised Learning with Auxiliary Data

Wasu Piriyakulkij · Cristina Menghini · Ross Briden · Nihal Vivekanand Nayak · Jeffrey Zhu · Elaheh Raisi · Stephen Bach

Machine learning practitioners often have access to a spectrum of data: labeled data for the target task (which is often limited), unlabeled data, and auxiliary data (the many available labeled datasets for other tasks). We describe TAGLETS, a system built to study techniques for automatically exploiting all three types of data and creating high-quality, servable classifiers. The key components of TAGLETS are: (1) auxiliary data organized according to a knowledge graph, (2) modules encapsulating different methods for exploiting auxiliary and unlabeled data, and (3) a distillation stage in which the ensembled modules are combined into a servable model. We compare TAGLETS with state-of-the-art transfer learning and semi-supervised learning methods on four image classification tasks. Our study covers a range of settings, varying the amount of labeled data and the semantic relatedness of the auxiliary data to the target task. We find that the intelligent incorporation of auxiliary and unlabeled data into multiple learning techniques enables TAGLETS to match, and most often significantly surpass, these alternatives. TAGLETS is available as an open-source system at

Tue 30 Aug. 9:39 - 9:57 PDT

mmSampler: Efficient Frame Sampler for Multimodal Video Retrieval

Zhiming Hu · Ning Ye · Iqbal Mohomed

We study the problem of natural language-based video retrieval, the task of finding relevant videos given natural language search queries. Most recent state-of-the-art (SOTA) approaches embed the video and query separately and map the video and query embeddings into a joint latent space to calculate a similarity score between them. To learn a video representation, existing solutions generally either use all the frames or sample a subset of frames from the video using uniform sampling. The former could be computationally prohibitive, while the latter may inject noise from uninformative frames into the final video representation. To this end, we propose mmSampler, a learning-based sampler that adaptively selects salient frames to represent the videos for multimodal video retrieval. mmSampler can greatly reduce the computational overhead for video representation without affecting the retrieval performance. We learn a lightweight policy network to decide whether to further process or discard a frame. By adopting the Gumbel-Softmax trick, we train the sampler jointly with the video retrieval model end-to-end in an efficient manner. Experimental results on benchmark datasets such as ActivityNet, DiDeMo and MSRVTT demonstrate that mmSampler achieves improved retrieval performance while saving as much as 43% of GFLOPs per video.
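The Gumbel-Softmax trick mentioned above can be sketched for a per-frame keep-or-discard decision. This is an illustrative sketch only, not mmSampler's policy network: the two-way logits layout `[discard, keep]`, the temperature value, and the names `gumbel_softmax` and `select_frames` are assumptions. It shows the sampling step in plain Python; in an actual training setup the relaxed sample would be produced inside an autodiff framework so that gradients can flow back to the policy.

```python
import math
import random

def gumbel_softmax(logits, tau, rng):
    """Draw a relaxed one-hot sample: softmax((logits + Gumbel noise) / tau)."""
    noisy = []
    for logit in logits:
        u = max(rng.random(), 1e-12)        # avoid log(0)
        gumbel = -math.log(-math.log(u))    # Gumbel(0, 1) noise
        noisy.append((logit + gumbel) / tau)
    peak = max(noisy)                       # subtract the max for stability
    exps = [math.exp(v - peak) for v in noisy]
    total = sum(exps)
    return [e / total for e in exps]

def select_frames(frame_logits, tau=1.0, seed=0):
    """Return indices of frames whose sampled 'keep' weight wins.
    Each entry of frame_logits is assumed to be [discard_logit, keep_logit]."""
    rng = random.Random(seed)
    kept = []
    for index, logits in enumerate(frame_logits):
        discard_weight, keep_weight = gumbel_softmax(logits, tau, rng)
        if keep_weight > discard_weight:
            kept.append(index)
    return kept

# Hypothetical policy outputs for four frames.
frame_logits = [[-2.0, 2.0], [1.5, -1.5], [0.0, 0.0], [-3.0, 3.0]]
print(select_frames(frame_logits, tau=0.5))
```

Lower temperatures push the relaxed sample toward a hard one-hot choice, while higher temperatures make it smoother; only the kept frames would then be passed to the more expensive video encoder.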

Tue 30 Aug. 9:57 - 10:15 PDT

Sustainable AI: Environmental Implications, Challenges and Opportunities

Carole-Jean Wu · Ramya Raghavendra · Udit Gupta · Bilge Acun · Newsha Ardalani · Kiwan Maeng · Gloria Chang · Fiona Aga · Jinshi Huang · Charles Bai · Michael Gschwind · Anurag Gupta · Myle Ott · Anastasia Melnikov · Salvatore Candido · David Brooks · Geeta Chauhan · Benjamin Lee · Hsien-Hsin Lee · Bugra Akyildiz · Maximilian Balandat · Joe Spisak · Ravi Jain · Mike Rabbat · Kim Hazelwood

This paper explores the environmental impact of the super-linear growth trends for AI from a holistic perspective, spanning Data, Algorithms, and System Hardware. We characterize the carbon footprint of AI computing by examining the model development cycle across industry-scale machine learning use cases and, at the same time, considering the life cycle of system hardware. Taking a step further, we capture the operational and manufacturing carbon footprint of AI computing and present an end-to-end analysis of how hardware-software design and at-scale optimization can help reduce the overall carbon footprint of AI. Based on industry experience and lessons learned, we share the key challenges and chart out important development directions across the many dimensions of AI. We hope the key messages and insights presented in this paper can inspire the community to advance the field of AI in an environmentally responsible manner.