Systems for ML and ML for Systems: A Virtuous Cycle
This talk is about the virtuous interplay between machine learning (ML) and systems. I will show examples of how systems optimized for ML computation can be used to train more accurate and capable ML models, and how these ML models can in turn be used to improve upon the ad hoc heuristics used in system design and management. These improved systems can then be used to train better ML models.

The latest trend in ML is the development of foundation models: large pretrained models that have achieved state-of-the-art quality in natural language processing, vision, speech, and other areas. These models are challenging to train and serve because they are characterized by billions of parameters, irregular data access (sparsity), and irregular control flow. I will explain how Reconfigurable Dataflow Accelerators (RDAs) can be designed to accelerate foundation models with these characteristics; SambaNova Systems is using RDA technology to achieve record-setting performance on foundation models. I will also describe how RDAs can be used to build Taurus, an intelligent network data plane that enables ML models to manage computer networks at full line-rate bandwidths. In particular, a Taurus prototype detects two orders of magnitude more events in a security application than a state-of-the-art system based on conventional network technology.
The Future of MLSys: Industry and Academia - panel
Are you a budding researcher or engineer interested in MLSys or just curious about opportunities across industry and academia? Join our panel, “The Future of MLSys: Industry and Academia”, where representatives from industry and academia share their experiences in machine learning and systems. The panel will cover topics such as differences between MLSys research in industry vs. academia, the role of industry and academia in the field more broadly, and how you can get involved!
Sign up here
Gemmini: Enabling Systematic Deep-Learning Architecture Evaluation via Full-Stack Integration
Please register online at this link if you would like to attend: https://forms.gle/Pd9tviuBHBno7G7F7
We present a tutorial that teaches users how to perform full-system, full-stack DNN accelerator evaluation using the Gemmini platform. Gemmini allows users to evaluate how a DNN hardware accelerator interacts with external components, like the cache hierarchy or virtual address translation scheme, to affect performance across the hardware-software-system stack.
With Gemmini, users can generate a variety of different DNN hardware accelerators, with different underlying system, SoC, and programming stack components. Users can evaluate the performance of their hardware accelerators on end-to-end workloads in a real-world system context, exposing how different system components, like the cache hierarchy, virtual address translation scheme, or operating system, impact performance in subtle but noticeable ways. Gemmini also allows users to program their applications at different “levels” of the programming stack, from high-level model compilation to low-level direct machine configuration. Overall, Gemmini enables users to explore and evaluate a variety of different DNN accelerator and system configurations, exposing how these different parameters interact to impact end-to-end performance and efficiency.
Gemmini has been presented previously at DAC 2021, where it won the Best Paper award, as well as at an IISWC 2021 tutorial.
This tutorial presents the design and implementation of a scikit-learn-compatible system for detecting anomalies in time series data, with the goal of offering a broad range of algorithms to the end user and a special focus on unsupervised and semi-supervised learning. Given an input time series, we discuss how data scientists can construct four categories of anomaly-detection pipelines, followed by an enrichment module that helps to label anomalies. The tutorial provides hands-on experience with a system deployed on IBM API Hub for developer communities, which aims to support a wide range of execution engines to meet the diverse needs of anomaly workloads, such as serverless for CPU-intensive work and GPUs for deep-learning model training.
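To illustrate the kind of scikit-learn-compatible, unsupervised anomaly pipeline the tutorial describes, here is a minimal sketch using standard scikit-learn components. The tutorial's actual system on IBM API Hub has its own pipeline categories and enrichment module; the feature windowing, the `IsolationForest` detector, and all parameter values below are assumptions for illustration, not the tutorial's API.

```python
# Hypothetical sketch of an unsupervised anomaly-detection pipeline in the
# scikit-learn style; the tutorial's deployed system will differ in detail.
import numpy as np
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(0)

# Synthetic "time series" turned into sliding-window feature rows:
# 200 normal windows plus 5 injected anomalous windows at the end.
normal = rng.normal(0.0, 1.0, size=(200, 4))
spikes = rng.normal(8.0, 1.0, size=(5, 4))  # injected anomalies
X = np.vstack([normal, spikes])

# Unsupervised pipeline: scale the window features, then fit an
# isolation forest that flags roughly the top 3% most isolated points.
pipe = Pipeline([
    ("scale", StandardScaler()),
    ("detect", IsolationForest(contamination=0.03, random_state=0)),
])
labels = pipe.fit_predict(X)  # -1 marks anomalies, 1 marks normal points
print("windows flagged as anomalous:", int((labels == -1).sum()))
```

A semi-supervised variant would fit the same pipeline on data known to be anomaly-free and then call `predict` on new windows; the enrichment step described in the tutorial would attach labels or context to the flagged windows downstream.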
The Opening Reception will be held on Monday evening in the Mission City Ballroom at 5:30pm.