Summary: We propose a full-day workshop (2 keynotes, 3 invited talks, 6 accepted papers, 1 panel, and 1 poster session) at the MLSys’22 conference for professionals, researchers, and practitioners interested in leveraging artificial intelligence and machine learning to efficiently design and build cloud computing systems and operate cloud services. The workshop will be guided by a steering committee and a program committee whose members come from both academia and industry and work in systems, AI/ML, and software engineering.
Thu 9:00 a.m. - 9:10 a.m.
|
Opening Remarks
|
|
Thu 9:10 a.m. - 10:00 a.m.
|
Keynote: Advances in ML for Systems in Microsoft Azure
(
Keynote
)
Dr. Ricardo Bianchini is a Distinguished Engineer at Microsoft, where he leads the team responsible for managing Azure's computing capacity and datacenter infrastructure with a strong emphasis on efficiency and sustainability. Before joining Azure in 2022, he led the Systems Research Group and the Cloud Efficiency team at Microsoft Research. During that time, he collaborated closely with Azure to create and deploy Resource Central, an ML and prediction-serving system that provides intelligence to other Azure components. He is an ACM Fellow and an IEEE Fellow.
Abstract: In this talk, I will describe our vision, experience, and latest advances in ML for Systems in Azure. Among other topics, I will discuss our latest research and production experience in infusing ML into cloud platforms and services. |
|
Thu 10:00 a.m. - 10:30 a.m.
|
Invited Talk #1: The Evolution of Model Deployment from the Cloud to the Sky
(
Invited Talk
)
Joseph Gonzalez is a founding member of the UC Berkeley Sky Computing Lab and the RISE Lab where he studies the design of future cloud architectures and machine learning systems. He is also a member of the Berkeley AI Research Group where he works on new neural architectures for computer vision, natural language processing, and robotics. Gonzalez's research addresses problems in neural network design, efficient inference, computer vision, prediction serving, autonomous vehicles, graph analytics, and distributed systems. Building on his research, Gonzalez co-founded Aqueduct to commercialize a radically simpler production data science platform. Finally, Gonzalez helped develop the Data Science program at UC Berkeley and co-created Data100 which is now taught to over 1500 students a semester. Prior to joining Berkeley, Gonzalez co-founded Turi Inc (formerly GraphLab) based on his thesis work and created the GraphX project (now part of Apache Spark). Gonzalez's innovative work has earned him significant recognition, including the Okawa Research Grant, NSF Early Career Award, and the NSF Expedition Award.
Abstract: Over the past decade, I have worked on projects ranging from the early graph processing frameworks (GraphLab) and distributed data processing frameworks (Apache Spark) to more recent large-scale systems for machine learning and data processing (Clipper, Ray, CloudBurst, and Modin). I have seen the rise (and fall) of various ML systems and the importance of data and data systems driving the field forward. In this talk, I will present the evolution of prediction serving systems, what we got wrong, and where things are headed. I will introduce our new work on feature stores and try to explain why they exist in the first place. I will then conclude by presenting a new vision for the future of cloud computing, one in which we attempt to defy data gravity and disrupt the economics of the cloud. |
Joseph Gonzalez |
Thu 10:30 a.m. - 11:00 a.m.
|
Break
|
|
Thu 11:00 a.m. - 11:45 a.m.
|
Technical Paper Session
(
Paper Session
)
A Survey of Multi-Tenant Deep Learning Inference on GPU
Fuxun Yu, Yongbo Yu (George Mason University); Di Wang (Microsoft); Minjia Zhang (Microsoft AI and Research); Longfei Shangguan (Microsoft); Chenchen Liu (University of Maryland, Baltimore County); Tolga Soyata (George Mason University); Xiang Chen (George Mason University)

CWP: A Machine Learning based Approach to Detect Unknown Cloud Workload
Derssie Mebratu, Mohammad Hossain, Niranjan Hasabnis, Jun Jin, Gaurav Chaudhary, Noah Shen (Intel)

Multi-level Explanation of Deep Reinforcement Learning-based Scheduling
Shaojun Zhang (The University of Sydney); Chen Wang (DATA61, CSIRO); Albert Zomaya (The University of Sydney) |
|
Thu 1:00 p.m. - 1:30 p.m.
|
Invited Talk #2: Towards a General ML for Systems Methodology
(
Invited Talk
)
Martin Maas is a Staff Research Scientist at Google Research and part of the Brain team. His research interests are in language runtimes, computer architecture, systems, and machine learning, with a focus on applying machine learning to systems problems. Before joining Google, Martin completed his PhD in Computer Science at the University of California at Berkeley, where he worked on hardware support for managed languages and architectural support for memory-trace obliviousness.
Abstract: Machine learning has the potential to significantly improve computer systems. While recent research in this area has shown great promise, not all problems are equally well-suited for applying ML techniques, and some remaining challenges have prevented wider adoption of ML techniques in systems. In this talk, I will introduce a taxonomy to classify machine learning for systems approaches, discuss how to identify cases that are a good fit for machine learning, and lay out a longer-term vision of how we can improve systems using ML techniques, ranging from computer architecture to language runtimes. |
Martin Maas |
Thu 1:30 p.m. - 2:00 p.m.
|
Project Showcases
Auto-scaling for Spot and On-demand VM Mixture
Fangkai Yang, Bo Qiao, Eli Cortez, Inigo Goiri, Chetan Bansal, Si Qin, Victor Rühle, Qingwei Lin, Dongmei Zhang (Microsoft)

LOGIC: Log Intelligence in Cloud
Lingling Zheng (Microsoft); Xu Zhang (Microsoft Research); Ze Li, Cong Chen (Microsoft); Shilin He, Liqun Li (Microsoft Research); Yu Kang (MSRA); Yudong Liu (Microsoft); Qingwei Lin (Microsoft Research); Yingnong Dang, Murali Chintalapati (Microsoft) |
|
Thu 2:00 p.m. - 2:30 p.m.
|
Break
|
|
Thu 2:30 p.m. - 3:00 p.m.
|
Closing Keynote: TBD
(
Keynote
)
|
Neeraja Yadwadkar |
Thu 3:00 p.m. - 3:30 p.m.
|
Break
|
|
Thu 3:30 p.m. - 4:30 p.m.
|
Panel: AIOps – Challenges and Opportunities
(
Panel
)
Moderator: Christina Delimitrou, Cornell University
Panelist candidates: Daniel Oneill, Stanford University; Neeraja J. Yadwadkar, University of Texas at Austin; Dan Crankshaw, Microsoft; Martin Maas, Google |
Neeraja Yadwadkar |