

Invited Talk in Workshop: Practical Adoption Challenges of ML for Systems in Industry (PACMI)

ML-driven Cloud Resource Management

Neeraja Yadwadkar


Abstract:

The variety of user workloads, application requirements, and heterogeneous hardware resources, together with the large number of management tasks, has made today’s cloud fairly complex. Recent work has shown promise in using Machine Learning for efficient resource management in such dynamically changing cloud execution environments. These approaches range from offline to online learning agents. In this talk, I will focus on the challenges that arise when building such agents and those that arise when these agents are deployed in real systems. As an example, I will use SmartHarvest, a system that improves resource utilization by dynamically harvesting spare CPU cores from primary workloads to run batch workloads on cloud servers. Building on that, I will briefly talk about SOL, a framework that assists developers in building and deploying online learning agents for various use cases.

Bio: Neeraja is an assistant professor in the Department of Electrical and Computer Engineering at UT Austin. She is a Cloud Computing Systems researcher with a strong background in Machine Learning (ML). Most of her research straddles the boundaries of Systems and ML: using and developing ML techniques for systems, and building systems for ML. Before joining UT Austin, she was a postdoctoral fellow in the Computer Science department at Stanford University, and before that she received her PhD in Computer Science from UC Berkeley. She previously earned a bachelor’s degree in Computer Engineering from the Government College of Engineering, Pune, India.
