Online Experimentation for Cloud Applications


Room 203
Wed 31 Aug 8 a.m. PDT — 10:15 a.m. PDT


The need to deliver code changes to production systems to satisfy new requirements has fueled the adoption of an agile software development practice called online experimentation. Online experimentation provides insight into the value delivered by new application versions as they are exposed to users.

To solve the online experimentation problem for web and mobile applications, practitioners use A/B tests or more advanced methods such as multi-armed bandit algorithms. These approaches compare and assess application versions online to determine the best version according to business requirements such as user engagement. However, existing techniques and their formulations do not capture the unique complexities of cloud systems.
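As a rough illustration of the A/B comparison described above, the sketch below applies a pooled two-proportion z-test to conversion counts from two versions. This is one common textbook formulation, not necessarily the method covered in the tutorial; the function name, the argument names, and the sample counts are hypothetical.

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Return (z, two-sided p-value) for H0: both versions convert equally.

    conv_a, conv_b: number of converting users for versions A and B (assumed inputs).
    n_a, n_b: number of users exposed to each version.
    """
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled conversion rate under the null hypothesis of no difference.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal tail: 2 * (1 - Phi(|z|)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


if __name__ == "__main__":
    # Made-up counts for illustration only.
    z, p = two_proportion_z_test(conv_a=100, n_a=1000, conv_b=150, n_b=1000)
    print(f"z = {z:.2f}, p = {p:.4f}")
```

A test like this looks only at a business metric such as conversion. It says nothing about latency or error rates, which is the gap the abstract points to for cloud applications.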

When assessing the outcomes of releases of microservices or machine learning (ML) models in the cloud, practitioners must simultaneously consider application performance as well as business metrics. This difference arises because a cloud application's behavior is inherently volatile due to an increased likelihood of performance bugs or variability, which can degrade desired business results. For example, Amazon reported that every 100 ms of latency costs them 1% in sales. As a result of these complexities, the deployment of cloud applications is more art than science when contrasted with the approaches adopted in the web and mobile domains. Practitioners lack rigorous solutions for code releases, which makes it difficult to automatically learn and optimize for both business metrics and application performance with statistical guarantees.

This tutorial aims to offer a new perspective on online experimentation in the cloud era. The tutorial will study the field of online experimentation and popular existing approaches, address their shortcomings in the cloud, and discuss key challenges and requirements for real-world solutions. Participants will get a chance to craft and run an online experiment on an open-source system designed for online experimentation of microservices and ML models deployed in the cloud.
