Beyond Model Serving: Cross-Stack Co-Design for Agentic Systems
Esha Choukse
Abstract
AI is moving from single-model inference to interactive, multimodal, and agentic systems. In this new regime, performance depends on co-design across the full stack, not on models or hardware alone. This talk argues for rethinking the boundary between machine learning and computer systems, and for treating accuracy and quality as dynamic system-level quantities that can be traded against latency, cost, and energy.
Speaker
Esha Choukse is a Principal Researcher in the Azure Research — Systems (AzRS) group at Microsoft. Her research focuses on efficient and sustainable AI across the computing stack, spanning AI platforms, hardware, and datacenter-scale infrastructure. She is a recipient of the ACM SIGMICRO Early Career Award for foundational contributions to hardware memory compression and to sustainable and efficient datacenter systems. Her papers have received three IEEE Micro Top Picks and an HPCA Best Paper Award. Several of her projects, including Splitwise and power stabilization in AI training datacenters, have had far-reaching impact on the research community and are deployed broadly across industry. Esha received her Ph.D. from The University of Texas at Austin in 2019 and has published extensively in leading venues including ISCA, ASPLOS, MICRO, HPCA, NSDI, and SC.