Thanks to expanding cloud infrastructure, today's neural networks achieve increasingly high performance when trained on the cloud. Model researchers spend months competing for a few extra percent of model accuracy. However, when these models are deployed on edge devices in practice, performance often drops abruptly by over 10% without obvious reasons. The key challenge is that there is little visibility into ML inference execution on edge devices, and very little awareness of potential issues during the edge deployment process. ML-EXray provides visibility into layer-level details of the ML execution and helps developers analyze and debug cloud-to-edge deployment issues. More often than not, the cause lies not only in the model itself but in every operation throughout the data flow and the deployment process. Evaluations show that ML-EXray can effectively catch deployment issues such as pre-processing bugs, quantization issues, and suboptimal kernels. Using ML-EXray, users need to write fewer than 15 lines of code to fully examine the edge deployment pipeline. By eradicating these issues, ML-EXray can correct model performance by up to 30%, pinpoint error-prone layers, and guide users to optimize kernel execution latency by two orders of magnitude. Code and APIs will be released as a multi-lingual instrumentation library and a Python deployment validation library.
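To make the idea of layer-level visibility concrete, the following is a minimal sketch of how per-layer outputs from a reference pipeline and an edge pipeline might be compared to locate the first layer where they diverge. It is not ML-EXray's API: the function names, the log format (a dict mapping layer names to flat lists of floats), the normalized RMSE metric, and the 0.05 threshold are all assumptions chosen for illustration.

```python
import math


def normalized_rmse(reference, observed):
    """RMSE between two equal-length vectors, divided by the reference value range."""
    if len(reference) != len(observed):
        raise ValueError("layer outputs differ in length")
    mse = sum((r - o) ** 2 for r, o in zip(reference, observed)) / len(reference)
    value_range = max(reference) - min(reference)
    # Fall back to plain RMSE when the reference output is constant.
    return math.sqrt(mse) / value_range if value_range > 0 else math.sqrt(mse)


def first_divergent_layer(reference_log, edge_log, threshold=0.05):
    """Return (layer_name, error) for the first layer whose error exceeds
    the threshold, or None if every layer stays within it.

    Layers are visited in the insertion order of reference_log, which is
    assumed to match execution order.
    """
    for name, ref_values in reference_log.items():
        err = normalized_rmse(ref_values, edge_log[name])
        if err > threshold:
            return name, err
    return None


# Hypothetical per-layer logs (made-up values, not measured data).
reference_log = {"conv1": [0.0, 0.5, 1.0], "conv2": [0.2, 0.4, 0.8], "fc": [0.1, 0.9]}
edge_log = {"conv1": [0.0, 0.5, 1.0], "conv2": [0.2, 0.7, 0.8], "fc": [0.6, 0.4]}

result = first_divergent_layer(reference_log, edge_log)
print(result)
```

With these made-up logs, the first layer reported is `conv2`, since `conv1` matches exactly. Reporting only the earliest divergent layer is a deliberate choice in this sketch: errors in later layers may simply be propagated from an earlier one.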
Author Information
Hang Qiu (Stanford University)
Ioanna Vavelidou (Stanford University)
Jian Li (Google Inc.)
Evgenya Pergament (Stanford University)
Pete Warden (Google)
Sandeep Chinchali (Stanford University)
Zain Asgar (Stanford University)
Sachin Katti (Stanford University)
Related Events (a corresponding poster, oral, or spotlight)
- 2022 Oral: ML-EXray: Visibility into ML Deployment on the Edge
  Mon. Aug 29th 04:39 -- 04:57 PM Room Exhibit Hall A
More from the Same Authors
- 2023 Workshop: The 3rd On-Device Intelligence Workshop
  Vijay Janapa Reddi · Paul Whatmough · Vikas Chandra · Pete Warden · Brian Plancher · Colby Banbury · Matthew Stewart
- 2023: Opening Remarks
  Pete Warden
- 2021 Workshop: 2nd On-Device Intelligence Workshop
  Paul Whatmough · Vijay Janapa Reddi · Chuteng Zhou · Igor Federov · Matthew Mattina · Pete Warden · Ganesh Venkatesh · Vikas Chandra
- 2021 Poster: Characterizing and Taming Model Instability Across Edge Devices
  Eyal Cidon · Evgenya Pergament · Zain Asgar · Asaf Cidon · Sachin Katti
- 2021 Poster: TensorFlow Lite Micro: Embedded Machine Learning for TinyML Systems
  Robert David · Jared Duke · Advait Jain · Vijay Janapa Reddi · Nat Jeffries · Jian Li · Nick Kreeger · Ian Nappier · Meghna Natraj · Tiezhen Wang · Pete Warden · Rocky Rhodes
- 2021 Oral: TensorFlow Lite Micro: Embedded Machine Learning for TinyML Systems
  Robert David · Jared Duke · Advait Jain · Vijay Janapa Reddi · Nat Jeffries · Jian Li · Nick Kreeger · Ian Nappier · Meghna Natraj · Tiezhen Wang · Pete Warden · Rocky Rhodes
- 2021 Oral: Characterizing and Taming Model Instability Across Edge Devices
  Eyal Cidon · Evgenya Pergament · Zain Asgar · Asaf Cidon · Sachin Katti
- 2020 Workshop: On-Device Intelligence
  Vikas Chandra · Pete Warden · Ganesh Venkatesh · Yingyan Lin