The CoRa Tensor Compiler: Compilation for Ragged Tensors with Minimal Padding
There is often variation in the shape and size of input data used for deep learning. In many cases, such data can be represented using tensors with non-uniform shapes, or ragged tensors. Due to limited and non-portable support for efficient execution on ragged tensors, current deep learning frameworks generally use techniques such as padding and masking to make the data shapes uniform and then offload the computations to optimized kernels for dense tensor algebra. Such techniques can, however, lead to substantial wasted computation and, therefore, a loss in performance. This paper presents CoRa, a tensor compiler that allows users to easily generate efficient code for ragged tensor operators targeting a wide range of CPUs and GPUs. Evaluating CoRa on a variety of ragged tensor operators as well as on an encoder layer of the transformer model, we find that CoRa (i) performs competitively with hand-optimized implementations of the operators and the transformer encoder and (ii) achieves a 1.6× geomean speedup over PyTorch for the encoder on an Nvidia GPU and a 1.37× geomean speedup over TensorFlow for the multi-head attention module used in transformers on a 64-core ARM CPU.
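To make the padding overhead concrete, here is a minimal NumPy sketch (not CoRa's API; the sequence lengths and hidden size below are made up for illustration) of the dense-tensor workaround the abstract describes:

    import numpy as np

    # A ragged batch: three sequences of different lengths, hidden size 8.
    seqs = [np.random.rand(3, 8), np.random.rand(7, 8), np.random.rand(5, 8)]

    # Dense-tensor frameworks typically pad every sequence to the longest one.
    max_len = max(s.shape[0] for s in seqs)
    padded = np.zeros((len(seqs), max_len, 8))
    for i, s in enumerate(seqs):
        padded[i, :s.shape[0]] = s

    # Dense kernels now process len(seqs) * max_len rows,
    # even though only the true sequence lengths hold real data.
    useful = sum(s.shape[0] for s in seqs)     # 15 rows of real data
    total = padded.shape[0] * padded.shape[1]  # 21 rows computed
    print(f"wasted computation: {1 - useful / total:.0%}")  # ~29%

Every dense kernel applied to padded processes the zeroed rows as well; CoRa's approach is instead to compile operators directly for the ragged layout with minimal padding.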
Author Information
Pratik Fegade (Carnegie Mellon University)
I am a fifth-year PhD student in the Computer Science Department at CMU, advised by Prof. Todd Mowry and Prof. Phil Gibbons. My main research focus is building compiler analysis techniques to understand and optimize programs at semantically higher levels than is currently possible.
Tianqi Chen (CMU)
Phillip Gibbons (CMU)
Todd Mowry (Carnegie Mellon University)
Related Events (a corresponding poster, oral, or spotlight)
- 2022 Oral: The CoRa Tensor Compiler: Compilation for Ragged Tensors with Minimal Padding
  Mon. Aug 29th, 9:15 -- 9:33 PM, Room: Exhibit Hall A
More from the Same Authors
- 2022 Poster: DietCode: Automatic Optimization for Dynamic Tensor Programs
  Bojian Zheng · Ziheng Jiang · Cody Hao Yu · Haichen Shen · Joshua Fromm · Yizhi Liu · Yida Wang · Luis Ceze · Tianqi Chen · Gennady Pekhimenko
- 2022 Oral: DietCode: Automatic Optimization for Dynamic Tensor Programs
  Bojian Zheng · Ziheng Jiang · Cody Hao Yu · Haichen Shen · Joshua Fromm · Yizhi Liu · Yida Wang · Luis Ceze · Tianqi Chen · Gennady Pekhimenko
- 2021 Poster: Cortex: A Compiler for Recursive Deep Learning Models
  Pratik Fegade · Tianqi Chen · Phillip Gibbons · Todd Mowry
- 2021 Oral: Cortex: A Compiler for Recursive Deep Learning Models
  Pratik Fegade · Tianqi Chen · Phillip Gibbons · Todd Mowry
- 2021: Q&A for Tianqi Chen
  Tianqi Chen
- 2021: TVM
  Tianqi Chen
- 2021 Symposium: Chips and Compilers Symposium
  Mu Li · Tianqi Chen