In recent years the efficiency landscape of deep learning has been transformed. We can now scale training to trillion-parameter models, and run inference on microcontrollers with just a few kilobytes of memory. But progress in ML efficiency has largely ignored the world of graph neural networks (GNNs) in favor of more established neural architectures and tasks. This must change if GNNs are to deliver on their promise of transforming application domains ranging from drug discovery to recommender systems. Today GNNs are simply too difficult to scale and deploy.
In this talk, I will describe two of our recent steps towards addressing the open challenges of GNN efficiency. The first is Degree-Quant, one of the few GNN-specific quantization schemes, which we show delivers up to 4.7x inference acceleration with negligible impact on accuracy. The second is a new lightweight GNN architecture, Efficient Graph Convolutions (EGC). EGC lowers memory requirements from O(E) to O(V), up to a quadratic saving on dense graphs, while we find it still achieves SOTA-level GNN performance.
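To make the O(E)-versus-O(V) distinction concrete, here is a toy NumPy sketch (illustrative only, not the EGC implementation; all names are assumptions). A layer that materializes a per-edge message tensor needs intermediate memory proportional to the number of edges, while a layer that transforms each node once and accumulates results in place needs intermediate memory proportional to the number of nodes:

```python
import numpy as np

# Toy graph: 3 nodes, 4 directed edges given as (src, dst) pairs.
edges = np.array([[0, 1], [1, 2], [2, 0], [0, 2]])
x = np.random.rand(3, 4)   # node features: V=3 nodes, 4 dims
W = np.random.rand(4, 4)   # a single shared weight matrix

# O(E)-memory pattern: materialize one message row per edge, as
# message-passing layers commonly do.
messages = (x @ W)[edges[:, 0]]      # shape (E, 4): one row per edge
out_edge = np.zeros_like(x)
np.add.at(out_edge, edges[:, 1], messages)

# O(V)-memory pattern: transform each node once, then accumulate
# over edges without ever storing a per-edge tensor (real systems
# fuse this loop into a sparse scatter/gather kernel).
h = x @ W                            # shape (V, 4): one row per node
out_node = np.zeros_like(h)
for s, d in edges:
    out_node[d] += h[s]
```

Both patterns compute the same aggregation here; the difference is the peak size of the intermediates, which is what matters for large graphs where E can approach V^2.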
We hope these two methods can serve as a useful foundation for the scalable and efficient GNNs that will be required as this field continues to evolve.