MLSys Scalability, Latency, Flexibility: The Case for Similarity Search as a Service

Contributed Talk 10 in Workshop: Personalized Recommendation Systems and Algorithms

Scalability, Latency, Flexibility: The Case for Similarity Search as a Service

Amir Sadoughi

Abstract:

Modern deep learning models can represent arbitrary objects as vectors, also known as embeddings. Software applications can use these models and their embeddings to power a variety of use cases, including personalization, recommendation systems, image search, and anomaly detection. Until now, software engineers have built such systems by integrating open source k-nearest neighbor libraries with an off-the-shelf web server. However, this approach presents serious challenges in scalability, latency, and flexibility. To address these challenges, we built Pinecone, which provides similarity search as a service.
