How We Built Vector Search in the Cloud

Real-time updates: Rockset supports inserts, updates and deletes of vectors and metadata. It’s built on RocksDB, an open-source embedded storage engine designed for mutability. When a vector is inserted or modified, Rockset uses FAISS to find the Voronoi cell the vector falls in, that is, its closest centroid, and then adds or updates that centroid and the vector's residual value in the search index. New data is reflected in searches within milliseconds.
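To make the insert and update path concrete, here is a minimal sketch in plain Python. It is not Rockset's implementation: a brute-force nearest-centroid scan stands in for FAISS, a dictionary stands in for the search index, the residual is assumed to be the vector minus its centroid (the usual meaning in inverted-file indexes), and the names `assign_to_cell`, `upsert`, `delete` and `index` are hypothetical.

```python
from math import dist

def assign_to_cell(vector, centroids):
    """Return (cell_id, residual) for the centroid nearest to `vector`.

    Brute-force stand-in for the FAISS lookup; the residual is assumed
    to be the vector minus its centroid.
    """
    cell_id = min(range(len(centroids)), key=lambda i: dist(vector, centroids[i]))
    residual = [v - c for v, c in zip(vector, centroids[cell_id])]
    return cell_id, residual

# Hypothetical in-memory stand-in for the search index:
# document id -> (cell id, residual)
index = {}

def upsert(doc_id, vector, centroids):
    """Insert a new vector or overwrite an existing one in place."""
    index[doc_id] = assign_to_cell(vector, centroids)

def delete(doc_id):
    """Remove a document's entry if it exists."""
    index.pop(doc_id, None)

centroids = [[0.0, 0.0], [10.0, 10.0]]
upsert("doc1", [9.0, 11.0], centroids)   # lands in the cell of centroid 1
upsert("doc1", [1.0, -1.0], centroids)   # the update moves doc1 to centroid 0
```

Because each entry is keyed by document id, an update is simply an overwrite, which is the kind of mutation a mutable storage engine handles without rebuilding the index.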
Hybrid search with SQL: Rockset stores and indexes vectors alongside text, JSON and time-series data. A query uses the search index and the similarity index in parallel. FAISS identifies the K centroids nearest to the target vector, and the search index then filters results by those K centroids together with the metadata terms, an approach known as single-stage filtering.
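The idea of single-stage filtering can be sketched as follows. This is an illustration under stated assumptions, not Rockset's query engine: a brute-force scan again replaces FAISS, the search index is modeled as two hypothetical posting-list dictionaries (`cell_postings` and `term_postings`), and every function and variable name is invented for the example.

```python
from math import dist

def nearest_cells(query, centroids, k):
    """Return the ids of the k centroids closest to the query vector."""
    order = sorted(range(len(centroids)), key=lambda i: dist(query, centroids[i]))
    return order[:k]

def hybrid_search(query, term, centroids, cell_postings, term_postings,
                  vectors, k_cells, limit):
    """Filter by nearest cells and a metadata term together, then rank."""
    candidates = set()
    for cell in nearest_cells(query, centroids, k_cells):
        candidates |= cell_postings.get(cell, set())
    # The metadata filter is applied to the same candidate set,
    # not as a separate pass over already-ranked results.
    candidates &= term_postings.get(term, set())
    return sorted(candidates, key=lambda d: dist(query, vectors[d]))[:limit]

centroids = [[0, 0], [10, 10], [20, 20]]
vectors = {"a": [1, 1], "b": [2, 0], "c": [9, 9], "d": [21, 19]}
cell_postings = {0: {"a", "b"}, 1: {"c"}, 2: {"d"}}    # cell id -> doc ids
term_postings = {"color=red": {"a", "c", "d"}}         # metadata term -> doc ids

results = hybrid_search([1, 0], "color=red", centroids, cell_postings,
                        term_postings, vectors, k_cells=2, limit=2)
print(results)  # ['a', 'c']
```

Document "b" is close to the query but fails the metadata term, and "d" matches the term but sits outside the two nearest cells, so neither is ever ranked.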
Separation of indexing and search: With compute-compute separation, similarity indexing of vectors does not affect search performance. Ingestion and indexing run on different virtual instances (clusters) than search, which keeps performance predictable as you scale.

Tudor Bosman, Chief Architect at Rockset

Tudor Bosman leads architecture for Rockset's search and analytics database. Prior to Rockset, Tudor was an engineer at Facebook, where he spearheaded Unicorn, Facebook's search engine, and built infrastructure for the Facebook AI Research Lab and Facebook's applied machine learning initiative. Prior to Facebook, Tudor worked at Google on Gmail's storage and indexing backend, and at Oracle on database server internals. Tudor holds an MS in Computer Science from Stanford and a BS in Computer Science from Caltech.

Wednesday, December 6th at 10am PT/1 pm ET

About the Speakers

Tudor Bosman, Chief Architect at Rockset

Daniel Latta-Lin, Engineer
