Overview
This conference talk from FAST '25 presents FusionANNS, a system for high-throughput, low-latency billion-scale vector search built on CPU/GPU collaborative filtering and re-ranking. Researchers from Huazhong University of Science and Technology and Huawei Technologies explain how they address the performance, cost, and accuracy challenges of Approximate Nearest Neighbor Search (ANNS) services. Their approach uses SSDs and a single entry-level GPU, and combines three techniques: multi-tiered indexing to prevent data swapping, heuristic re-ranking to eliminate unnecessary I/O operations, and redundancy-aware I/O deduplication to improve I/O efficiency. The 16-minute presentation reports that FusionANNS achieves 9.4-13.1× higher queries per second (QPS) and 5.7-8.8× higher cost efficiency than SPANN, and 2-4.9× higher QPS and 2.3-6.8× better cost efficiency than RUMMY, while maintaining low latency and high accuracy.
Syllabus
FAST '25 - Towards High-throughput and Low-latency Billion-scale Vector Search via CPU/GPU...
Taught by
USENIX