Overview
Learn about an approach to billion-scale hybrid retrieval that unifies dense and sparse vectors, keywords, filters, and user-defined scoring functions in a single distributed query. This 41-minute conference talk from Haystack EU 2025 explores the limitations of current vector databases, where similarity does not always equal relevance, and examines how existing "hybrid retrieval" implementations run separate queries across disjoint indexes and combine the results through late fusion. Discover a fundamentally different model that removes the need for separate indexes and late fusion by designing the storage format and query engine from the ground up for unified retrieval at scale. See how this approach offers a flexible query language that lets search practitioners tune relevance for their own domains without managing and synchronizing multiple data stores, and how it addresses the challenges of traditional embedding-based retrieval in large-scale search systems.
Syllabus
Haystack EU 2025: Marek Galovic – Billion-scale hybrid retrieval in a single query
Taught by
OpenSource Connections