Learn about SNARY, a high-performance SmartNIC-accelerated retrieval system for large-scale recommendation systems, in this 13-minute conference presentation from USENIX ATC '25. In the two-stage recommendation paradigm, the retrieval stage selects thousands of relevant candidates from a corpus of millions of items, and it is typically the performance bottleneck. SNARY is a generic system that targets this stage. Its architecture uses High-Bandwidth Memory (HBM) for corpus storage and scanning and provides two search engines: a data-parallel exact search and a Locality-Sensitive Hashing (LSH)-based fuzzy search. The talk also covers the pipeline-based approach to Top-K item selection and the data flow streaming design, implemented on commercial Xilinx SmartNICs. Compared to state-of-the-art hardware-based solutions, the reported results are 20.91%-83.88% lower latency and 1.26×-18.27× higher latency-bounded throughput for exact search, and 85.13%-87.40% lower latency and 20.18×-23.81× higher latency-bounded throughput for fuzzy search.
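To make the difference between the two search modes concrete, here is a minimal software sketch in Python. It is not SNARY's implementation: the listing does not say which LSH scheme or scoring function the system uses, so random-hyperplane LSH and inner-product scoring are assumptions made here, and every function name is hypothetical. The HBM scanning and hardware pipelining are not modeled at all. Exact search scores every item and keeps the Top-K; fuzzy search scores only the items that hash to the same bucket as the query.

```python
import heapq
import random


def dot(a, b):
    """Inner product of two equal-length vectors (assumed scoring function)."""
    return sum(x * y for x, y in zip(a, b))


def exact_top_k(query, corpus, k):
    """Exact search: scan every item and keep the k highest scores."""
    scored = ((dot(query, vec), idx) for idx, vec in enumerate(corpus))
    return [idx for _, idx in heapq.nlargest(k, scored)]


def make_hyperplanes(num_bits, dim, seed=0):
    """Random hyperplanes for sign-based LSH (an assumed scheme)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_bits)]


def signature(vec, hyperplanes):
    """One bit per hyperplane: which side of the plane the vector lies on."""
    return tuple(dot(vec, h) >= 0.0 for h in hyperplanes)


def build_lsh_index(corpus, hyperplanes):
    """Group item indices into buckets keyed by their LSH signature."""
    buckets = {}
    for idx, vec in enumerate(corpus):
        buckets.setdefault(signature(vec, hyperplanes), []).append(idx)
    return buckets


def fuzzy_top_k(query, corpus, buckets, hyperplanes, k):
    """Fuzzy search: score only items in the query's bucket, then take Top-K."""
    candidates = buckets.get(signature(query, hyperplanes), [])
    scored = ((dot(query, corpus[idx]), idx) for idx in candidates)
    return [idx for _, idx in heapq.nlargest(k, scored)]


if __name__ == "__main__":
    corpus = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [-1.0, 0.0]]
    planes = make_hyperplanes(num_bits=2, dim=2)
    index = build_lsh_index(corpus, planes)
    query = [1.0, 0.5]
    print("exact:", exact_top_k(query, corpus, k=2))
    print("fuzzy:", fuzzy_top_k(query, corpus, index, planes, k=2))
```

The fuzzy path trades recall for work: it may return fewer than k items, or miss a high-scoring item that landed in another bucket, but it touches only a fraction of the corpus.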