Overview
Learn how Pinterest integrated Large Language Models into its search infrastructure to improve relevance scoring and user experience in this 18-minute conference talk. Discover Pinterest's approach to combining search queries with rich multimodal content, including visual captions, link-based text, and user-curation signals, through LLM integration. Explore the semi-supervised learning framework that lets the system scale to large multilingual datasets beyond English and beyond the limits of human labeling.

Understand how Pinterest distilled LLM-driven models into efficient architectures for real-time serving while preserving the performance gains. Examine the practical implementation challenges and solutions, including VLM-generated captions and user actions as content annotations, knowledge-distillation techniques for productionizing LLMs, and the development of relevance-tuned LLM embeddings as general-purpose semantic representations.

Gain insights from experimental validation and large-scale deployment results that show substantial improvements in search relevance for Pinterest's global user base, with detailed coverage of the search backend architecture, relevance modeling approaches, and real-world performance metrics.
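The knowledge-distillation idea mentioned above, training a small student model to match a large teacher's softened output distribution, can be sketched in a few lines. This is a minimal illustration of the general technique, not Pinterest's actual pipeline; the temperature value and the toy relevance logits are assumptions.

```python
import numpy as np

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax; higher temperatures give softer targets."""
    z = np.asarray(logits, dtype=float) / temperature
    z -= z.max()  # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum()

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """Cross-entropy of the student against the teacher's softened
    distribution -- the core objective in knowledge distillation."""
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    return -float(np.sum(p_teacher * np.log(p_student + 1e-12)))

# Toy example: teacher logits over three relevance grades (hypothetical).
teacher = [4.0, 1.0, 0.5]
aligned = distillation_loss([4.0, 1.0, 0.5], teacher)      # student agrees
misaligned = distillation_loss([0.5, 1.0, 4.0], teacher)   # student disagrees
print(aligned < misaligned)  # True: agreement yields the lower loss
```

In production the student would be a compact ranking model trained on many such teacher-labeled (query, pin) pairs, which is what makes real-time serving feasible.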
Syllabus
[00:00] Introduction to Pinterest and its search functionality.
[01:52] Overview of the Pinterest search backend architecture.
[02:29] The search relevance model.
[02:55] Key learnings from using LLMs for search relevance.
[05:04] The value of VLM-generated captions and user actions as content annotations.
[07:16] Productionizing LLMs with knowledge distillation.
[12:14] The utility of relevance-tuned LLM embeddings as general-purpose semantic representations.
[13:55] Q&A session.
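The syllabus item on relevance-tuned LLM embeddings as general-purpose semantic representations boils down to ranking candidates by vector similarity. A minimal sketch follows; the 3-dimensional toy vectors and function names are illustrative (real embeddings come from the tuned model and have far more dimensions).

```python
import numpy as np

def cosine(u, v):
    """Cosine similarity between two embedding vectors."""
    u, v = np.asarray(u, dtype=float), np.asarray(v, dtype=float)
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

def rank_by_relevance(query_vec, pin_vecs):
    """Return candidate indices ordered by similarity to the query."""
    scores = [cosine(query_vec, p) for p in pin_vecs]
    return sorted(range(len(pin_vecs)), key=lambda i: scores[i], reverse=True)

# Hypothetical 3-d embeddings for a query and three candidate pins.
query = [1.0, 0.2, 0.0]
pins = [[0.9, 0.3, 0.1],    # points in nearly the same direction
        [0.0, 1.0, 0.0],    # mostly unrelated
        [-1.0, 0.0, 0.2]]   # opposite direction
print(rank_by_relevance(query, pins))  # [0, 1, 2]
```

Because the embeddings are tuned on relevance labels, the same vectors can be reused for related tasks (retrieval, deduplication, recommendations) without task-specific retraining.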
Taught by
AI Engineer