Practical Multimodal Embeddings - Video Recommendations and Cross-Modal Search
Qdrant - Vector Database & Search Engine via YouTube
Overview
Learn to build practical multimodal embedding systems for video recommendations and cross-modal search in this 16-minute conference talk from Qdrant's Vector Space Day 2025. Explore real developer workflows that embed text, audio, images, and video to power semantic search and recommendation engines, using the TwelveLabs Embed API (Marengo-retrieval-2.7) together with the Qdrant vector database for fast similarity queries with metadata filtering. Discover how to replace shallow metadata matching with semantic relevance in video recommendation systems, and examine cross-modal retrieval use cases such as image-to-video and audio-to-video search. Learn chunking strategies for long video content, labeling techniques, and evaluation methods that accurately reflect user intent. Gain actionable patterns for designing indexing pipelines, schema architecture, and retrieval queries that deliver meaningful results across modalities. The talk covers basic embedding concepts and Python implementation, and targets intermediate practitioners.
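To make the retrieval pattern described above concrete, here is a minimal, dependency-free Python sketch of the core idea: video chunks are stored as embedding vectors with metadata payloads, and a query combines a metadata filter with cosine-similarity ranking. This mirrors how a Qdrant collection with `query_filter` plus vector search would behave; the tiny 3-dimensional vectors, payload fields (`video_id`, `start_sec`, `modality`), and the `search` helper are illustrative assumptions, standing in for the 1024-dimensional Marengo-retrieval-2.7 embeddings and the actual Qdrant client API used in the talk.

```python
import math

def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

# Toy "index": each point is (vector, payload), standing in for Qdrant points.
# Vectors are tiny stand-ins for real multimodal video-chunk embeddings.
points = [
    ([0.9, 0.1, 0.0], {"video_id": "v1", "start_sec": 0, "modality": "video"}),
    ([0.1, 0.9, 0.0], {"video_id": "v1", "start_sec": 6, "modality": "audio"}),
    ([0.8, 0.2, 0.1], {"video_id": "v2", "start_sec": 0, "modality": "video"}),
]

def search(query_vec, must=None, limit=3):
    """Filtered similarity search: apply the metadata filter first,
    then rank the surviving candidates by cosine similarity."""
    must = must or {}
    candidates = [
        (vec, payload)
        for vec, payload in points
        if all(payload.get(k) == v for k, v in must.items())
    ]
    ranked = sorted(candidates, key=lambda p: cosine(query_vec, p[0]), reverse=True)
    return [payload for _, payload in ranked[:limit]]

# Cross-modal query: e.g. an image embedding that lands near video-chunk
# embeddings in the shared space, restricted to "video" chunks only.
hits = search([1.0, 0.0, 0.0], must={"modality": "video"}, limit=2)
print(hits)  # v1's opening chunk ranks first, then v2's
```

In a production pipeline the same shape holds: the TwelveLabs Embed API produces the vectors, Qdrant stores them with payloads, and the metadata filter (by modality, video, or time range) narrows the candidate set before similarity ranking.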
Syllabus
Practical Multimodal Embeddings: Video Recommendations and Cross-Modal Search | TwelveLabs | Yadav
Taught by
Qdrant - Vector Database & Search Engine