Practical Multimodal Embeddings - Video Recommendations and Cross-Modal Search
Qdrant - Vector Database & Search Engine via YouTube
Overview
Learn to build practical multimodal embedding systems for video recommendations and cross-modal search in this 16-minute conference talk from Qdrant's Vector Space Day 2025. Explore real developer workflows that embed text, audio, images, and video to power semantic search and recommendation engines, using the TwelveLabs Embed API (Marengo-retrieval-2.7) together with the Qdrant vector database for fast similarity queries with metadata filtering. Discover how to replace shallow metadata matching with semantic relevance in video recommendation systems, and examine cross-modal retrieval use cases such as image-to-video and audio-to-video search. Learn chunking strategies for long video content, labeling techniques, and evaluation methods that accurately reflect user intent. Gain actionable patterns for designing indexing pipelines, schema architecture, and retrieval queries that deliver meaningful results across modalities. The talk covers basic embedding concepts and Python implementation, and targets intermediate practitioners.
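To make the retrieval pattern described above concrete, here is a minimal, dependency-free Python sketch of the core idea: video chunks are stored as embedding vectors with metadata payloads, and a query combines a metadata filter with cosine-similarity ranking. This mirrors how a Qdrant collection with `query_filter` plus vector search would behave; the tiny 3-dimensional vectors, payload fields (`video_id`, `start_sec`, `modality`), and the `search` helper are illustrative assumptions, standing in for the 1024-dimensional Marengo-retrieval-2.7 embeddings and the actual Qdrant client API used in the talk.

```python
import math

def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

# Toy "index": each point is (vector, payload), standing in for Qdrant points.
# Vectors are tiny stand-ins for real multimodal video-chunk embeddings.
points = [
    ([0.9, 0.1, 0.0], {"video_id": "v1", "start_sec": 0, "modality": "video"}),
    ([0.1, 0.9, 0.0], {"video_id": "v1", "start_sec": 6, "modality": "audio"}),
    ([0.8, 0.2, 0.1], {"video_id": "v2", "start_sec": 0, "modality": "video"}),
]

def search(query_vec, must=None, limit=3):
    """Filtered similarity search: apply the metadata filter first,
    then rank the surviving candidates by cosine similarity."""
    must = must or {}
    candidates = [
        (vec, payload)
        for vec, payload in points
        if all(payload.get(k) == v for k, v in must.items())
    ]
    ranked = sorted(candidates, key=lambda p: cosine(query_vec, p[0]), reverse=True)
    return [payload for _, payload in ranked[:limit]]

# Cross-modal query: e.g. an image embedding that lands near video-chunk
# embeddings in the shared space, restricted to "video" chunks only.
hits = search([1.0, 0.0, 0.0], must={"modality": "video"}, limit=2)
print(hits)  # v1's opening chunk ranks first, then v2's
```

In a production pipeline the same shape holds: the TwelveLabs Embed API produces the vectors, Qdrant stores them with payloads, and the metadata filter (by modality, video, or time range) narrows the candidate set before similarity ranking.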
Syllabus
Practical Multimodal Embeddings: Video Recommendations and Cross-Modal Search | TwelveLabs | Yadav
Taught by
Qdrant - Vector Database & Search Engine