Class Central is learner-supported. When you buy through links on our site, we may earn an affiliate commission.

Coursera

Unify Modalities: Cross-Modal Retrieval

Coursera via Coursera

Overview

Coursera Flash Sale
40% Off Coursera Plus for 3 Months!
Grab it
Transform how AI systems understand and connect different data modalities. This course empowers machine learning professionals to build cutting-edge cross-modal retrieval systems that bridge the gap between text and images. You'll master the technical implementation of approximate nearest-neighbor search algorithms and design sophisticated attention mechanisms that fuse visual and textual information. Through hands-on work with production-scale tools like FAISS and real datasets like Flickr30K, you'll develop the expertise to create intelligent systems that understand content across modalities—enabling breakthrough applications in search, recommendation, and content understanding that mirror how humans naturally process diverse information types.

Syllabus

  • Module 1: ANN Cross-Modal Search - Foundation
    • Learners will build foundational understanding of cross-modal retrieval systems and implement approximate nearest-neighbor search algorithms using FAISS for production-scale similarity search across multimodal embeddings.
  • Module 2: Attention-Based Fusion - Application & Assessment
    • Learners will design and implement sophisticated attention-based fusion algorithms that intelligently combine visual and textual embeddings, mastering the creation of multimodal neural architectures for advanced cross-modal AI applications.

Taught by

Hurix Digital

Reviews

Start your review of Unify Modalities: Cross-Modal Retrieval

Never Stop Learning.

Get personalized course recommendations, track subjects and courses with reminders, and more.

Someone learning on their laptop while sitting on the floor.