Unlock Multimodal Search

Coursera via Coursera

Details

Provider: Coursera
Pricing: Paid Course
Languages: English
Certificate: Certificate Available
Effort: 1 hour 16 minutes
Sessions: Self-Paced
Level: Intermediate
Subtitles: English

Overview

"Unlock Multimodal Search" is an intermediate, hands-on course for developers and ML engineers ready to build the next generation of AI-powered search. Text-only search is no longer enough; this 90-minute course will teach you how to create applications that can search across different data types, such as finding text from an image. Using the powerful open-source vector database Weaviate, you will move from theory to a functioning demonstration. This course requires basic skills in Docker, APIs, Python, and the command line (CLI). Familiarity with vector databases. Docker Desktop must be installed. This course is focused on execution. You will learn to configure a Weaviate schema to handle both image and text embeddings for a single object, ingest multimodal data, and perform powerful cross-modal queries. Through a final, hands-on project that mirrors a real-world job task, you will not only build an image-to-text search demo but also learn how to measure its accuracy with precision metrics. By the end, you'll be equipped to architect and validate sophisticated, multimodal AI applications.

Syllabus

Storing and Ingesting Multimodal Data

This module provides the foundation for multimodal search. You will learn how to configure a Weaviate instance and design a schema that can store both image and text embeddings for a single object, preparing your database for cross-modal queries.
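As a rough illustration of the schema design this module describes, the sketch below builds a Weaviate class definition with one text property and one image property. It assumes the `multi2vec-clip` vectorizer module is enabled on the instance; the listing does not say which vectorizer the course uses. The class name `Product` and the property names `caption` and `image` are hypothetical. The code only constructs the payloads and does not contact a server.

```python
import base64
import json


def build_multimodal_class(class_name: str = "Product") -> dict:
    """Return a class definition holding a text and an image property.

    Assumption: the Weaviate instance runs the multi2vec-clip module,
    which embeds the listed text and image fields into one vector space.
    """
    return {
        "class": class_name,
        "vectorizer": "multi2vec-clip",
        "moduleConfig": {
            "multi2vec-clip": {
                "textFields": ["caption"],   # hypothetical property name
                "imageFields": ["image"],    # hypothetical property name
            }
        },
        "properties": [
            {"name": "caption", "dataType": ["text"]},
            # Weaviate stores images as base64 strings in blob properties.
            {"name": "image", "dataType": ["blob"]},
        ],
    }


def build_object(caption: str, image_bytes: bytes) -> dict:
    """Pair a caption with its base64-encoded image for ingestion."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return {"caption": caption, "image": encoded}


schema = build_multimodal_class()
print(json.dumps(schema, indent=2))
```

The resulting dictionary is the kind of payload that would be sent to the instance's schema endpoint, and each dictionary from `build_object` would be one ingested object. How the course itself registers the schema and loads data may differ.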

Cross-Modal Querying and Analysis

Now that your database is configured, this module focuses on execution and validation. You will learn how to perform cross-modal queries to find text from an image and, critically, how to analyze the accuracy of your results using precision metrics.
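The two steps in this module can be sketched as follows: first build an image-to-text query, then score the returned results. This is a minimal sketch, not the course's own code. It assumes a class named `Product` with a `caption` property (both hypothetical) and a multi2vec module that enables the `nearImage` GraphQL operator. It also assumes "precision metrics" means precision at k, which the listing does not specify. The result IDs in the example are made up for illustration.

```python
def build_near_image_query(class_name: str, image_b64: str, limit: int = 5) -> str:
    """Build a GraphQL query that finds the objects closest to an image."""
    return (
        '{ Get { %s(nearImage: {image: "%s"}, limit: %d) '
        "{ caption _additional { distance } } } }"
        % (class_name, image_b64, limit)
    )


def precision_at_k(retrieved: list, relevant: set, k: int) -> float:
    """Fraction of the top-k retrieved items that are relevant."""
    if k <= 0:
        raise ValueError("k must be positive")
    hits = sum(1 for item in retrieved[:k] if item in relevant)
    # Dividing by k (not by the number returned) penalises short result lists.
    return hits / k


def mean_precision_at_k(runs: list, k: int) -> float:
    """Average precision@k over (retrieved, relevant) pairs, one per query."""
    if not runs:
        raise ValueError("runs must not be empty")
    return sum(precision_at_k(r, rel, k) for r, rel in runs) / len(runs)


# Illustrative data only: caption IDs returned for two image queries.
runs = [
    (["cap_3", "cap_7", "cap_1", "cap_9"], {"cap_3", "cap_1"}),
    (["cap_2", "cap_5", "cap_8", "cap_4"], {"cap_5"}),
]
query = build_near_image_query("Product", "AAE=", limit=3)
score = mean_precision_at_k(runs, k=3)
print(query)
print(round(score, 3))
```

Precision at k needs a labelled set of relevant captions for each query image, so in practice the evaluation is only as trustworthy as those labels.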