Fine-tuning Multimodal Embeddings for Custom Text-Image Pairs Using CLIP

Shaw Talebi via YouTube

Classroom Contents

  1. Intro - 0:00
  2. Multimodal Embeddings - 0:44
  3. 0-shot Use Cases - 2:30
  4. Limitations of CLIP - 3:50
  5. Fine-tuning CLIP - 5:14
  6. Step 1: Gather training data - 6:46
  7. Step 2: Preprocess data - 15:20
  8. Step 3: Define evals - 17:20
  9. Step 4: Fine-tune model - 19:22
  10. Step 5: Evaluate model - 26:04
