Overview
Explore a 14-minute technical video breakdown of Meta's Segment Anything Model 2 (SAM2), which extends the original SAM from image to video segmentation. Learn about the challenges of video segmentation and understand the model's architecture, including the image encoder, memory encoder, memory bank, and memory attention mechanisms. Discover how the data engine produced the SA-V dataset, the largest video segmentation dataset to date, and examine the experimental results that demonstrate SAM2's capabilities. Delivered by a machine learning researcher with 15 years of software engineering experience and a Master's in Computer Vision and Robotics, the video dives deep into promptable visual segmentation and the end-to-end architecture that makes video object segmentation possible.
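To give a feel for how the components named above fit together, here is a minimal conceptual sketch of SAM2's per-frame loop: the image encoder extracts frame features, memory attention conditions them on a small FIFO memory bank of past frames, the mask decoder predicts a mask, and the memory encoder writes a new memory back to the bank. This is illustrative toy code only; the function names and the toy arithmetic are assumptions for exposition, not the actual SAM2 implementation.

```python
# Conceptual sketch of SAM2-style video segmentation (toy stand-ins,
# not the real transformer modules).
from collections import deque

MEMORY_BANK_SIZE = 7  # SAM2 keeps only a small number of recent memories


def image_encoder(frame):
    # Stand-in for the image encoder: turn raw pixels into features.
    return [x * 0.1 for x in frame]


def memory_attention(features, memory_bank):
    # Condition current-frame features on stored memories (toy sum here;
    # the real model uses cross-attention over the memory bank).
    fused = list(features)
    for mem in memory_bank:
        fused = [f + m / len(memory_bank) for f, m in zip(fused, mem)]
    return fused


def mask_decoder(conditioned_features):
    # Stand-in for the mask decoder: threshold features to a binary mask.
    return [1 if f > 0.5 else 0 for f in conditioned_features]


def memory_encoder(conditioned_features, mask):
    # Fuse features with the predicted mask into a compact memory entry.
    return [f * m for f, m in zip(conditioned_features, mask)]


def segment_video(frames):
    # FIFO memory bank: old memories fall out automatically.
    memory_bank = deque(maxlen=MEMORY_BANK_SIZE)
    masks = []
    for frame in frames:
        feats = image_encoder(frame)
        cond = memory_attention(feats, memory_bank)
        mask = mask_decoder(cond)
        memory_bank.append(memory_encoder(cond, mask))
        masks.append(mask)
    return masks
```

The key design point the video elaborates on is exactly this loop: unlike the original single-image SAM, each frame's prediction is informed by memories of earlier frames, which is what allows objects to be tracked through occlusion and appearance change.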
Syllabus
- Intro
- Challenges with video segmentation
- Overview of SAM2
- Promptable Visual Segmentation
- SAM2 Model
- End-to-end architecture
- Image Encoder
- Memory Encoder
- Memory Bank
- Memory Attention
- Training
- Data Engine
- Segment Anything Video (SA-V) dataset
- Experiments
Taught by
AI Bites