
Zero To Mastery

The Computer Vision Bootcamp

via Zero To Mastery Path

Overview

Learn how Computer Vision models work, including Vision Transformers and Meta’s SAM, and how they power real-world image systems. Then put your knowledge into practice by deploying a scalable computer vision pipeline on AWS using production-ready tools and infrastructure.
  • Understand how Vision Transformers process images
  • Break down attention math without the hand-waving
  • Use Meta’s SAM for prompt-based segmentation
  • Visualize and evaluate segmentation outputs
  • Connect detection models with segmentation pipelines
  • Build scalable computer vision workflows in Python
  • Deploy vision systems on AWS infrastructure
  • Design production-ready AI pipelines for real products
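The attention math mentioned above boils down to scaled dot-product attention over patch embeddings. As a rough, self-contained NumPy sketch (the array shapes and function name are illustrative, not taken from the course):

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d_k)) V."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)               # (n_q, n_k) attention logits
    scores -= scores.max(axis=-1, keepdims=True)  # subtract row max for numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax: each row sums to 1
    return weights @ V, weights

# Toy example: 4 patch embeddings of dimension 8, attending to themselves
rng = np.random.default_rng(0)
x = rng.standard_normal((4, 8))
out, w = scaled_dot_product_attention(x, x, x)  # self-attention
print(out.shape)        # (4, 8)
print(w.sum(axis=-1))   # each row of attention weights sums to 1
```

In a real ViT, Q, K, and V come from learned linear projections of the patch embeddings, and the quadratic cost of the `Q @ K.T` product is exactly the "quadratic operations" issue the syllabus covers.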

Syllabus

  •   Introduction
    • Introduction
    • What We're Building
    • Exercise: Meet Your Classmates and Instructor
    • Course Resources
    • ZTM Plugin + Understanding Your Video Player
    • Set Your Learning Streak Goal
  •   Mathematics behind Vision Transformers
    • Vision Transformers vs Convolutional Neural Networks
    • Quadratic Operations
    • Introduction to ViTs and Joint Training with Embeddings
    • Understanding Attention Mechanisms, Brief Summary
    • Understanding the Full ViT Pipeline
    • Let's Have Some Fun (+ More Resources)
  •   Mathematics Behind Meta's SAM (Segment Anything Model)
    • Introduction to Prompt Encoders for SAM
    • SAM AutoPrompt Mode
    • SAM Manual Click Mode
    • ViT Embeddings inside SAM
    • Calculating Attention Score for Vision Transformers in SAM
    • How SAM is Trained
    • Calculating Prompt Self Attention for SAM
    • Prompt Image Cross Attention
    • Image to Prompt Cross Attention
    • (Optional) Finishing SAM Example Part 1
    • (Optional) Finishing SAM Example Part 2
    • Finishing
    • Unlimited Updates
  •   Setting up Our AWS Environment
    • Creating Our SageMaker AI Domain
    • Starting Domain and Understanding Pricing
    • Installing Libraries
    • Stopping Instances and Servers
    • Course Check-In
  •   Setting up Open Source Models Like Meta's SAM
    • Downloading the SAM Model from Meta
    • Updating IAM Permissions
    • Importing Libraries
    • Understanding How We Use Rekognition with SAM
    • Defining Helper Functions
    • Clarification Regarding Helper Functions
    • Rekognition Detection and Filtering
    • Initialize SAM Model from S3
    • Main Processing Function Part 1
    • Main Processing Function Part 2
    • Running the Main Processing Cell
    • Implement a New Life System
  •   Visualizing our Outputs
    • Visualizing Rekognition Detections
    • Visualize All SAM Masks
    • Visualizing Match Quality IoU Scores Part 1
    • Visualizing Match Quality IoU Scores Part 2
    • Visualizing Image Segmentations with Bounding Boxes
    • Visualizing Masks and Labels Without Bounding Boxes
    • Visualizing Segmentations as Black and White Masks
    • Exercise: Imposter Syndrome
  •   Saving Results to S3
    • Saving Metadata to S3
    • Save Images to S3
    • Saving Individual Masks to S3
  •   Testing + Setup
    • Adding a GPU Server to our Notebook and AWS Quotas
    • Testing Our Full Pipeline
    • Minor Corrections
    • Productionizing + Cleanup
  •   Where To Go From Here?
    • Thank You!
    • Review This Course!
    • Become An Alumni
    • Learning Guideline
    • ZTM Events Every Month
    • LinkedIn Endorsements
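The syllabus pairs Amazon Rekognition detections with SAM masks by match-quality IoU scores. As a minimal sketch of that matching idea (the helper names and the toy data are my own, not the course's), the IoU between a detection box and the tight box around a binary mask can be computed like this:

```python
import numpy as np

def mask_to_box(mask):
    """Tight bounding box (x_min, y_min, x_max, y_max) around a binary mask."""
    ys, xs = np.nonzero(mask)
    return xs.min(), ys.min(), xs.max() + 1, ys.max() + 1

def box_iou(a, b):
    """Intersection-over-union of two (x_min, y_min, x_max, y_max) boxes."""
    ix0, iy0 = max(a[0], b[0]), max(a[1], b[1])
    ix1, iy1 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix1 - ix0) * max(0, iy1 - iy0)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

# Toy example: a 10x10 mask filling rows 2-5, cols 2-7,
# scored against a hypothetical detection box
mask = np.zeros((10, 10), dtype=bool)
mask[2:6, 2:8] = True
det_box = (2, 2, 8, 6)   # exactly matches the mask's tight box
iou = box_iou(mask_to_box(mask), det_box)
print(iou)  # 1.0
```

In the pipeline the course describes, a score like this would decide which SAM mask belongs to which Rekognition label before the results are visualized and saved to S3.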

Taught by

Patrik Szepesi
