
Zero To Mastery

The Computer Vision Bootcamp

via Zero To Mastery Path

Overview

Learn how Computer Vision models work, including Vision Transformers and Meta’s SAM, and how they power real-world image systems. Then put your knowledge into practice by deploying a scalable computer vision pipeline on AWS using production-ready tools and infrastructure.
  • Understand how Vision Transformers process images
  • Break down attention math without the hand-waving
  • Use Meta’s SAM for prompt-based segmentation
  • Visualize and evaluate segmentation outputs
  • Connect detection models with segmentation pipelines
  • Build scalable computer vision workflows in Python
  • Deploy vision systems on AWS infrastructure
  • Design production-ready AI pipelines for real products
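The attention math mentioned above boils down to scaled dot-product attention over patch embeddings. As a rough, self-contained NumPy sketch (the array shapes and function name are illustrative, not taken from the course):

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d_k)) V."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)               # (n_q, n_k) attention logits
    scores -= scores.max(axis=-1, keepdims=True)  # subtract row max for numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax: each row sums to 1
    return weights @ V, weights

# Toy example: 4 patch embeddings of dimension 8, attending to themselves
rng = np.random.default_rng(0)
x = rng.standard_normal((4, 8))
out, w = scaled_dot_product_attention(x, x, x)  # self-attention
print(out.shape)        # (4, 8)
print(w.sum(axis=-1))   # each row of attention weights sums to 1
```

In a real ViT, Q, K, and V come from learned linear projections of the patch embeddings, and the quadratic cost of the `Q @ K.T` product is exactly the "quadratic operations" issue the syllabus covers.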

Syllabus

  •   Introduction
    • Introduction
    • What We're Building
    • Exercise: Meet Your Classmates and Instructor
    • Course Resources
    • ZTM Plugin + Understanding Your Video Player
    • Set Your Learning Streak Goal
  •   Mathematics behind Vision Transformers
    • Vision Transformers vs Convolutional Neural Networks
    • Quadratic Operations
    • Introduction to ViTs and Joint Training with Embeddings
    • Understanding Attention Mechanisms, Brief Summary
    • Understanding the Full ViT Pipeline
    • Let's Have Some Fun (+ More Resources)
  •   Mathematics Behind Meta's SAM (Segment Anything Model)
    • Introduction to Prompt Encoders for SAM
    • SAM AutoPrompt Mode
    • SAM Manual Click Mode
    • ViT Embeddings inside SAM
    • Calculating Attention Score for Vision Transformers in SAM
    • How SAM is Trained
    • Calculating Prompt Self Attention for SAM
    • Prompt Image Cross Attention
    • Image to Prompt Cross Attention
    • (Optional) Finishing SAM Example Part 1
    • (Optional) Finishing SAM Example Part 2
    • Finishing
    • Unlimited Updates
  •   Setting up Our AWS Environment
    • Creating Our SageMaker AI Domain
    • Starting Domain and Understanding Pricing
    • Installing Libraries
    • Stopping Instances and Servers
    • Course Check-In
  •   Setting up Open Source Models Like Meta's SAM
    • Downloading the SAM Model from Meta
    • Updating IAM Permissions
    • Importing Libraries
    • Understanding How We Use Rekognition with SAM
    • Defining Helper Functions
    • Clarification Regarding Helper Functions
    • Rekognition Detection and Filtering
    • Initialize SAM Model from S3
    • Main Processing Function Part 1
    • Main Processing Function Part 2
    • Running the Main Processing Cell
    • Implement a New Life System
  •   Visualizing our Outputs
    • Visualizing Rekognition Detections
    • Visualize All SAM Masks
    • Visualizing Match Quality IoU Scores Part 1
    • Visualizing Match Quality IoU Scores Part 2
    • Visualizing Image Segmentations with Bounding Boxes
    • Visualizing Masks and Labels Without Bounding Boxes
    • Visualizing Segmentations as Black and White Masks
    • Exercise: Imposter Syndrome
  •   Saving Results to S3
    • Saving Metadata to S3
    • Save Images to S3
    • Saving Individual Masks to S3
  •   Testing + Setup
    • Adding a GPU Server to our Notebook and AWS Quotas
    • Testing Our Full Pipeline
    • Minor Corrections
    • Productionizing + Cleanup
  •   Where To Go From Here?
    • Thank You!
    • Review This Course!
    • Become An Alumni
    • Learning Guideline
    • ZTM Events Every Month
    • LinkedIn Endorsements
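The syllabus pairs Amazon Rekognition detections with SAM masks by match-quality IoU scores. As a minimal sketch of that matching idea (the helper names and the toy data are my own, not the course's), the IoU between a detection box and the tight box around a binary mask can be computed like this:

```python
import numpy as np

def mask_to_box(mask):
    """Tight bounding box (x_min, y_min, x_max, y_max) around a binary mask."""
    ys, xs = np.nonzero(mask)
    return xs.min(), ys.min(), xs.max() + 1, ys.max() + 1

def box_iou(a, b):
    """Intersection-over-union of two (x_min, y_min, x_max, y_max) boxes."""
    ix0, iy0 = max(a[0], b[0]), max(a[1], b[1])
    ix1, iy1 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix1 - ix0) * max(0, iy1 - iy0)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

# Toy example: a 10x10 mask filling rows 2-5, cols 2-7,
# scored against a hypothetical detection box
mask = np.zeros((10, 10), dtype=bool)
mask[2:6, 2:8] = True
det_box = (2, 2, 8, 6)   # exactly matches the mask's tight box
iou = box_iou(mask_to_box(mask), det_box)
print(iou)  # 1.0
```

In the pipeline the course describes, a score like this would decide which SAM mask belongs to which Rekognition label before the results are visualized and saved to S3.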

Taught by

Patrik Szepesi
