Vision Transformers - Explained

CodeEmporium via YouTube

Overview

This 22-minute video explains Vision Transformers (ViT), an architecture that applies the transformer, originally designed for natural language processing, to images. It covers what a ViT is and why it was developed as an alternative to convolutional neural networks for image tasks: the image is split into patches, and the patches are treated as a sequence, much like tokens in a sentence. The video then walks through pretraining, where the model learns visual representations from large datasets, and fine-tuning, where a pretrained ViT is adapted to a specific downstream task. It closes with a quiz and a summary of the key concepts. Slides, a reference to the original Vision Transformer paper, and pointers to background material on transformers accompany the video. It is aimed at machine learning practitioners, computer vision enthusiasts, and researchers who want to understand the architecture.
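The idea of treating image patches as a sequence can be sketched in a few lines of plain Python. This is a minimal illustration, not code from the video: the function name `image_to_patch_sequence`, the toy 4x4 image, and the patch size of 2 are all assumptions chosen for readability. A full ViT would also project each flattened patch through a learned linear layer and add position embeddings, which this sketch leaves out.

```python
def image_to_patch_sequence(image, patch_size):
    """Split an H x W x C image (nested lists) into a list of flattened patches.

    Hypothetical helper for illustration; patches are read left to right,
    top to bottom, and each patch is flattened into one vector.
    """
    height = len(image)
    width = len(image[0])
    if height % patch_size or width % patch_size:
        raise ValueError("image height and width must be divisible by patch_size")

    patches = []
    for top in range(0, height, patch_size):
        for left in range(0, width, patch_size):
            patch = []
            for row in range(top, top + patch_size):
                for col in range(left, left + patch_size):
                    # Append all channel values of this pixel to the patch vector.
                    patch.extend(image[row][col])
            patches.append(patch)
    return patches


# Toy 4x4 image with 3 channels per pixel (values are arbitrary).
image = [[[r, c, r + c] for c in range(4)] for r in range(4)]
sequence = image_to_patch_sequence(image, patch_size=2)

# 4 patches in the sequence, each holding 2 * 2 * 3 = 12 values.
print(len(sequence), len(sequence[0]))
```

Each entry of `sequence` plays the role that a token plays in a text transformer, so the sequence length grows with image size and shrinks as the patch size increases.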

Syllabus

What is ViT?
Why do we have ViTs?
Pretraining
Fine-tuning
Quiz Time
Summary

Taught by

CodeEmporium
