Overview
This tutorial demonstrates how to fine-tune a Vision Transformer (ViT) model to classify a custom image dataset of vehicles using PyTorch. It covers the complete workflow, from loading and transforming the image dataset to fine-tuning the 'google/vit-base-patch16-224-in21k' model from Hugging Face, a ViT-Base checkpoint pre-trained on ImageNet-21k that takes 224x224 images split into 16x16 patches. The 34-minute video walks through installation, dataset exploration, model fine-tuning, and testing predictions on new images, with practical code examples. The complete tutorial code is available through the provided link, and additional computer vision and visual language model resources are available on the creator's blog and YouTube playlists.
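As a rough illustration of the fine-tuning setup described above, the sketch below builds the label mappings for a classification head and loads the named checkpoint with that head attached. It is not the tutorial's own code: the class names and the helper functions `build_label_maps` and `build_model` are hypothetical, and the only library call assumed is `ViTForImageClassification.from_pretrained` from the `transformers` package.

```python
def build_label_maps(class_names):
    """Return (label2id, id2label) dictionaries for a list of class names."""
    # Sort so the ids are stable regardless of the order classes were listed in.
    ordered = sorted(class_names)
    label2id = {name: idx for idx, name in enumerate(ordered)}
    id2label = {idx: name for name, idx in label2id.items()}
    return label2id, id2label


def build_model(class_names, checkpoint="google/vit-base-patch16-224-in21k"):
    """Load the pre-trained ViT backbone with a new classification head.

    Requires the `transformers` package and downloads the checkpoint
    on first use.
    """
    # Imported lazily so the label helper above works without transformers.
    from transformers import ViTForImageClassification

    label2id, id2label = build_label_maps(class_names)
    return ViTForImageClassification.from_pretrained(
        checkpoint,
        num_labels=len(label2id),  # size of the newly initialized head
        id2label=id2label,
        label2id=label2id,
    )


# Hypothetical vehicle classes; the tutorial's dataset may use different ones.
CLASS_NAMES = ["car", "bus", "motorcycle"]
label2id, id2label = build_label_maps(CLASS_NAMES)
```

Calling `build_model(CLASS_NAMES)` would return a model whose head has one output per class. Because the in21k checkpoint ships without a head for these labels, the head weights start out randomly initialized, and they are what fine-tuning trains.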
Syllabus
00:00 Introduction
03:31 Installation
06:51 Discover the dataset
08:27 Fine-tune the ViT model
25:33 Test the model prediction
Taught by
Eran Feit