Overview
Learn to build a modern image classifier with the ConvNeXt architecture in this hands-on tutorial, which fine-tunes facebook/convnext-base-224-22k to classify images into 30 musical instrument classes. The tutorial covers the complete machine learning pipeline, from data preparation to model evaluation, using Python, PyTorch, and Hugging Face tools. Load folder-based datasets with Hugging Face Datasets, apply practical data augmentations such as RandomResizedCrop followed by normalization, and build PyTorch DataLoaders with a consistent label mapping. Fine-tune the pre-trained ConvNeXt model with the AdamW optimizer while tracking loss and accuracy, use early stopping to limit overfitting, and save the best-performing checkpoint for later use. Run single-image inference to test the trained model, and plot a confusion matrix to see which classes the model handles well and which it confuses. Along the way, see how ConvNeXt keeps the efficiency of a traditional CNN while adopting design ideas borrowed from Vision Transformers, which makes it a strong choice for modern computer vision tasks.
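Two of the steps mentioned above, early stopping and the confusion matrix, can be illustrated without any framework. The following is a minimal sketch in plain Python, not the code used in the tutorial: the class name `EarlyStopping`, its `patience` and `min_delta` parameters, and the helper `confusion_matrix` are hypothetical names chosen for illustration, and class labels are assumed to be integer indices.

```python
from typing import List, Sequence


class EarlyStopping:
    """Signal a stop once validation loss has failed to improve
    for `patience` consecutive epochs (hypothetical helper)."""

    def __init__(self, patience: int = 3, min_delta: float = 0.0) -> None:
        self.patience = patience
        self.min_delta = min_delta
        self.best_loss = float("inf")
        self.best_epoch = -1
        self.bad_epochs = 0

    def step(self, epoch: int, val_loss: float) -> bool:
        """Record one epoch's validation loss; return True when training should stop."""
        if val_loss < self.best_loss - self.min_delta:
            # Improvement: a real training loop would save the checkpoint here.
            self.best_loss = val_loss
            self.best_epoch = epoch
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience


def confusion_matrix(
    y_true: Sequence[int], y_pred: Sequence[int], num_classes: int
) -> List[List[int]]:
    """Count predictions: rows are true classes, columns are predicted classes."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for true_label, pred_label in zip(y_true, y_pred):
        matrix[true_label][pred_label] += 1
    return matrix
```

In a training loop, `step` would be called once per epoch with the validation loss, and the loop would break when it returns True. Off-diagonal cells of the returned matrix show which pairs of classes are being confused; for the 30-class problem described here, `num_classes` would be 30.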
Syllabus
00:00 Introduction and Demo
03:28 Installation
07:07 Start coding
17:08 Build and train the model
23:44 Generate Confusion Matrix
Taught by
Eran Feit