Generative AI for Computer Vision

NPTEL via Swayam

Overview

ABOUT THE COURSE:

This course explores how generative AI is applied to modern computer vision tasks. Unlike existing NPTEL courses, it specifically emphasizes vision-based generative AI models. It begins with mathematical foundations and classical vision techniques, followed by deep learning architectures. The course then introduces generative learning paradigms, including GANs, VAEs, diffusion models, and transformers, along with a discussion of evaluation metrics and training challenges such as mode collapse and diffusion noise scheduling. It also covers large language models (LLMs) for vision applications, such as GPT-4V, LLaMA, PaLM-E, and Flamingo. The course focuses primarily on deep generative learning for computer vision tasks such as image captioning, visual question answering (VQA), and scene understanding. It further discusses multimodal generative models and agentic AI systems for automatic image synthesis and reasoning.
INTENDED AUDIENCE: Final/Pre-final year B.Tech/BE, M.Tech/ME, MS, PhD students, Industry professionals, and Faculty members.

PREREQUISITES: Basics of Machine Learning and Computer Vision. Neural Networks for Vision and NLP.

INDUSTRY SUPPORT: Relevant for AI/ML roles in IT companies, startups, research labs, and product-based companies working in generative AI and computer vision domains.

Taught by

Prof. Arijit Sur
