Overview
Watch an 11-minute video explaining the LLaVA (Large Language and Vision Assistant) paper series, which the video presents as introducing the first instruction-tuned multimodal foundation model. Learn how LLaVA evolved through its iterations, including LLaVA, LLaVA-RLHF, LLaVA-Med, and LLaVA 1.5, and how these models combine language and visual capabilities. Gain insight into the technical implementation, find the project's resources, including code repositories and datasets, and understand why this work matters for Large Multimodal Models (LMMs). Created by an experienced machine learning researcher, the video breaks down complex concepts and links to related papers, documentation, and implementation resources.
Syllabus
LLaVA - the first instruction following multi-modal model (paper explained)
Taught by
AI Bites