Introduction to Multimodal Large Language Models I - Day 10 Morning

Center for Language & Speech Processing (CLSP), JHU via YouTube

Overview

Explore the fundamentals of multimodal large language models through comprehensive tutorial slides presented by experts from University of Maryland, Brno University of Technology, and Universidad Autónoma de Madrid. Learn core concepts, architectures, and applications of models that can process and understand multiple types of data simultaneously, including text, images, and audio. Discover how these advanced AI systems integrate different modalities to perform complex reasoning tasks, understand cross-modal relationships, and generate coherent responses across various input types. Examine the technical foundations underlying multimodal LLMs, including attention mechanisms, fusion strategies, and training methodologies that enable these models to bridge the gap between different forms of human communication and expression.

Syllabus

[slides] Day 10 morning - JSALT 2025 - Introduction to Multimodal Large Language Models I.

Taught by

Center for Language & Speech Processing (CLSP), JHU
