Introduction to Multimodal Large Language Models I - Day 10 Morning
Center for Language & Speech Processing (CLSP), JHU via YouTube
Overview
Explore the fundamentals of multimodal large language models through tutorial slides presented by experts from the University of Maryland, Brno University of Technology, and Universidad Autónoma de Madrid. Learn the core concepts, architectures, and applications of models that process and understand text, images, and audio together. See how these systems integrate different modalities to perform complex reasoning tasks, capture cross-modal relationships, and generate coherent responses across input types. Examine the technical foundations of multimodal LLMs, including attention mechanisms, fusion strategies, and training methodologies that let these models connect different forms of human communication and expression.
Syllabus
[slides] Day 10 morning - JSALT 2025 - Introduction to Multimodal Large Language Models I.
Taught by
Center for Language & Speech Processing (CLSP), JHU