Introduction to Multimodal Large Language Models II - Day 10 Afternoon
Center for Language & Speech Processing (CLSP), JHU via YouTube
Overview
Explore advanced concepts in multimodal large language models in this lab-focused tutorial session. Work through hands-on Python notebook exercises covering simple examples, practical use of the AF2 and AF3 models, experiments with MMAU-Pro, data preparation for AQA (Audio Question Answering), and basic training procedures. Alicia Lozano-Diez (Universidad Autónoma de Madrid) and Ramani Duraiswami (University of Maryland) guide you through practical applications of large audio language models. The accompanying GitHub repositories and Colab notebooks reinforce the material on expert-level reasoning and understanding in multimodal AI systems. The session builds on the prerequisite, Introduction to Multimodal Large Language Models I, and develops practical skills in implementing and training these models.
Syllabus
Day 10 afternoon - JSALT 2025 - Introduction to Multimodal Large Language Models II
Taught by
Center for Language & Speech Processing (CLSP), JHU