Introduction to Multimodal Large Language Models II - Day 10 Afternoon
Center for Language & Speech Processing (CLSP), JHU via YouTube
Overview
Explore advanced concepts in multimodal large language models in this laboratory-focused tutorial session. Work through hands-on Python notebook exercises covering simple examples, practical implementation of the AF2 and AF3 models, experimentation with MMAU-Pro, data preparation for AQA (Audio Question Answering), and basic training procedures. Alicia Lozano-Diez (Universidad Autónoma de Madrid) and Ramani Duraiswami (University of Maryland) guide you through practical applications of large audio language models. Supporting resources, including GitHub repositories and Colab notebooks, reinforce the material on expert-level reasoning and understanding in multimodal AI systems. The session builds on the prerequisite Introduction to Multimodal Large Language Models I session and develops practical skills in implementing and training these models.
Syllabus
Day 10 afternoon - JSALT 2025 - Introduction to Multimodal Large Language Models II
Taught by
Center for Language & Speech Processing (CLSP), JHU