Introduction to Multimodal Large Language Models II - Day 10 Afternoon

Explore advanced concepts in multimodal large language models in this laboratory-focused tutorial session. Work through hands-on Python notebook exercises covering simple examples, practical use of the AF2 and AF3 models, experiments with MMAU-Pro, data preparation for Audio Question Answering (AQA), and basic training procedures. Instructors Alicia Lozano-Diez (Universidad Autónoma de Madrid) and Ramani Duraiswami (University of Maryland) guide you through practical applications of large audio language models. Supporting resources, including GitHub repositories and Colab notebooks, are available to reinforce the material on expert-level reasoning and understanding in multimodal AI systems. The session builds on the prerequisite Introduction to Multimodal Large Language Models I and develops practical skills in implementing and training these models.