Unlocking Audio AI - Building a Massive Open Dataset for Instruction-Tuned Audio-Text Foundation Models

Center for Language & Speech Processing (CLSP), JHU via YouTube

Overview

Explore the future of audio AI in this seminar with Christoph Schuhmann, co-founder of LAION, who presents his vision for building massive open datasets to enable instruction-tuned audio-text foundation models. Learn how large-scale instruction tuning techniques from language models can be applied to the audio domain, and how millions of permissively licensed audio snippets (speech, music, and environmental sounds) can be transformed into richly annotated datasets. Understand the practical roadmap for creating billions of carefully annotated soundscapes using generative AI models such as Gemini, complete with accurate timestamped event labels. Discover how such datasets could enable progress in audio-to-text transcription, sound-event detection, audio generation, and multimodal voice assistants. Gain insight into LAION's commitment to open science and accessible AI development from the leader behind landmark datasets such as LAION-400M and LAION-5B, which powered models like Stable Diffusion. Examine the intersection of educational reform, open-source AI development, and the democratization of foundational AI technologies through the perspective of a physics and computer science educator who champions transparent, community-driven research.

Syllabus

JSALT 2025 - Seminar with Christoph Schuhmann (LAION)

Taught by

Center for Language & Speech Processing (CLSP), JHU
