Robust Speech Recognition I - Day 7 Morning
Center for Language & Speech Processing (CLSP), JHU via YouTube
Overview
Learn robust speech recognition techniques for real-world conversational scenarios in this tutorial from JSALT Summer School 2025. Explore the challenges automatic speech recognition (ASR) systems face when moving from clean laboratory conditions to practical applications like meeting transcription, where word error rates can exceed 35% compared to under 3% on clean data. Examine the core obstacles: background noise, reverberation, multiple simultaneous speakers, and overlapping speech, which occurs in over 15% of meeting duration. Study evaluation methodologies for long-form multi-speaker audio, including the concatenated minimum-permutation word error rate (cpWER), and survey essential datasets ranging from AMI to current benchmarks like CHiME-7/8 and NOTSOFAR-1. Discover technical approaches grouped into front-end methods such as speech separation, beamforming, and target speaker extraction, and back-end methods such as self-supervised features, serialized output training, and target-speaker ASR. Understand how large language models are enabling new applications like automated meeting summarization while creating fresh research opportunities. Address key open challenges in speaker tracking, training-inference mismatch, and the integration of speech separation, diarization, and recognition components.
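As a rough illustration of the cpWER metric mentioned in the overview, the sketch below concatenates each speaker's utterances, tries every assignment of hypothesis speakers to reference speakers, and keeps the assignment with the fewest word errors. It is a minimal sketch and not the tutorial's own code: the function names `edit_distance` and `cpwer` and the dictionary input format are assumptions made for this example, and the brute-force permutation search is only practical for a handful of speakers (practical scoring tools usually solve the assignment with the Hungarian algorithm instead).

```python
from itertools import permutations


def edit_distance(ref, hyp):
    """Word-level Levenshtein distance between two token lists."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        cur = [i]
        for j, h in enumerate(hyp, 1):
            cur.append(min(prev[j] + 1,              # deletion
                           cur[j - 1] + 1,           # insertion
                           prev[j - 1] + (r != h)))  # substitution or match
        prev = cur
    return prev[-1]


def cpwer(ref_by_speaker, hyp_by_speaker):
    """cpWER sketch: inputs map a speaker label to a list of utterance strings.

    Hypothetical interface for illustration; speaker labels need not match
    between reference and hypothesis.
    """
    # Concatenate each speaker's utterances into one word sequence.
    refs = [" ".join(utts).split() for utts in ref_by_speaker.values()]
    hyps = [" ".join(utts).split() for utts in hyp_by_speaker.values()]

    # Pad with empty speakers so both sides have the same number of streams.
    n = max(len(refs), len(hyps))
    refs += [[] for _ in range(n - len(refs))]
    hyps += [[] for _ in range(n - len(hyps))]

    total_ref_words = sum(len(r) for r in refs)
    if total_ref_words == 0:
        raise ValueError("reference contains no words")

    # Minimum total errors over all speaker permutations (factorial cost).
    best_errors = min(
        sum(edit_distance(r, hyps[p]) for r, p in zip(refs, perm))
        for perm in permutations(range(n))
    )
    return best_errors / total_ref_words


reference = {"A": ["hello there", "how are you"], "B": ["fine thanks"]}
hypothesis = {"spk1": ["fine thanks"], "spk2": ["hello there", "how are we"]}
score = cpwer(reference, hypothesis)  # one substitution over seven reference words
```

Because the metric scores the best speaker permutation, a system is not penalized for using different speaker labels than the reference, but it is penalized when words are attributed to the wrong speaker.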
Syllabus
Day 7 morning - JSALT 2025 - Burget, Cornell, Masuyama: Robust speech recognition I.
Taught by
Center for Language & Speech Processing (CLSP), JHU