Robust Speech Recognition II - Day 7 Afternoon

Explore advanced techniques in robust speech recognition in this 4-hour tutorial on Diarization-Conditioned Whisper (DiCoW), a target-speaker ASR method. The session begins with a 20-minute introduction to DiCoW and target-speaker ASR by Lukas Burget of Brno University of Technology, followed by a 2-hour hands-on lab led by Alexander Polok, also of Brno University of Technology, where you'll gain practical experience implementing target-speaker ASR techniques. Learn how to apply ASR models to real-world conversational speech, building on fundamental speech recognition concepts to handle multiple speakers and challenging acoustic conditions. This tutorial is the afternoon portion of Day 7 at JSALT 2025 and covers both the theory and the practical skills needed to build robust systems that recognize a specific speaker's speech in multi-speaker recordings.