Speaker Recognition and Diarization

Explore speaker recognition and diarization in this 2-hour lecture from the JSALT 2024 Summer School, hosted by Johns Hopkins University's Center for Language & Speech Processing. Delve into the fundamental concepts and current methods used to identify individual speakers and to segment audio recordings at speaker changes. Learn about speaker embedding techniques, neural network architectures for speaker verification, and diarization algorithms that automatically determine "who spoke when" in multi-speaker audio. Gain insights into the challenges of handling overlapping speech and speaker adaptation, and into integrating speaker recognition systems with automatic speech recognition. Discover practical applications in meeting transcription, broadcast monitoring, and forensic audio analysis, and understand the evaluation metrics and benchmarks used in the field.