Speaker Diarization - From Modular to End-to-End Systems - Day 3 Morning

Center for Language & Speech Processing (CLSP), JHU via YouTube

Overview

Explore the fundamentals and advanced techniques of speaker diarization in this comprehensive lecture delivered by Federico Landini from Deepgram at JSALT 2025. Learn about the essential task of determining "who spoke when" in audio recordings, starting with traditional modular systems based on clustering approaches and progressing to modern end-to-end systems where single neural networks process audio and generate direct outputs. Discover the evolution from classical methods to cutting-edge neural models including VBx, EEND (End-to-End Neural Diarization), and DiaPer systems. Gain insights from an expert who has contributed significantly to both modular and end-to-end diarization approaches, led successful teams in DIHARD and VoxSRC challenges, and has extensive industry experience from internships at major tech companies including Meta, Facebook, Apple, and Microsoft. Understand the practical applications and implementation strategies for speaker diarization systems, with emphasis on open-source recipes and models that advance the field of speech processing and audio analysis.

Syllabus

Day 3 morning - JSALT 2025 - Landini: Speaker Diarization

Taught by

Center for Language & Speech Processing (CLSP), JHU
