Condition-Aware Self-Supervised Learning Representation for Generalized Speech Processing
Center for Language & Speech Processing (CLSP), JHU via YouTube
Overview
Watch a 26-minute research presentation from Johns Hopkins University's Center for Language & Speech Processing that introduces Condition-Aware Self-Supervised Learning Representation (CA-SSLR), a conditioning model for speech processing tasks. Learn how this generalist approach integrates language and speaker embeddings from earlier layers to make SSL models more context-aware. Explore how CA-SSLR reduces dependency on input audio features while preserving the integrity of the base SSLR through linear modulation. Review the reported results: a 10% reduction in Language Identification errors, a 37% improvement in Automatic Speech Recognition Character Error Rate on the ML-SUPERB benchmark, and a 27% decrease in Speaker Verification Equal Error Rate on VoxCeleb-1. Understand how the approach keeps the number of trainable parameters small, limits overfitting, and performs well in resource-limited and novel speech processing scenarios.
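To make the idea of linear modulation concrete, here is a minimal sketch in plain Python. It assumes a feature-wise scale-and-shift form, where a condition embedding (for example a language or speaker embedding) is projected to a per-feature scale and shift that are applied to a hidden representation. The function names, the `1 + gamma` parameterization, and the zero initialization are illustrative assumptions, not details confirmed by the presentation.

```python
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def linear(weights: Matrix, bias: Vector, x: Vector) -> Vector:
    """Plain affine projection: one output per row of `weights`."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def modulate(hidden: Vector, condition: Vector,
             w_gamma: Matrix, b_gamma: Vector,
             w_beta: Matrix, b_beta: Vector) -> Vector:
    """Scale and shift each hidden feature using a condition embedding.

    Assumption: the scale is parameterized as (1 + gamma), so that
    zero-valued projection weights leave `hidden` unchanged.
    """
    gamma = linear(w_gamma, b_gamma, condition)  # per-feature scale offset
    beta = linear(w_beta, b_beta, condition)     # per-feature shift
    return [(1.0 + g) * h + b for g, h, b in zip(gamma, hidden, beta)]


# Hypothetical toy values: 2 hidden features, 2-dimensional condition.
hidden = [1.0, 2.0]
condition = [1.0, 0.0]
zeros_m = [[0.0, 0.0], [0.0, 0.0]]
zeros_v = [0.0, 0.0]

# With zero-initialized projections the base representation passes through.
unchanged = modulate(hidden, condition, zeros_m, zeros_v, zeros_m, zeros_v)

# With non-zero projections the condition rescales and shifts the features.
w_gamma = [[0.5, 0.0], [0.0, 0.0]]
w_beta = [[0.0, 0.0], [1.0, 0.0]]
modulated = modulate(hidden, condition, w_gamma, zeros_v, w_beta, zeros_v)
```

Under these assumptions, only the small projection matrices would need training while the base features stay untouched at initialization, which is one way a conditioning layer can add context without disturbing a pretrained encoder.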
Syllabus
CA-SSLR: Condition-Aware Self-Supervised Learning Representation for Generalized Speech Processing
Taught by
Center for Language & Speech Processing (CLSP), JHU