New Directions in Robust Automatic Speech Recognition - 2005

Explore new directions in robust automatic speech recognition in this 1-hour 21-minute lecture by Richard Stern of Carnegie Mellon University. Review classical and contemporary approaches to improving speech recognition technology as it moves from the laboratory to real-world applications. Examine techniques for addressing environmental degradation caused by quasi-stationary additive noise and linear filtering, including cepstral high-pass filtering methods such as cepstral mean normalization and RASTA filtering. Investigate statistical modeling methods such as codeword-dependent cepstral normalization and vector Taylor series expansion. Learn why these approaches fall short when the interference is transient or non-stationary, as with background music or background speech. Discover techniques for handling these more challenging acoustic environments, including missing-feature compensation, multi-band analysis, feature combination, and physiologically motivated auditory scene analysis. Gain insight into the evolving field of robust speech recognition and its importance in practical applications.
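
For readers unfamiliar with cepstral mean normalization, the idea can be illustrated with a minimal sketch. It is not taken from the lecture. It assumes the features are stored as a frames-by-coefficients array, and the function name and the synthetic data are hypothetical. A fixed linear filter multiplies the spectrum, so it adds an approximately constant vector to every frame's cepstrum. Subtracting the per-utterance mean removes that constant.

```python
import numpy as np

def cepstral_mean_normalization(cepstra):
    """Subtract the per-utterance mean from each cepstral coefficient.

    cepstra: array of shape (num_frames, num_coeffs).
    """
    cepstra = np.asarray(cepstra, dtype=float)
    return cepstra - cepstra.mean(axis=0, keepdims=True)

# Synthetic stand-in for real cepstral features (hypothetical data).
rng = np.random.default_rng(0)
clean = rng.normal(size=(200, 13))

# Assumption: the channel is a fixed linear filter, modeled here as a
# constant offset added to every frame in the cepstral domain.
channel_offset = rng.normal(size=13)
filtered = clean + channel_offset

# After normalization, the clean and filtered features should coincide.
same_after_cmn = np.allclose(
    cepstral_mean_normalization(clean),
    cepstral_mean_normalization(filtered),
)
```

Because the mean is computed over the whole utterance, this kind of normalization only addresses distortion that stays roughly constant over time, which is consistent with the description's point that such methods struggle with transient or non-stationary interference.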