New Methods to Capture and Exploit Multiscale Speech Dynamics - 2007

In this 2007 lecture, Patrick Wolfe of Harvard University presents new methods for capturing and exploiting multiscale speech dynamics. Speech waveforms are highly variable, with temporal and spectral dynamics that evolve across multiple scales. The lecture covers advances in formant estimation based on a statistical model-based tracking approach, which includes a censored likelihood formulation and uses vector autoregression to model cross-correlation among formants. It also introduces a novel adaptive short-time Fourier analysis-synthesis scheme for speech enhancement, featuring a modified overlap-add procedure for efficient resynthesis. Measurements and listening tests support the potential improvements these methods offer over traditional fixed-resolution enhancement systems. Wolfe draws on a background in electrical engineering, statistics, and audio signal processing, and the lecture touches on applications of these techniques to high-dimensional data analysis of speech waveforms and color images.
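
For orientation, the fixed-resolution baseline mentioned above can be sketched as a standard short-time Fourier analysis followed by weighted overlap-add resynthesis. This is not the adaptive scheme or the modified overlap-add procedure from the lecture, whose details are not given here. The function names `stft` and `overlap_add`, the frame length, the hop size, and the square-root Hann window are all assumptions chosen for illustration.

```python
import numpy as np

def stft(x, frame_len=256, hop=128):
    """Fixed-resolution short-time Fourier analysis (illustrative baseline)."""
    # Periodic square-root Hann window (an assumed choice, not from the lecture).
    window = np.sqrt(np.hanning(frame_len + 1)[:frame_len])
    n_frames = 1 + (len(x) - frame_len) // hop
    frames = np.stack(
        [x[i * hop : i * hop + frame_len] * window for i in range(n_frames)]
    )
    return np.fft.rfft(frames, axis=1), window

def overlap_add(spectra, window, hop=128):
    """Weighted overlap-add resynthesis with explicit normalization."""
    frame_len = len(window)
    n_frames = spectra.shape[0]
    out = np.zeros(hop * (n_frames - 1) + frame_len)
    norm = np.zeros_like(out)
    frames = np.fft.irfft(spectra, n=frame_len, axis=1)
    for i in range(n_frames):
        start = i * hop
        # Apply the synthesis window and accumulate overlapping frames.
        out[start : start + frame_len] += frames[i] * window
        # Track the summed squared window so the overlap can be divided out.
        norm[start : start + frame_len] += window ** 2
    return out / np.maximum(norm, 1e-12)

# Round trip on a synthetic signal; an enhancement system would modify
# the spectra between analysis and resynthesis.
rng = np.random.default_rng(0)
x = rng.standard_normal(2048)
spectra, window = stft(x)
y = overlap_add(spectra, window)
```

Every frame here has the same length, which is what "fixed resolution" means in this sketch. An adaptive scheme would vary the analysis resolution over time, which is why the resynthesis step needs modification.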