Building Awesome Speech-to-Text Transformers from Scratch - One Line of PyTorch at a Time
- 1 0:00 - Intro
- 2 0:36 - What Audio Datasets Look Like
- 3 4:30 - Tokenizing text
- 4 9:34 - Data Preprocessing
- 5 11:38 - MFCCs, and Encoder-Decoder networks
- 6 14:20 - Network Architecture
- 7 17:59 - Coding the Convolutional Block
- 8 26:40 - Coding attention and Transformers
- 9 30:20 - Residual Vector Quantizers
- 10 32:57 - Coding RVQs
- 11 37:44 - Optimizing RVQs
- 12 43:50 - Putting it together
- 13 48:50 - Connectionist Temporal Classification (CTC) Loss
- 14 50:53 - Training!