Building Awesome Speech-to-Text Transformers from Scratch - One Line of PyTorch at a Time

Neural Breakdown with AVB via YouTube

Classroom Contents

  1. 0:00 - Intro
  2. 0:36 - How Audio datasets look like
  3. 4:30 - Tokenizing text
  4. 9:34 - Data Preprocessing
  5. 11:38 - MFCCs, and Encoder-Decoder networks
  6. 14:20 - Network Architecture
  7. 17:59 - Coding the Convolutional Block
  8. 26:40 - Coding attention and Transformers
  9. 30:20 - Residual Vector Quantizers
  10. 32:57 - Coding RVQs
  11. 37:44 - Optimizing RVQs
  12. 43:50 - Putting it together
  13. 48:50 - Connectionist-Temporal Classification CTC Loss
  14. 50:53 - Training!
