Professional Quality Voice Cloning - Open Source vs ElevenLabs

Trelis Research via YouTube

Classroom Contents

  1. 0:00 Fine-tuning Text-to-Speech Models with Unsloth
  2. 0:53 Video Overview
  3. 1:47 Video Resources
  4. 2:26 Voice Quality Examples: ElevenLabs vs Open Source
  5. 4:52 The recipe for professional quality voice cloning
  6. 6:23 How do token-based text-to-speech models work?
  7. 14:08 Data Preparation and Training Overview
  8. 16:02 Data preparation, cleaning and chunking for voice cloning
  9. 24:05 Audio transcription from uploaded audio
  10. 25:42 Dataset chunking and pushing to HuggingFace Hub
  11. 29:49 Loading Sesame CSM-1B and LoRA adapters (full fine-tuning is also possible, and is in the repo)
  12. 34:36 Dataset loading and creating an eval split
  13. 37:42 Training Hyperparameters
  14. 40:08 Running inference on the fine-tuned model, and evaluating
  15. 43:57 LoRA fine-tuning of Orpheus by Canopy Labs - data loading is very different!
  16. 50:27 Running inference and listening to the quality with Orpheus
  17. 53:15 Professional Voice Cloning with ElevenLabs
  18. 56:18 Examining TensorBoard logs from the Sesame LoRA fine-tuning
  19. 57:27 Upcoming video on serving Orpheus with vLLM
  20. 58:10 Conclusion
