Syllabus
0:00 Fine-tuning Text-to-Speech Models with Unsloth
0:53 Video Overview
1:47 Video Resources
2:26 Voice Quality Examples: ElevenLabs vs Open Source
4:52 The recipe for professional quality voice cloning
6:23 How do token-based text-to-speech models work?
14:08 Data Preparation and Training Overview
16:02 Data preparation, cleaning and chunking for voice cloning
24:05 Audio transcription from uploaded audio
25:42 Dataset chunking and pushing to HuggingFace Hub
29:49 Loading Sesame CSM-1B and LoRA adapters (full fine-tuning is also possible, and is in the repo)
34:36 Dataset loading and creating an eval split
37:42 Training Hyperparameters
40:08 Running inference on the fine-tuned model, and evaluating
43:57 LoRA fine-tuning of Orpheus by Canopy Labs - Data loading is very different!
50:27 Running inference and listening to the quality with Orpheus
53:15 Professional Voice Cloning with ElevenLabs
56:18 Examining TensorBoard logs from the Sesame LoRA fine-tuning
57:27 Upcoming video on serving Orpheus with vLLM
58:10 Conclusion
Taught by
Trelis Research