Overview
Learn to generate natural-sounding speech locally with Microsoft's VibeVoice model on AMD's Strix Halo processor in this 13-minute tutorial. Set up VibeVoice on a Framework Desktop with an AMD Ryzen AI Max, using a Fedora toolbox and a Gradio interface. Generate single-speaker speech, create multi-speaker conversations, and perform zero-shot voice cloning from short audio samples. Apply practical stability fixes for ROCm crashes and troubleshoot common issues involving librosa, numba, LLVM, and ROCm. The instructor also demonstrates generating a full podcast episode, and timestamps mark the setup, demo, and troubleshooting segments. Accompanying GitHub resources include toolboxes, scripts, and stability fixes, along with links to the Framework Desktop hardware, Strix Halo homelab guides, the VibeVoice project repositories, and the Hugging Face model weights for this open-weight speech generation model.
Syllabus
00:00 — AI-Generated Intro (VibeVoice)
01:47 — Setup on Strix Halo (Toolbox + Gradio)
03:28 — First Demo: Single-Speaker
05:18 — Multi-Speaker Conversations
05:42 — Clone Your Own Voice (Zero-Shot)
06:23 — Stability Fixes (librosa / numba / LLVM / ROCm)
08:26 — Generating a Full Podcast
09:33 — AI-Generated Podcast: How VibeVoice Works
Taught by
Donato Capitella