Dia 1.6B TTS for NotebookLM Podcasts - Exploring Text-to-Speech Technology
Sam Witteveen via YouTube
Overview
This 13-minute tutorial video explores Dia, a new text-to-speech (TTS) system developed by Nari Labs, and demonstrates how it can be used to create podcasts similar to those generated by NotebookLM. Follow along as Sam Witteveen reviews coverage of Dia from TechCrunch, VentureBeat, and Hacker News, then explores the Nari Labs website and related work such as the SoundStorm paper and Parakeet. Learn about the Google TPU Research Cloud, which supported this project, and watch a practical demonstration that runs the Dia 1.6B model from Hugging Face in Colab. The video includes links to all necessary resources, including the Colab notebook, the Hugging Face repository, and the GitHub page, for those who want to experiment with this TTS technology themselves.
Syllabus
00:00 Intro / TechCrunch Article
00:13 VentureBeat Article
00:25 Hacker News
00:37 Nari Labs Site
01:07 Toby Kim Tweet or X Post
01:33 SoundStorm Paper
01:52 Parakeet
02:21 Google TPU Research Cloud
02:52 Dia 1.6B Hugging Face Space
03:31 Colab Demo
Taught by
Sam Witteveen