Gemini TTS - Native Audio Out: Using Text-to-Speech for Speech and Dialogue
Sam Witteveen via YouTube
Overview
This tutorial explores Gemini TTS (text-to-speech), released at Google I/O, and demonstrates how to generate speech and dialogue with it. It introduces the new Gemini 2.5 speech generation features through practical demonstrations in Google AI Studio and Colab notebooks. Follow along with a single-speaker implementation, then see how to create multi-speaker, podcast-style content using native audio output. The video includes step-by-step code examples and discusses potential applications of the technology. It is suited to developers who want to add text-to-speech to their projects and to anyone curious about recent advances in AI-generated speech.
Syllabus
00:00 Intro
00:28 New Gemini 2.5 Speech Generation Text-to-Speech
01:44 Google AI Studio: Native Speech Generation
02:37 Colab Demo: Single Speaker
08:51 Colab Demo: Multi-Speaker Podcast
Taught by
Sam Witteveen
Reviews
5.0 rating, based on 1 Class Central review
That is so helpful for a beginner. It is a basic lesson on text-to-speech that provides a good introduction with examples and clear explanations. Thank you so much.