Milliseconds to Magic - Real-Time Workflows using the Gemini Live API and Pipecat

AI Engineer via YouTube

Overview

This 22-minute conference talk from the AI Engineer World's Fair covers Google's Gemini Live API, powered by Gemini 2.5 Flash, and how combining it with Pipecat gives developers real-time multimodal capabilities. The talk focuses on session management, turn detection, tool use (including async function calls), proactivity, multilingual support, and integration with telephony and other infrastructure. Demonstrations showcase these capabilities, and customer use cases show how Pipecat extends real-time multimodal features to client-side applications such as customer support agents, gaming agents, and tutoring agents. The talk also introduces Google's experimental native audio offering, which enables seamless, emotive, steerable, multilingual dialogue for use cases where natural voices provide significant differentiation.

The speakers are Kwindla Kramer, creator of the open-source Pipecat voice agent framework and WebRTC infrastructure expert at Daily, and Shrestha Basu Mallick, Group Product Manager and product lead for the Gemini API at Google DeepMind, who brings extensive experience in AI assistance and product development across Google's coding surfaces.

Syllabus

Milliseconds to Magic: Real‑Time Workflows using the Gemini Live API and Pipecat

Taught by

AI Engineer
