Overview
Explore the technical foundations and scaling strategies for building AI agent swarms that handle real-time voice interactions with sub-second latency in this conference talk. Learn best practices for implementing multimodal AI agents that combine low-latency speech-to-text, large language models with retrieval-augmented generation (RAG), fine-tuning, and text-to-speech synthesis. Discover how to architect systems that scale from individual conversations to thousands or millions of concurrent voice interactions. Follow open-source code demonstrations that show the complete pipeline for creating conversational AI agents, with attention to end-to-end latency optimization. Understand the technical considerations for speech processing, model integration, and infrastructure scaling, along with the practical challenges of deploying real-time voice AI at enterprise scale. The provided GitHub repositories contain both client-side and server-side implementations to build on; working with them requires basic web development tools and audio hardware for testing voice interactions.
Syllabus
Building and Scaling an AI Agent Swarm of low latency real time voice bots: Damien Murphy
Taught by
AI Engineer