The End of Awkward AI Transcriptions - Building World-Class Speech AI with NVIDIA
AI Engineer via YouTube
Overview
This 16-minute conference talk covers NVIDIA's recent work in automatic speech recognition and conversational AI. Learn how NVIDIA placed 6 models in the top ten of the Hugging Face ASR leaderboard, and how its Parakeet2 model performs on speech recognition accuracy and speed. Follow the development process behind its conversational AI systems, from open-source research foundations to enterprise-ready NIM microservices designed to scale across different infrastructure. See how NVIDIA's developer ecosystem addresses practical challenges in building call center agents, video dubbing tools, and digital humans through Python-first frameworks, configurators, and an open-source community that supports rapid iteration and integration. Examine enterprise deployments of multilingual, noise-robust, and customizable voice agents operating at scale, including digital human blueprints for creating interactive avatars. Gain insight into the underlying conversational AI stack and its applications in customer experience, accessibility, and global communication.
Syllabus
The End of Awkward AI Transcriptions - Travis Bartley and Myungjong Kim
Taught by
AI Engineer