Overview
This 24-minute conference talk by Romain Huet, Head of Developer Relations at OpenAI, was delivered at the AI Engineer World's Fair in San Francisco. In a single presentation, it demonstrates GPT-4o omnimodel voice, ChatGPT Desktop, Sora video generation, and Voice Engine. The talk traces the shift from text-based interaction to voice and vision capabilities, and covers practical applications of multimodal AI systems that combine text, visual, and audio processing. It also discusses the technical ideas behind OpenAI's omnimodel approach and how they are shaping the next generation of AI-powered applications and user experiences.
Syllabus
From Text to Vision to Voice: Exploring Multimodality with OpenAI: Romain Huet
Taught by
AI Engineer