DIY OpenAI Vision API App with Speech Recognition - Python, OpenAI, Google Speech Services
Eli the Computer Guy via YouTube
Start speaking a new language. It’s just 3 weeks away.
Stuck in Tutorial Hell? Learn Backend Dev the Right Way
Overview
AI, Data Science & Cloud Certificates from Google, IBM & Meta — 40% Off
One plan covers every Professional Certificate on Coursera. 40% off Coursera Plus Annual.
Unlock All Certificates
Learn to build an OpenAI Vision API application with speech recognition capabilities using Python, OpenAI, and Google Speech Services in this comprehensive 41-minute tutorial. Explore system architecture, automatic item identification, and full voice communication with a computer vision system. Gain practical insights into code implementation, including handling Pyaudio challenges. Follow along with detailed code explanations and demonstrations to create your own AI-powered vision and speech application.
Syllabus
Introduction
Demonstration
System Architecture
WARNING - Pyaudio is a pain
Automatic Item Identification Script - Code Explaination
Ask Computer About an Item - Code Explanation
Full Voice Communication with a Computer Vision System - Code Explanation
Final Thoughts
Taught by
Eli the Computer Guy