DIY OpenAI Vision API App with Speech Recognition - Python, OpenAI, Google Speech Services
Eli the Computer Guy via YouTube
PowerBI Data Analyst - Create visualizations and dashboards from scratch
Build the Finance Skills That Lead to Promotions — Not Just Certificates
Overview
Google, IBM & Meta Certificates — All 10,000+ Courses at 40% Off
One annual plan covers every course and certificate on Coursera. 40% off for a limited time.
Get Full Access
Learn to build an OpenAI Vision API application with speech recognition capabilities using Python, OpenAI, and Google Speech Services in this comprehensive 41-minute tutorial. Explore system architecture, automatic item identification, and full voice communication with a computer vision system. Gain practical insights into code implementation, including handling Pyaudio challenges. Follow along with detailed code explanations and demonstrations to create your own AI-powered vision and speech application.
Syllabus
Introduction
Demonstration
System Architecture
WARNING - Pyaudio is a pain
Automatic Item Identification Script - Code Explaination
Ask Computer About an Item - Code Explanation
Full Voice Communication with a Computer Vision System - Code Explanation
Final Thoughts
Taught by
Eli the Computer Guy