Classroom Contents
Multimodal Data Extraction with Gemini 2.0 Flash and Google GenAI Python SDK
- 1 0:00 Google GenAI Python SDK 1.0 and Gemini 2.0 Flash
- 2 0:31 Gemini 2.0 Flash is cheap, multimodal, and has a large context window
- 3 2:00 Project #1 - Extract themes and companies from 126-page Citrini Research PDF
- 4 4:00 Project #2 - Extract predictions from Dylan Patel on Lex Fridman podcast
- 5 4:57 Why is information from multimodal sources interesting?
- 6 6:28 Project #3 - Backtest predictions from financial gurus on YouTube
- 7 7:04 Get the Python code on my GitHub
- 8 7:44 Project setup, virtual environment, packages
- 9 8:45 Getting a Gemini API Key, setting the environment variable
- 10 9:39 Python code - Image understanding of an IPO Pulse image
- 11 14:20 Python code - Structured extraction of trade themes from a Substack report
- 12 21:27 Python code - Can we do structured extraction on a 5-hour podcast in one API call?
- 13 23:53 I still recommend chunking in cases like this
- 14 24:48 Shell script - using ffmpeg to split audio files into slices for better results
- 15 25:52 Python code - processing a directory of audio files for better structured extraction
- 16 26:25 Provocative predictions extracted from the podcast
- 17 28:03 The Gemini app can watch YouTube videos and extract information
- 18 29:36 Sometimes you can see the future by watching what developers are doing
- 19 30:09 Code to process a YouTube video with the API
- 20 31:03 Conclusion - Gemini 2.0 Flash is worth it!