Overview
Syllabus
0:00 Google GenAI Python SDK 1.0 and Gemini 2.0 Flash
0:31 Gemini 2.0 Flash is cheap, multi-modal, and has a large context window
2:00 Project #1 - Extract themes and companies from 126-page Citrini Research PDF
4:00 Project #2 - Extract predictions from Dylan Patel on Lex Fridman podcast
4:57 Why is information from multimodal sources interesting?
6:28 Project #3 - Backtest predictions from financial gurus on YouTube
7:04 Get the Python code on my GitHub
7:44 Project setup, virtual environment, packages
8:45 Getting a Gemini API Key, setting the environment variable
9:39 Python code - Image understanding of an IPO Pulse image
14:20 Python code - Structured extraction of trade themes from a Substack report
21:27 Python code - Can we do structured extraction on a 5-hour podcast in one API call?
23:53 I still recommend chunking in cases like this
24:48 Shell script - using ffmpeg to split audio files into slices for better results
25:52 Python code - processing a directory of audio files for better structured extraction
26:25 Provocative predictions extracted from the podcast
28:03 The Gemini app can watch YouTube videos and extract information
29:36 Sometimes you can see the future by watching what developers are doing
30:09 Code to process a YouTube video with the API
31:03 Conclusion - Gemini 2.0 Flash is worth it!
Taught by
Part Time Larry