Gemini's Multimodal Capabilities - Deep Dive into Native Multimodality and AI Vision
- 1 0:00 - Intro
- 2 1:12 - Why Gemini is natively multimodal
- 3 2:23 - The technology behind multimodal models
- 4 5:15 - Video understanding with Gemini 2.5
- 5 9:25 - Deciding what to build next
- 6 13:23 - Building new product experiences with multimodal AI
- 7 17:15 - The vision for proactive assistants
- 8 24:13 - Improving video usability with variable FPS and frame tokenization
- 9 27:35 - What’s next for Gemini’s multimodal development
- 10 31:47 - Deep dive on Gemini’s document understanding capabilities
- 11 37:56 - The teamwork and collaboration behind Gemini
- 12 40:56 - What’s next with model behavior