Overview
Explore Google DeepMind's Gemma 3 multimodal model family in this 21-minute tutorial, which tests its capabilities in a local environment using Ollama. Learn how this multilingual model with a 128k context window performs across tasks including coding, data labeling, text summarization, structured data extraction (including vision capabilities), and RAG-based question answering. The video walks through practical examples, from generating hip-hop lyrics to extracting information from tables, providing a comprehensive assessment of the model's strengths and limitations. Perfect for developers and AI enthusiasts who want to understand how Gemma 3's different sizes (1B, 4B, 12B, and 27B) can be run locally for various applications.
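The local workflow shown in the video runs Gemma 3 through Ollama, which exposes a chat endpoint on `localhost`. As a rough sketch of what a request to the 12B model looks like (the `/api/chat` endpoint and its `model`/`messages`/`stream` fields follow Ollama's documented REST interface; the helper function name and prompt are illustrative, not from the video):

```python
import json

def build_chat_request(model: str, prompt: str) -> dict:
    """Build the JSON body for Ollama's /api/chat endpoint."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,  # ask for a single response instead of a token stream
    }

req = build_chat_request("gemma3:12b", "Label the sentiment of: 'This tutorial was great!'")
print(json.dumps(req))
# POST this body to http://localhost:11434/api/chat once the model has been
# pulled with `ollama pull gemma3:12b` and the Ollama server is running.
```

The same request shape works for the tutorial's other tasks (summarization, data labeling, RAG prompts) by swapping the prompt text.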
Syllabus
Ollama model: https://ollama.com/library/gemma3:12b
00:00 - Gemma 3 overview
03:30 - Ollama model
03:42 - Notebook setup
04:53 - Hip Hop lyrics
06:06 - Coding
09:26 - Data labeling
11:10 - Text summarization
12:30 - LinkedIn post
13:56 - Structured data extraction with vision test
17:06 - RAG/Question-answering
18:52 - Table data extraction
19:42 - Conclusion
Taught by
Venelin Valkov