Overview
Explore how PaliGemma extends Gemma 2 with visual capabilities in this 11-minute Google talk. Learn how integrating a SigLIP vision encoder enables pre-training on multiple vision tasks, including captioning, question answering, object detection, and segmentation. Discover how the choice of image resolution and model size gives flexibility in computational requirements, with compute varying by a factor of 155 across configurations. Presented by Andreas Steiner, the talk highlights how fine-tuning PaliGemma on your own data can yield excellent performance, particularly on text-related tasks, making it a valuable multimodal extension to the Gemma model family.
Syllabus
PaliGemma – Making Gemma 2 see by adding a vision encoder
Taught by
Google Developers