Overview
Learn to build multimodal, vision-enabled agents with Haystack in this 52-minute tutorial led by Bilge Yucel, Developer Advocate at Deepset. The session combines large language model reasoning with visual understanding: you extend agents with vision-language models so they can process both images and PDFs, then construct an end-to-end agent that answers questions from textual and visual content. The tutorial closes with deployment, using Open WebUI and Hayhooks to serve the multimodal agent, and gives hands-on experience building AI systems that process text and visual data together.
Syllabus
Tutorial: Building Vision-Enabled Agents with Haystack by Deepset
Taught by
Data Science Dojo