Overview
This video explores visual reasoning capabilities in AI systems, examining both recent research algorithms and real-world applications. It analyzes "Reason-RFT: Reinforcement Fine-Tuning for Visual Reasoning," a paper published by researchers from Peking University, the Beijing Academy of Artificial Intelligence, the Chinese Academy of Sciences, and the University of Chinese Academy of Sciences. It also covers the current limitations of visual reasoning in Vision Language Models (VLMs), illustrated through personal experiences with commercial AI systems. The 22-minute presentation provides insight into the gap between research claims and the practical performance of visual reasoning technologies.
Syllabus
Failure of AI "Visual Reasoning" in VLMs
Taught by
Discover AI