Visual Question Answering: Grounded Systems and Transformer Capsules
University of Central Florida via YouTube
Syllabus
Intro
Grounded Visual Question Answering
Limitations of Existing VQA Systems
Grounded VQA Systems
Problem Setup
Transformers with Capsules
Approach
Capsule-based Tokens
Input to Intermediate Transformer Layers
Text-based Residual Connection
Pre-training Tasks
Masked Language Modeling (MLM)
Image Text Matching
Pre-training Datasets
Fine-tuning on Downstream Task
Qualitative Comparison - GQA
Evaluation Metrics
Results - GQA
Conclusion and Future Work
Taught by
UCF CRCV