YouTube

What Can a Blind LLM See? - Exploring Visual Perception Through Grid-Based Drawing

echohive via YouTube

Overview

Explore whether large language models with no visual training can still produce visual representations, using an experiment that builds grid-based drawings from parallel API calls. Each grid square coordinate becomes its own API call, and the model answers whether that square should be filled or left empty, so a single drawing requires hundreds of simultaneous requests. The video walks through the methodology for testing whether "blind" language models can conceptually "see" and create visual patterns, inspired by research suggesting that models without visual training can still understand spatial relationships. Downloadable project files are available so you can replicate the experiment and study how the parallel API calls are coordinated.
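The one-call-per-square idea described above can be sketched roughly as follows. This is a minimal illustration, not the code from the video or its project files: `ask_model_about_square`, `draw_grid`, `render`, the grid size, and the concurrency limit are all hypothetical names and values, and the model call is replaced by a local stub that pretends the model was asked to draw a circle, so the sketch runs without an API key.

```python
import asyncio

GRID_SIZE = 8  # hypothetical grid width and height


async def ask_model_about_square(subject: str, row: int, col: int, size: int) -> bool:
    """Stand-in for ONE LLM API call about ONE grid square.

    A real version would send a prompt along the lines of "On a size x size
    grid, should square (row, col) be filled to draw <subject>?" and parse a
    yes/no answer. This stub ignores `subject` and fills squares that fall
    inside a circle, purely so the example is deterministic.
    """
    await asyncio.sleep(0)  # yield control, as a network request would
    center = (size - 1) / 2
    radius = size / 2 - 0.5
    return (row - center) ** 2 + (col - center) ** 2 <= radius ** 2


async def draw_grid(subject: str, size: int = GRID_SIZE, max_concurrency: int = 50):
    """Issue one request per square concurrently and assemble the grid."""
    semaphore = asyncio.Semaphore(max_concurrency)  # cap in-flight requests

    async def one_square(row: int, col: int) -> bool:
        async with semaphore:
            return await ask_model_about_square(subject, row, col, size)

    tasks = [one_square(r, c) for r in range(size) for c in range(size)]
    answers = await asyncio.gather(*tasks)  # results keep the order of tasks
    return [list(answers[r * size:(r + 1) * size]) for r in range(size)]


def render(grid) -> str:
    """Turn the boolean grid into text: '#' for filled, '.' for empty."""
    return "\n".join("".join("#" if filled else "." for filled in row) for row in grid)


if __name__ == "__main__":
    print(render(asyncio.run(draw_grid("circle"))))
```

The semaphore is one possible way to keep hundreds of simultaneous requests under a provider's rate limit; swapping the stub for a real client call would leave the rest of the structure unchanged.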

Syllabus

What can a blind LLM see?

Taught by

echohive
