Overview
Explore how large language models with no visual training can still produce visual representations, through an experiment that uses parallel API calls to create grid-based drawings. Learn how to make hundreds of simultaneous requests to an LLM, where each grid square's coordinate becomes an individual API call that determines whether the square should be filled or left empty. Discover the methodology for testing whether "blind" language models can conceptually "see" and create visual patterns, inspired by research on how models without visual training can still understand spatial relationships. Access downloadable project files to replicate the experiment and understand how to coordinate many API calls for a visual generation task.
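The one-request-per-square idea described above can be sketched roughly as follows. This is not the course's project code. The grid size, the `ask_model` function, the prompt wording, and the concurrency limit are all assumptions made for illustration, and `ask_model` is a hypothetical offline stand-in that fills the border of the grid instead of calling a real LLM API.

```python
import asyncio

GRID_SIZE = 8  # assumed grid dimensions, not taken from the course


async def ask_model(subject: str, row: int, col: int) -> str:
    """Hypothetical stand-in for one LLM API call.

    A real version would send a prompt such as "We are drawing {subject} on
    an 8x8 grid. Should square (row, col) be filled? Answer yes or no."
    Here a fixed rule (fill the border) is used so the sketch runs offline.
    """
    await asyncio.sleep(0)  # yield control, as a network call would
    on_border = row in (0, GRID_SIZE - 1) or col in (0, GRID_SIZE - 1)
    return "yes" if on_border else "no"


async def draw(subject: str, max_concurrent: int = 16) -> list[list[bool]]:
    """Issue one request per grid square and collect the answers into a grid."""
    limiter = asyncio.Semaphore(max_concurrent)  # cap in-flight requests

    async def cell(row: int, col: int) -> bool:
        async with limiter:
            answer = await ask_model(subject, row, col)
        return answer.strip().lower().startswith("yes")

    coords = [(r, c) for r in range(GRID_SIZE) for c in range(GRID_SIZE)]
    # gather returns results in the same order as the submitted coroutines
    results = await asyncio.gather(*(cell(r, c) for r, c in coords))
    return [results[r * GRID_SIZE:(r + 1) * GRID_SIZE] for r in range(GRID_SIZE)]


def render(grid: list[list[bool]]) -> str:
    """Turn the boolean grid into text, '#' for filled and '.' for empty."""
    return "\n".join("".join("#" if filled else "." for filled in row) for row in grid)


if __name__ == "__main__":
    print(render(asyncio.run(draw("a square"))))
```

Replacing `ask_model` with a real client call would leave the rest of the structure unchanged. The semaphore is there because hundreds of simultaneous requests can run into provider rate limits.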
Syllabus
What can a blind LLM see?
Taught by
echohive