Generative Video LLMs - Planning Agents and Multimodal Composition

University of Central Florida via YouTube

Overview

Explore the cutting-edge intersection of generative artificial intelligence and video understanding in this keynote presentation that delves into how large language models are being adapted for video generation, planning agent capabilities, and multimodal composition tasks. Learn about the latest research developments in generative video LLMs from a leading expert who bridges academic research at the University of North Carolina with industry applications at Amazon. Discover how these advanced models can understand, generate, and manipulate video content while incorporating planning mechanisms that enable autonomous agent behavior. Examine the technical challenges and breakthroughs in multimodal composition, where text, visual, and temporal elements are seamlessly integrated to create sophisticated video content. Gain insights into the current state of the field, emerging applications, and future directions for generative video technologies that combine natural language processing with computer vision and temporal reasoning.

Syllabus

Keynote Talk 5: Mohit Bansal, UNC & Amazon

Taught by

UCF CRCV
