Overview
Learn to set up and run advanced image and video generation models on AMD Radeon AI PRO R9700 hardware using ComfyUI and ROCm. Explore the complete software configuration process, including a custom Docker-based toolbox and model management scripts designed for this GPU architecture. Master the use of several state-of-the-art diffusion models: Qwen Image for both generation and editing, Wan 2.2 for video creation, and Hunyuan 1.5 for video synthesis. Practice running workflows with different quantization settings and LoRAs to optimize performance within the card's 32 GB of VRAM. Discover techniques for text-to-image generation, image editing with multiple input images, text-to-video creation, and image-to-video conversion across different model architectures. Understand how to benchmark performance across different step counts and acceleration methods, from standard 20-step workflows to optimized 4-step Lightning LoRA workflows. Gain practical experience with ROCm performance logging and learn how to contribute performance data to the development community to support hardware optimization.
Syllabus
Introduction
Setting Up the Toolbox & Model Manager
Qwen Image Standard 20 Steps
Qwen Image 4-Step Lightning LoRA
Qwen Image Edit Standard 20 Steps
Qwen Image Edit 4-Step Lightning LoRA
Using Multiple Input Images: Person + Clothes
Text-to-Video: Wan 2.2 4-Step LoRA
Image-to-Video: Wan 2.2 4-Step LoRA
Text-to-Video: Hunyuan 1.5 4-Step LoRA
Image-to-Video: Hunyuan 1.5 4-Step LoRA
Improving ROCm Performance Logs
Conclusion
Taught by
Donato Capitella