Overview
Syllabus
0:00 Is it Wan like “Anne” or “won”?
0:55 The Wan suite of models
1:10 Wan 2.1’s model architecture and research paper
3:50 Wan 2.2 video improvements from Wan 2.1
5:35 Our fine-tuning goal: Conan O’Brien interviewing Will Smith who’s wearing a Denver Broncos shirt
7:30 Base model results
8:55 Wan 2.2’s model architecture
12:55 Fine-tuning: How we created our data
17:12 Fine-tuning: How we fine-tuned each Wan model
19:22 Question: How many images do you need?
20:24 Question: Did we use musubi-tuner?
20:40 Question: How to train camera panning
22:45 Fine-tuning: Comparing images as we fine-tune
29:37 Bringing our Will Smith fine-tuned model into ComfyUI
42:00 Configuring ComfyUI to run our fine-tuned model
47:28 Question: Does the image input format matter?
48:40 Loading our Conan O’Brien fine-tuned model in ComfyUI
57:45 Question: How are the LoRAs loaded into the pipeline?
58:40 Final Results: Conan interviewing Will Smith
Taught by
Oxen