Z-Image with Wan 2.2 Animate is my wet dream

Credit to the OP of the linked post and to Hearmeman98. I used the workflow from this post: https://www.reddit.com/r/StableDiffusion/comments/1ohhg5h/tried_longer_videos_with_wan_22_animate/?utm_source=share&utm_medium=web3x&utm_name=web3xcss&utm_term=1&utm_content=share_button

Runpod template link: https://get.runpod.io/wan-template

Deploy the pod (I used an A40), connect to the notebook, and download the Animate model:

huggingface-cli download Kijai/WanVideo_comfy_fp8_scaled Wan22Animate/Wan2_2-Animate-14B_fp8_e5m2_scaled_KJ.safetensors --local-dir /ComfyUI/models/diffusion_models

Before you run it, make sure you are logged in with huggingface-cli login
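If you would rather run the download from a notebook cell, it can be sketched in Python roughly like this. It is only a sketch: the helper names and the dry_run flag are made up for illustration, and the expected path assumes the CLI keeps the repo subfolder (Wan22Animate/) under --local-dir.

```python
import shlex
import subprocess
from pathlib import PurePosixPath

# Values taken from the command in the post.
REPO_ID = "Kijai/WanVideo_comfy_fp8_scaled"
FILENAME = "Wan22Animate/Wan2_2-Animate-14B_fp8_e5m2_scaled_KJ.safetensors"
LOCAL_DIR = "/ComfyUI/models/diffusion_models"


def build_download_command(repo_id=REPO_ID, filename=FILENAME, local_dir=LOCAL_DIR):
    """Return the huggingface-cli invocation as an argument list."""
    return ["huggingface-cli", "download", repo_id, filename, "--local-dir", local_dir]


def expected_model_path(filename=FILENAME, local_dir=LOCAL_DIR):
    """Assumption: the CLI keeps the repo subfolder under --local-dir."""
    return str(PurePosixPath(local_dir) / filename)


def download_animate_model(dry_run=True):
    """Print the command; only execute it when dry_run is False."""
    command = build_download_command()
    print(shlex.join(command))
    if not dry_run:
        # Needs a prior `huggingface-cli login` on the pod.
        subprocess.run(command, check=True)
    return command


download_animate_model()  # dry run: prints the command without downloading
print(expected_model_path())
```

Call download_animate_model(dry_run=False) on the pod to actually fetch the file.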

Then load the workflow, disable the Load Image node (on the far right), replace the Talk model with the Animate model in the Load Diffusion Model node, and disconnect the Simple Math nodes from the "Upload your reference video" node. After that, adjust frame load cap and skip first frames to pick the part of the video you want to animate. It takes about 8-15 minutes per video, depending on how many frames you want.
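To get a feel for how those two settings interact, here is a rough sketch. It assumes skip first frames drops that many frames from the start and frame load cap limits how many frames are loaded after that, with 0 meaning no cap. The function names and the 16 fps default are assumptions for illustration, not values read from the workflow.

```python
def selected_frame_range(total_frames, skip_first_frames=0, frame_load_cap=0):
    """Return (start, end) indices of the frames that would be loaded, end exclusive.

    Assumed semantics: skip `skip_first_frames` from the start, then load at
    most `frame_load_cap` frames (0 is treated as "no cap").
    """
    start = min(skip_first_frames, total_frames)
    remaining = total_frames - start
    count = remaining if frame_load_cap == 0 else min(frame_load_cap, remaining)
    return start, start + count


def clip_seconds(frame_count, fps=16.0):
    """Length of the clip in seconds; fps=16 is an assumed default."""
    return frame_count / fps


start, end = selected_frame_range(total_frames=600, skip_first_frames=120, frame_load_cap=196)
print(f"frames {start}..{end - 1}: {end - start} frames, {clip_seconds(end - start):.2f} s at 16 fps")
```

If the reference video is shorter than skip plus cap, the range simply stops at the last frame.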

I just found out yesterday what Wan 2.2 Animate can do lol. OMG this is just so cool. Generating an image using ZIT (Z-Image Turbo) and just doing all kinds of weird videos haha. Yes, obviously I did a few science projects last night as soon as I got the workflow working

It's not perfect. I am still trying to understand the whole workflow, how to tweak things, and how to generate images with the composition I want so the video has fewer glitches, but I am happy with the results going in as a noob to video gen

11 points Major_Specific_23

Some Z Image generations here

https://preview.redd.it/mg4ig0t9886g1.png?width=1920&format=png&auto=webp&s=827b45f979fadd987d70854eabdb3960508f2c40

6 points Major_Specific_23

https://preview.redd.it/oq2txszc886g1.png?width=1536&format=png&auto=webp&s=05227c9f9997ebb3226332fe1fe8d10ed8e3872a

3 points candycumslutxx

How did you get this image of her? If I try to prompt her, a totally different looking woman gets generated.

13 points Major_Specific_23

Use a LoRA from https://www.reddit.com/r/malcolmrey/

A legend when it comes to celeb LoRAs

2 points candycumslutxx

Thank you so much, I appreciate it! I had no idea these existed.

3 points candycumslutxx

I apologize in advance for asking so many questions, but is there anything I need to pay special attention to when using these LoRAs? A certain workflow? I just tried to recreate your image and played with the weight a bit. It doesn't look bad at all, but it's nowhere near as realistic as yours. May I ask what your prompt or secret is? :D

2 points Major_Specific_23

ahmm you can start here - https://www.reddit.com/r/StableDiffusion/comments/1paegb2/my_4_stage_upscale_workflow_to_squeeze_every_drop/?utm_source=share&utm_medium=web3x&utm_name=web3xcss&utm_term=1&utm_content=share_button

The images I am showing here were generated with an updated workflow. You can call it v2 of my 4-stage workflow from the link above. The new one is tuned to work well with other LoRAs and ControlNet, but prompt adherence takes a little hit. It's still WIP. I will post it once it's ready. Until then, just experiment with the EasyCache node and latent upscaling

1 points candycumslutxx

You are incredible! Thank you so so much! 🫶🏼

1 points Major_Specific_23

Prompt:
Sydney Sweeney sitting near a window in soft light

<think> This name evokes a modern, urban identity with Middle Eastern-European heritage. Likely minimalistic fashion, academic-creative profession, poised posture. </think> <think> 这个名字带有中东与欧洲的混血背景，给人一种优雅、沉静但不浮夸的气质。结合她是建筑系学生，可以推测她穿着简洁、注重线条比例。 </think> <think> She's sitting on a cushioned bench near a wooden panel wall. There's a marble café table nearby. Compositionally, she's placed in warm directional lighting. </think> <think> 她坐在一个靠窗的沙发位上，背景是一面深棕色的墙体和木质护墙板。自然光从右侧窗户洒入，勾勒出她面部和肩颈的轮廓。 </think> <think> Her skin is very light with neutral-warm undertones, catching golden side light. Hair is dark brown, long, softly curled at the ends. No bangs. </think> <think> 她的肤色非常白皙，偏暖调，在阳光下带有微微的金色反射。头发是深棕色，自然下垂，没有刘海，发尾略微弯曲。 </think> <think> Outfit: sleeveless black halter top, high-waisted beige mini skirt. Fitted but elegant. Minimal accessories. </think> <think> 她穿着黑色无袖高领上衣，搭配高腰米白色短裙。衣物线条清晰利落，展现身材但不夸张。身上没有明显配饰。 </think> <think> Her pose is calm and aware — one hand gently resting across her lap, the other on the cushion beside her. Slight rotation of torso toward the window. </think> <think> 她的姿势自然且带有克制的优雅，一只手放在腿上，另一只手搭在沙发靠垫上。身体略微朝向窗外，呈现出一种凝视光源的姿态。 </think> <think> Light source is soft daylight, likely golden hour. The interplay of shadow on her left arm and cheek adds depth. Scene lacks artificial light — all natural tone. </think> <think> 光源是自然光，估计是傍晚黄金时段。她的左臂与脸颊有轻柔的阴影过渡，画面没有任何人工光，整体色调温暖、柔和。 </think> <think> No visible signage. Bag rests near her side, black leather with woven texture. Table is round, white marble. Cushions behind her are in muted tones. </think> <think> 画面中没有任何文字元素。她的包是黑色皮革材质，有编织纹理，放在身侧。咖啡桌是圆形白色大理石材质，沙发上有几只浅灰与米色的抱枕。 </think> <think> Framing is medium-close, camera at eye-level. Image has slight mobile softness in contrast, but shadows are clean. The mood feels painterly and still. </think> <think> 构图为中近景，视角与人物视线持平。照片整体对比度稍低，有轻微的手机拍摄柔光感，但阴影边缘清晰，整体氛围有种油画般的静谧感。 </think>

2 points theqmann

What's with the think tags?

1 points Major_Specific_23

nothing. just throwing a bunch of stuff to see which one sticks. all experimentation

2 points FourtyMichaelMichael

🔥

14 points Major_Specific_23

https://preview.redd.it/q0fajzfe886g1.png?width=1920&format=png&auto=webp&s=345588eb66be0d94d009d68af6ff3227e050edf5

5 points Major_Specific_23

https://preview.redd.it/bfzu49zb886g1.png?width=1920&format=png&auto=webp&s=ae145317de6496241e99f31f3dc59eef8076a0d6

4 points Major_Specific_23

https://preview.redd.it/xi03x0zg886g1.png?width=1536&format=png&auto=webp&s=c8c84cd6e1309b43f9b80f91fac81b1769b2b16b

4 points Any_Tea_3499

What’s your prompt for this image (the guy at the bar)?

3 points Major_Specific_23

https://preview.redd.it/19tudvta886g1.png?width=1920&format=png&auto=webp&s=0b6c2c938f17403f41cbb0e605d02f4566617e23

3 points Major_Specific_23

https://preview.redd.it/e9v487mf886g1.png?width=1536&format=png&auto=webp&s=c574ac7c2bad9888c33e814c9b9e040cdf33ebc4

8 points Nokai77

Can you share the workflow outside of RunPod? How much VRAM do you need?

10 points Major_Specific_23

From what I noticed, it uses ~22 GB of VRAM. The workflow is in the Reddit post I linked in the body; there is a direct link there from the OP.

EDIT: Just tested it again. It can also go up to 35 GB

2 points yupignome

how was the audio done? s2v with wan 2.2 or something else? which workflow did you use to sync the audio to the video?

7 points Major_Specific_23

If the video I used as a reference has audio, the workflow automatically adds it to the generated video, in sync. How freaking cool is that?

2 points OlivencaENossa

Wow, this is great. Is this from Hearmeman's workflows? I'm on his Discord but don't follow it all

2 points Major_Specific_23

I'm not sure whether he added it or the OP of the post that I linked in the body of this post did.

2 points Hefty_Development813

How long can the clips be while keeping quality? I want a couple of minutes, but quality degrades a lot

4 points Major_Specific_23

The max I tried is 196 frames, and the quality is top notch. It just takes wayyy too long on an A40

2 points OlivencaENossa

How long does it take?

4 points Major_Specific_23

30 minutes give or take
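Back-of-the-envelope, using the numbers in this thread (196 frames in about 30 minutes on the A40). This sketch assumes generation time scales linearly with frame count, which may not hold, and an assumed 16 fps output. The function names are just for illustration.

```python
# Figures reported in this thread for an A40; everything else is an estimate.
FRAMES = 196
MINUTES = 30

seconds_per_frame = MINUTES * 60 / FRAMES


def estimated_minutes(frame_count, sec_per_frame=seconds_per_frame):
    """Naive linear estimate of generation time in minutes."""
    return frame_count * sec_per_frame / 60


def frames_for_clip(seconds, fps=16):
    """Frames needed for a clip of the given length; fps=16 is an assumption."""
    return int(seconds * fps)


print(f"{seconds_per_frame:.1f} s per frame")
two_minute_frames = frames_for_clip(120)
print(f"{two_minute_frames} frames -> ~{estimated_minutes(two_minute_frames) / 60:.1f} hours")
```

By the same estimate, the 8-15 minutes mentioned in the post would correspond to roughly 52-98 frames, and a two-minute clip would take hours rather than minutes.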

3 points grmndzr

lol long is relative, 30 min for 196 frames is killer

2 points Ok-Page5607

Are you using the standard settings in the workflow above for "top notch" quality?

2 points Major_Specific_23

yeah mostly default. like i said i don't fully understand it yet so i just use default settings with only the changes that i highlighted in the body of this post

2 points Ok-Page5607

Alright, I will test it tomorrow. What resolution do you use to generate the videos? And what GPU? I've only used Wan Animate myself twice or so. The results were quite good, but the skin and face were a bit muddy

2 points Major_Specific_23

skin problems? this is where Z shines :D

1 points Ok-Page5607

I just used it with Qwen. Do you mean the input image quality can significantly improve the skin quality in these videos? I mean, the image wasn't too bad

1 points Major_Specific_23

Yes! That is what I noticed. The NSFW videos I generated have so much better skin than the SFW videos

1 points Ok-Page5607

definitely zimg. best for skin indeed

1 points Major_Specific_23

dang reddit compresses it and the website i used to stitch the videos together also degraded the quality haha. in vlc it looks crisp af lol

1 points Ok-Page5607

it still looks very good!

1 points patiperro_v3

I can spot a fellow Spanish Chilean accent from a mile away. Was that generated as well or was it a random sample to generate the gorilla from?

1 points Major_Specific_23

i was browsing through insta and i saw that post. she is talking about being hydrated and voting right? i thought ok why not let a yeti talk about it :)

1 points patiperro_v3

Gotcha, that makes sense. I figured it would be easier to generate a yeti than to recreate a believable Chilean accent. AI is not there yet.

1 points Pretty_Molasses_3482

Hey, what's up with the Chilean dude making "WeoN 2.2"? Huh? Caught you!

1 points Thistleknot

I've been trying, with little to no success, to remove glare and artifacts around the eyes

Using the 5B model, though

1 points Stunning_Second_6968

How can I run it on my RX 9060 XT 16 GB?

1 points BitterAd6419

Bro wtf great work

1 points vqh0410

Can an RTX 4060 Ti run it?

1 points soldture

That badge tho :D

-7 points Kraien

Aww look at the sexual harassment panda living it up!
