Credits to the post OP and Hearmeman98. Used the workflow from this post - https://www.reddit.com/r/StableDiffusion/comments/1ohhg5h/tried_longer_videos_with_wan_22_animate/?utm_source=share&utm_medium=web3x&utm_name=web3xcss&utm_term=1&utm_content=share_button
Runpod template link: https://get.runpod.io/wan-template
You just have to deploy the pod (I used an A40). Connect to the notebook and download the Animate model by running:
huggingface-cli download Kijai/WanVideo_comfy_fp8_scaled Wan22Animate/Wan2_2-Animate-14B_fp8_e5m2_scaled_KJ.safetensors --local-dir /ComfyUI/models/diffusion_models
Before you run it, just make sure you log in using huggingface-cli login
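If you would rather do the download from a notebook cell in Python, here is a minimal sketch of the same step. It assumes the `huggingface_hub` package is installed in the pod and that you have already logged in; the helper names `expected_model_path` and `download_model` are made up for illustration. The repo, filename, and target folder are the ones from the command above.

```python
from pathlib import PurePosixPath

REPO_ID = "Kijai/WanVideo_comfy_fp8_scaled"
FILENAME = "Wan22Animate/Wan2_2-Animate-14B_fp8_e5m2_scaled_KJ.safetensors"
LOCAL_DIR = "/ComfyUI/models/diffusion_models"


def expected_model_path(local_dir: str = LOCAL_DIR, filename: str = FILENAME) -> str:
    """Path where the file should land: the repo subfolder is kept under local_dir."""
    return str(PurePosixPath(local_dir) / filename)


def download_model() -> str:
    """Download the checkpoint and return its local path (needs network access)."""
    # Imported lazily so the path helper works even without huggingface_hub installed.
    from huggingface_hub import hf_hub_download

    return hf_hub_download(repo_id=REPO_ID, filename=FILENAME, local_dir=LOCAL_DIR)


# Only prints the target path; call download_model() yourself to start the download.
print(expected_model_path())
```

Note that the file ends up inside a `Wan22Animate` subfolder of `diffusion_models`, so that is where to look for it when picking the model in the workflow.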
Then load the workflow, disable the Load Image node (on the far right), replace the Talk model with the Animate model in the Load Diffusion Model node, disconnect the Simple Math nodes from the "Upload your reference video" node, and then adjust the frame load cap and skip first frames to cover the part of the video you want to animate. It takes about 8-15 minutes for one video (depending on how many frames you want).
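For picking the frame load cap and skip first frames values, a rough sketch of the arithmetic might help. It assumes both settings count frames of the reference video at its own frame rate, with no frame-rate forcing or every-nth-frame selection in the loader. The function name `frame_window` and the example numbers are hypothetical; the key names just mirror the wording above.

```python
def frame_window(start_s: float, duration_s: float, source_fps: float) -> dict:
    """Convert a time window in the reference video into the two loader settings."""
    if start_s < 0 or duration_s <= 0 or source_fps <= 0:
        raise ValueError("start_s must be >= 0; duration_s and source_fps must be > 0")
    # Frames to drop from the start of the clip, rounded to a whole frame.
    skip_first_frames = round(start_s * source_fps)
    # Number of frames to load after the skipped part, rounded to a whole frame.
    frame_load_cap = round(duration_s * source_fps)
    return {"skip_first_frames": skip_first_frames, "frame_load_cap": frame_load_cap}


# Example: animate 5 seconds starting at the 2 second mark of a 30 fps reference video.
settings = frame_window(start_s=2.0, duration_s=5.0, source_fps=30.0)
print(settings)  # {'skip_first_frames': 60, 'frame_load_cap': 150}
```

Check the actual frame rate of your reference video before trusting the numbers, since that is the one input the sketch cannot guess.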
I just found out what Wan 2.2 animate can do yesterday lol. OMG this is just so cool. Generating an image using ZIT and just doing all kinds of weird videos haha. Yes, obviously I did a few science projects last night as soon as I got the workflow working
It's not perfect. I am still trying to understand the whole workflow, how to tweak things, and how to generate images with the composition I want so the video has fewer glitches, but I am happy with the results going in as a noob to video gen
Some Z Image generations here
https://preview.redd.it/mg4ig0t9886g1.png?width=1920&format=png&auto=webp&s=827b45f979fadd987d70854eabdb3960508f2c40
https://preview.redd.it/oq2txszc886g1.png?width=1536&format=png&auto=webp&s=05227c9f9997ebb3226332fe1fe8d10ed8e3872a
How did you get this image of her? If I try to prompt her, a totally different looking woman gets generated.
use lora from https://www.reddit.com/r/malcolmrey/
a legend when it comes to celeb loras
Thank you so much, I appreciate it! I had no idea these existed.
I apologize in advance for asking so many questions but is there anything I need to pay special attention to, when using these Loras? A certain workflow? I just tried to recreate your image, played with the weight a bit and it doesn't look bad at all but nowhere near as realistic as yours. May I ask what your prompt or secret is? :D
ahmm you can start here - https://www.reddit.com/r/StableDiffusion/comments/1paegb2/my_4_stage_upscale_workflow_to_squeeze_every_drop/?utm_source=share&utm_medium=web3x&utm_name=web3xcss&utm_term=1&utm_content=share_button
The images I am showing here are generated using an updated workflow. You can call it v2 of my 4 stage workflow from the above link. The new one is tuned to work well with other LoRAs and ControlNet, but prompt adherence takes a little hit. It's still WIP. I will post it once it's ready. Until then, just experiment with the EasyCache node and Latent upscale
You are incredible! Thank you so so much! 🫶🏼
Prompt:
Sydney Sweeney sitting near a window in soft light
<think> This name evokes a modern, urban identity with Middle Eastern-European heritage. Likely minimalistic fashion, academic-creative profession, poised posture. </think> <think> 这个名字带有中东与欧洲的混血背景,给人一种优雅、沉静但不浮夸的气质。结合她是建筑系学生,可以推测她穿着简洁、注重线条比例。 </think> <think> She's sitting on a cushioned bench near a wooden panel wall. There's a marble café table nearby. Compositionally, she's placed in warm directional lighting. </think> <think> 她坐在一个靠窗的沙发位上,背景是一面深棕色的墙体和木质护墙板。自然光从右侧窗户洒入,勾勒出她面部和肩颈的轮廓。 </think> <think> Her skin is very light with neutral-warm undertones, catching golden side light. Hair is dark brown, long, softly curled at the ends. No bangs. </think> <think> 她的肤色非常白皙,偏暖调,在阳光下带有微微的金色反射。头发是深棕色,自然下垂,没有刘海,发尾略微弯曲。 </think> <think> Outfit: sleeveless black halter top, high-waisted beige mini skirt. Fitted but elegant. Minimal accessories. </think> <think> 她穿着黑色无袖高领上衣,搭配高腰米白色短裙。衣物线条清晰利落,展现身材但不夸张。身上没有明显配饰。 </think> <think> Her pose is calm and aware — one hand gently resting across her lap, the other on the cushion beside her. Slight rotation of torso toward the window. </think> <think> 她的姿势自然且带有克制的优雅,一只手放在腿上,另一只手搭在沙发靠垫上。身体略微朝向窗外,呈现出一种凝视光源的姿态。 </think> <think> Light source is soft daylight, likely golden hour. The interplay of shadow on her left arm and cheek adds depth. Scene lacks artificial light — all natural tone. </think> <think> 光源是自然光,估计是傍晚黄金时段。她的左臂与脸颊有轻柔的阴影过渡,画面没有任何人工光,整体色调温暖、柔和。 </think> <think> No visible signage. Bag rests near her side, black leather with woven texture. Table is round, white marble. Cushions behind her are in muted tones. </think> <think> 画面中没有任何文字元素。她的包是黑色皮革材质,有编织纹理,放在身侧。咖啡桌是圆形白色大理石材质,沙发上有几只浅灰与米色的抱枕。 </think> <think> Framing is medium-close, camera at eye-level. Image has slight mobile softness in contrast, but shadows are clean. The mood feels painterly and still. </think> <think> 构图为中近景,视角与人物视线持平。照片整体对比度稍低,有轻微的手机拍摄柔光感,但阴影边缘清晰,整体氛围有种油画般的静谧感。 </think>
What's with the think tags?
nothing. just throwing a bunch of stuff to see which one sticks. all experimentation
🔥
https://preview.redd.it/q0fajzfe886g1.png?width=1920&format=png&auto=webp&s=345588eb66be0d94d009d68af6ff3227e050edf5
https://preview.redd.it/bfzu49zb886g1.png?width=1920&format=png&auto=webp&s=ae145317de6496241e99f31f3dc59eef8076a0d6
https://preview.redd.it/xi03x0zg886g1.png?width=1536&format=png&auto=webp&s=c8c84cd6e1309b43f9b80f91fac81b1769b2b16b
What’s your prompt for this image (the guy at the bar)?
https://preview.redd.it/19tudvta886g1.png?width=1920&format=png&auto=webp&s=0b6c2c938f17403f41cbb0e605d02f4566617e23
https://preview.redd.it/e9v487mf886g1.png?width=1536&format=png&auto=webp&s=c574ac7c2bad9888c33e814c9b9e040cdf33ebc4
Can you share the workflow outside of RunPod? How much VRAM do you need?
what I noticed is that it uses ~22 GB of VRAM. The workflow is in the Reddit post I linked in the body. There is a direct link there from the OP
EDIT: just tested it. It also goes up to 35 GB
how was the audio done? s2v with wan 2.2 or something else? which workflow did you use to sync the audio to the video?
if the video I used as a reference has audio, the workflow automatically adds it to the generated video in sync. how freaking cool is that?
wow, this is great. Is this from Hearmeman's workflows? I'm on his Discord but don't follow it all
I am not so sure if it's him who added it or the OP of the post that I linked in the body of this post.
How long of clips can you do with quality? I want like a couple minutes, but quality degrades a lot
the max I tried is 196 frames and the quality is top notch. it just takes wayyy too long on an A40
how long does it take
30 minutes give or take
lol long is relative, 30 min for 196 frames is killer
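To put the numbers above in perspective, here is a back-of-the-envelope sketch in Python. It takes the reported 196 frames in roughly 30 minutes on an A40 and scales it linearly, which is an assumption (long runs may not scale linearly, and quality on long clips is a separate issue). The 16 fps output rate is also just an assumed value, so swap in whatever your workflow actually produces.

```python
def seconds_per_frame(frames: int, minutes: float) -> float:
    """Average render time per frame for a finished run."""
    return minutes * 60.0 / frames


def estimate_minutes(target_frames: int, rate_s_per_frame: float) -> float:
    """Linear extrapolation of render time for a longer clip."""
    return target_frames * rate_s_per_frame / 60.0


# Reported in the thread: 196 frames in about 30 minutes on an A40.
RATE = seconds_per_frame(frames=196, minutes=30)

ASSUMED_FPS = 16  # hypothetical output frame rate, not confirmed in the thread
two_minute_frames = 2 * 60 * ASSUMED_FPS

print(f"{RATE:.1f} s per frame")  # 9.2 s per frame
hours = estimate_minutes(two_minute_frames, RATE) / 60
print(f"{hours:.1f} hours for {two_minute_frames} frames")  # 4.9 hours for 1920 frames
```

So under these assumptions a two minute clip would be an afternoon of GPU time on the same card, not a coffee break.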
Are you using the standard settings in the workflow above for "top notch" quality?
yeah mostly default. like i said i don't fully understand it yet so i just use default settings with only the changes that i highlighted in the body of this post
Alright, I will test it tomorrow. What resolution do you use to generate the videos? And what GPU? I've only used Wan Animate myself twice or so. The results were quite good, but the skin and face were a bit muddy
skin problems? this is where Z shines :D
I just used it with Qwen. Do you mean the input image quality can significantly improve the skin quality in these videos? I mean the image wasn't too bad
Yes! it is what i noticed. the NSFW videos i generated have so much better skin in the videos than the SFW videos
definitely zimg. best for skin indeed
dang reddit compresses it and the website i used to stitch the videos together also degraded the quality haha. in vlc it looks crisp af lol
it still looks very good!
I can spot a fellow Spanish Chilean accent from a mile away. Was that generated as well or was it a random sample to generate the gorilla from?
i was browsing through insta and i saw that post. she is talking about being hydrated and voting right? i thought ok why not let a yeti talk about it :)
Gotcha, that makes sense. I figured it would be easier to generate a yeti than to recreate a believable Chilean accent. AI is not there yet.
Hey, what's up with the Chilean weón doing WeoN 2.2? Huh? Caught you!
I've been trying w little to no success w removing glare and artifacts around eyes
using 5b tho
How do I run it on my RX 9060 XT 16 GB?
Bro wtf great work
Can an RTX 4060 Ti run it?
That badge tho :D
Aww look at the sexual harassment panda living it up!