Full res comparisons and images with embedded workflows available here.
I had multiple people insist to me over the last few hours that CFG and negative prompts do not work with Z-Image Turbo.
Based on my own cursory experience to the contrary, I decided to investigate further, and I feel I can fairly definitively say that CFG and negative prompting absolutely have an impact (and a potentially useful one) on Z-Image Turbo outputs.
Granted: you really have to up the steps for high guidance not to totally fry the image; some scheduler/sampler combos work better with higher CFG than others; and Z-image negative prompting works less well/reliably than it did for SDXL.
Nevertheless, it does seem to work to an extent.
That's a rough 38... This guy is at least 48 yrs old tho.
Don't smoke, kids
Same way all Asians look young to us, we must look really old to them.
It’s in the skincare and SPF.
In the States, skincare products tend to be viewed as an older woman thing, whereas in many Asian countries (especially Korea), it’s viewed as a standard step of personal hygiene.
It also doesn’t help that the US has effectively banned any new SPF products, so we’re about 40 years behind the curve unless you import them.
Yeah dude, I’m like mid 30s and this dude looks old AF to me; I had to double-check the mirror.
Yes, the main problem is the hair: too much white hair.
Some people's hair starts going grey at 18, others' at 50; it is totally individual.
I've been nearly as grey as him since 25, mate. His looks particularly "bad" for his age because his non-grey hair colour is light brown.
He dyes it gray.
No, this looks about right. I know plenty of white people around this age to 45 and they're old like this. They get mad all the time when I guess their age around 50, lol. I'm in my mid 30s and people keep guessing my age around mid to late 20s.
It's still better to use NAG on distilled models though.
https://www.reddit.com/r/StableDiffusion/comments/1pbrbrt/nag_normalized_attention_guidance_works_on_zimage/
https://preview.redd.it/bc84vrwh966g1.png?width=3072&format=png&auto=webp&s=8cb57c7beafc6b9899e2a781c6df947a12b017fc
Try removing items by putting it in negative, just like OP did, just to prove NAG has the same effect.
For me, usually getting the CFG to 1.2 is enough to preserve style and allow negs to work.
In my tests, something I found is that the more negs you add the higher you need to take your CFG. Based on my (puny) understanding of the multidimensional latent space, this is not surprising.
It really drops in speed when you go past CFG 1 tho.
It's good at around 1.4-7 CFG; it actually improves the images and prompt adherence a decent bit too. Who decided CFG didn't work, other than people who didn't actually try it?
Also, any robust LoRA that isn't a single concept will undo some of the distillation, requiring more steps and CFG. So if you use a high-end LoRA you might have to do these things anyway.
CFG can work. But on average it's harmful to the distilled model.
You're brute-forcing it (while also increasing render time) to go against its training.
A distilled model mimics the teacher model's CFG behaviour, essentially baking in the guidance scale taught by the base/teacher model. That lets it reach the guided result in far fewer steps, with the tradeoff of little variation/versatility.
In other words, CFG is already "baked into" the model, making it "useless" to toggle.
By using it, you're pretty much losing the benefits of having a distilled model in the first place while arguably not gaining much.
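For anyone who wants the mechanics: with CFG > 1 the sampler runs the model twice per step (once on the positive prompt, once on the negative/empty prompt) and pushes the prediction away from the unconditional one. A minimal, generic sketch of that combine step, not Z-Image's actual code, looks roughly like this:

```python
import torch

def cfg_combine(noise_uncond: torch.Tensor, noise_cond: torch.Tensor, scale: float) -> torch.Tensor:
    """Classic classifier-free guidance combine, applied once per sampling step.

    scale == 1.0 collapses to the plain conditional prediction, which is the
    single-pass regime a distilled/turbo model was trained for; scale > 1.0
    requires the second (unconditional) forward pass, doubling the cost.
    """
    return noise_uncond + scale * (noise_cond - noise_uncond)
```

That second forward pass is also why everyone in this thread sees roughly 2x generation time once CFG goes above 1.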
Do you mean that e.g. negatives are baked in? Like a distilled model would have difficulty producing 6 or 4 fingers, because unwanted elements were kinda baked in as negatives?
What about nodes such as skimmed CFG?
I mean, it's clearly not ideal, especially compared to the way it works with something like SDXL.
Nevertheless, it does work in a pinch and, somewhat interestingly, does seem to help create a smidge more output diversity.
This node generates great image variance with Z-Image and is tuneable: https://github.com/ChangeTheConstants/SeedVarianceEnhancer
Of course it will create diversity. The whole point of a distilled model is to ramp up speed by killing off the CFG interference.
Please look up what CFG is and how distilled models work. You'll understand why people are telling you "it doesn't work".
SDXL base (and most models used by the community) isn't distilled, so yes, it is designed with CFG in mind.
In the case of Z-Image Turbo, which is distilled, you're fighting a losing battle by enabling CFG. Once the training has baked the base model's CFG into the distilled weights, it's actually quite detrimental (speed- and quality-wise) to turn it back on.
Sure if you don't care about either of those, and absolutely want to get rid of a random detail, go for it.
Tired of these distilled-model purists popping up everywhere cfg>1 is mentioned and going, "Uhhhh, ACTUALLY, you are not supposed to do it🤓." Yes, I know, and it doesn't matter if the image is better.
I got downvoted for saying negative prompts work fine in ZIT when it first came out even though I posted examples. Because "it's distilled, so it's not possible" decided the scientists on this sub.
I mean a large group of people on this sub seem to think previous prompts will influence later prompts and there's something more than just math happening in the models. 🤷
That can sometimes happen, but I think it has something to do with caching in some WebUIs.
With the former, that makes sense if they come from using ChatGPT because it absolutely does. It doesn’t here, but I can see the confusion.
The other part… ugh people who try to personify AI are so irritating
Wouldn't say "fine", as it often ignores them and gets polluted by previous generations. But they definitely kinda work lol
The resetksampler is quite useful with the model.
Lol, yes it's a bit like they are saying birds can't fly while standing at a beach watching them in the sky.
I don't think they're wrong with the technical aspects, but from the images we can clearly see it has an effect. Unless OP is faking it, you can remove stuff by putting some words in the negative.
Right or wrong, I see birds fly, and therefore I believe birds can fly. If I saw a flying car I would believe that too (after some investigating).
That's the problem with most people: they don't try it for themselves. Literally, the first couple of days after Z-Image came out, they already stated that negatives don't work, but I noticed one can go above 1 CFG. So I tried it and it worked. No one wanted to listen to me, so there's that, lol.
Nobody said negatives don't work. What we are saying is, if you turn CFG above 1, it will burn almost instantly.
It takes double the time, but it doesn't burn in my case. It actually helps with the very greyish images for me. I use low CFG values like 1.5-2.5.
It also takes double the time to generate if you include the negative with cfg > 1.
ZIT's already thrice as fast as Flux on my machine, so twice as slow is still faster.
My examples above prove that they do not necessarily burn almost instantly, especially if you change other settings to compensate.
They clearly work, and increasing the CFG scale along with using more steps can significantly improve the quality of the final image. Combining LoRAs also works very effectively, even applying negative strength to LoRAs, though it feels like we have to rediscover the same techniques over and over again.
Tell that to the people in the other post of mine that keep insisting I was doing generations "wrong" 😜
If by "wrong" you mean "out of spec" then yes. The problem was that YOU WERE DOING COMPARISONS while using parameters outside those indicated by the model creators.
You may try using the scheduled CFG node from kjnodes to avoid an overbaked image (and it's faster than running CFG > 1 on every step), or NAG is another option.
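The idea behind CFG scheduling, independent of any particular node's implementation, is to pay for guidance only on the early steps where composition is decided, then drop back to the single-pass turbo regime. A rough sketch of that logic (the names, the 2.0 scale, and the two-step cutoff are just illustrative, not kjnodes' actual API):

```python
def scheduled_cfg(step_index: int, high_scale: float = 2.0, guided_steps: int = 2) -> float:
    """CFG scale for a given sampling step: guided early, plain turbo after."""
    return high_scale if step_index < guided_steps else 1.0

# e.g. for a 12-step run: [2.0, 2.0, 1.0, 1.0, ...]
schedule = [scheduled_cfg(i) for i in range(12)]
```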
This is very interesting. In your conclusion, do you think 2.5 is the lower limit for reflecting negative prompts?
Great question! I do not think it's the lower limit. Based on a variety of tests, I think that 1.1 is (as you might expect) the ultimate lower limit. However, the more negatives you want to include, and the more closely the thing you want to remove is associated with the subject of your image, the higher you will need to crank the CFG.
At some point, though, negative prompting will not work at all. For example, Z-Image believes very strongly that dogs should have collars at all times, so if you try to negative-prompt away the collar it is very difficult, even with high CFG.
I haven't used CFG lower than 2, ever. It increases the contrast, which is something I like.
The use of negatives to remove objects in the scene sounds very useful.
Anytime a “distill brigade” member tells you you're doing it wrong by going past one, ask them since when has any creative tool had only one way to use it. You don’t criticise a painter for using a particular brush stroke by telling them their faces will be less accurate, because those outside the creative process for a given piece are not privy to the creator's intention and should, to be honest, stfu. As long as people know what the “defaults” are, let them explore the edges, where creativity and not conformity is found.
Ah, but you see, I was saying nice things about Flux 2 and pointing out that there are at least some subjects where it has better model knowledge than Z-Image, so naturally the reason Z-Image doesn't know Jabba the Hutt or what a hood hair dryer looks like must be that I was simply using the wrong generation settings or prompts. 😛
I used automatic CFG warp drive and CFG norm, then I could raise CFG without burning and have negative prompts. Unfortunately it slowed down the gens way too much for my daily use.
Try just applying it to the first 2 steps.
Heh.. that's a good idea.
I'm using CFG most of the time tbh.
Are you entering 'Positive' and 'Negative' in the same node or separate nodes?
Separate. You can follow the link to download the PNGs with embedded workflows.
Great. Thanks.
Lmao those are not 38-year-old men... those are like 54-year-olds.
I mean, I'm told that Z-image can do no wrong, so I guess 38 is the new 54.
This guy again. We get it, you love flux, go marry it.
Nobody said negatives don't work. What we are saying is, if you turn CFG above 1, it will burn almost instantly. So don't use it! The negative prompt should not be used because of this.
Examples above are not burnt
"The champagne is buhrned..."
Yeah, but it's not true at all. CFG around 2 doesn't usually result in burned images, with or without a negative prompt; I've seen workflows that split generation into multiple phases and use CFG up to 4 for parts of the process, and they do very well.
The Z-Image-Turbo paper says the model uses CFG
That's a reference to Z-Image Base ("our standard SFT model"), which uses 100 NFEs for generation in their preferred configuration (50 steps, since CFG doubles the NFEs per step). For Z-Image Turbo they state 9 NFEs (9 steps without CFG), but you can obviously set more steps and use CFG, and CFG around 2 does seem to benefit some generations, IME.
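To make the arithmetic explicit (the 2.0 CFG value below is just an arbitrary example of "greater than 1", not a number from the paper):

```python
def nfe(steps: int, cfg: float) -> int:
    """Number of model forward passes (NFEs) per image.

    With CFG > 1 each step needs a conditional and an unconditional pass.
    """
    return steps * (2 if cfg > 1.0 else 1)

print(nfe(50, cfg=2.0))  # 100 NFEs, matching the Base model's 50-step CFG config
print(nfe(9, cfg=1.0))   # 9 NFEs, matching Turbo's default 9 steps without CFG
```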
Can you read? They are talking about the BASE model not the TURBO.
Jesus fucking christ.