Among the things that pleasantly surprised me about Z-Image is how well it understands emotions and turns them into facial expressions. It’s not perfect (it doesn’t know all of them), but it handles a wider range of emotions than I expected—maybe because there’s no censorship in the dataset or training process.

I decided to run a test with 30 different feelings to see how it performed, and I really liked the results. Here’s what came out of it. I used 9 steps, euler/simple (sampler/scheduler), 1024x1024, and the prompt was:

Portrait of a middle-aged man with a <FEELING> expression on his face.

At the bottom of the image there is black text on a white background: “<FEELING>”

visible skin texture and micro-details, pronounced pore detail, minimal light diffusion, compact camera flash aesthetic, late 2000s to early 2010s digital photo style, cool-to-neutral white balance, moderate digital noise in shadow areas, flat background separation, no cinematic grading, raw unfiltered realism, documentary snapshot look, true-to-life color but with flash-driven saturation, unsoftened texture.

Where, of course, <FEELING> was replaced by each emotion.
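
For anyone who wants to repeat the test, here is a minimal sketch of how the substitution could be scripted. It only builds the prompt strings; sending them to Z-Image is left out, since the post doesn't say how the renders were queued. The names `TEMPLATE`, `FEELINGS` and `build_prompts` are made up for illustration, and the feelings listed are just a few of the ones mentioned in this thread, not the full set of 30.

```python
# Prompt template copied from the post; <FEELING> is the placeholder to fill in.
TEMPLATE = "\n\n".join([
    "Portrait of a middle-aged man with a <FEELING> expression on his face.",
    "At the bottom of the image there is black text on a white background: "
    "\u201c<FEELING>\u201d",
    "visible skin texture and micro-details, pronounced pore detail, "
    "minimal light diffusion, compact camera flash aesthetic, "
    "late 2000s to early 2010s digital photo style, "
    "cool-to-neutral white balance, moderate digital noise in shadow areas, "
    "flat background separation, no cinematic grading, raw unfiltered realism, "
    "documentary snapshot look, "
    "true-to-life color but with flash-driven saturation, unsoftened texture.",
])

# Illustrative subset only: the full list of 30 feelings is not given in the post.
FEELINGS = ["menacing", "distracted", "irritated"]


def build_prompts(feelings, template=TEMPLATE):
    """Return a dict mapping each feeling to the template with <FEELING> replaced."""
    return {feeling: template.replace("<FEELING>", feeling) for feeling in feelings}


if __name__ == "__main__":
    for feeling, prompt in build_prompts(FEELINGS).items():
        print(f"--- {feeling} ---")
        print(prompt)
        print()
```

Each resulting string can then be pasted (or fed by whatever batch tool you use) into the positive prompt, one render per feeling.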

PS: This same test also exposed one of Z-Image’s biggest weaknesses: the lack of variation (faces, composition, etc.) when the same prompt is repeated. Aside from a couple of outliers, it almost looks like I used a LoRA to keep the same person across every render.

69 points yobo9193

https://preview.redd.it/vwfw98c42a6g1.jpeg?width=1043&format=pjpg&auto=webp&s=544dc9b931591c8e1988dcb0249adc7e75512aef

15 points oromis95

mugshot lol

4 points laplanteroller

the mug is full

2 points Big0bjective

lmfao I knew Dr. Aroused has criminal ties but not like this

3 points -Ellary-

So this is how half of the sub looks, hmm.

2 points target

LOLOLOL

21 points gabrielxdesign

Well, my "aroused" is definitely not like that, lol

10 points lazyspock

https://preview.redd.it/lr7ot6pg2a6g1.png?width=275&format=png&auto=webp&s=b6a76589e1d1e2949288394a79ec3b72955d7c37

2 points gabrielxdesign

22 points hidden2u

anti west bias in “menacing” lol

12 points elbowedelbow

'Menacing' ethnicity straight up changed lol

10 points MathematicianOdd615

How about the NSFW face expressions? 😉

19 points vault_nsfw

https://preview.redd.it/px74jr1ax96g1.png?width=1079&format=png&auto=webp&s=35cd49ec8b916c1a5aa8f73a3d3f2915ec2c67f1

2 points target

I am sure there is a lora for that already .. dripping off the tongue

7 points aStoryInPictures

lmao love that the distracted guy is the only one not facing the camera

1 points lazyspock

Exactly! He was so distracted that he missed the click! The aroused one is also funny, he is somewhere between "this woman is nice" and the "O face" from the "Office Space" movie.

3 points dariusredraven

I'll split the difference between the SFW and the NSFW. Try sultry or flirty.

2 points comfyui_user_999

Shouldn't the fun guy have a cap?

2 points Etsu_Riot

What I find most surprising about this is that I keep seeing how people still think one of this model's best features is actually its weakness.

6 points lazyspock

This depends on what you want to do. I know that if you give a detailed description of the composition, scene, etc., in the prompt, it will do what you ask for with remarkable precision (which solves the lack of variation in compositions). But the face is not that easy. I've tried random names (mostly no effect), nationalities (they work, but each nationality gets an almost identical face between renders), and detailing the facial features (somewhat works, but not for face shape, etc.)... The only real solution is a LoRA, but then the LoRA bleeds into all faces in the render.

I'm absolutely LOVING the model, don't get me wrong, but this can be a feature or a weakness; it depends heavily on what you want to do with the model.

4 points Etsu_Riot

I have gotten great variation in the faces by prompt alone. You don't need LoRAs at all. Maybe there is a limit on how much variation you can get, but so far I haven't found it. Remember that real humans are not as varied either. We are made of archetypes.

1 points ageofllms

Would a bit more context help, seeing how this model likes detailed prompts? Instead of just 'surprised' you could say 'surprised as he's found out his bank account is empty' :D or 'terrified as he witnesses a giant monster ripping someone's head off'. Hehe. Some people think you shouldn't mention things that aren't visible, but I think it's often very helpful to provide emotional context.

1 points hugo-the-second

love your analysis, couldn't agree more

2 points Saucermote

Good thing that the LLM it uses can figure out most of our spelling mistakes. "Irritatd" is up there. Although I think it is basically a higher definition version of angry.

3 points lazyspock

In fact I wrote it correctly (IRRITATED), but I tried twice and Z-Image misspelled it both times (the other misspelling was way worse), so I gave up. 😂

5 points Melodic_Possible_582

menacing turns into a white guy. LOL

4 points theholewizard

Kinda irrelevant to the whole Z image thing but I find it interesting that 100 years ago that guy wouldn't have been considered white. Whiteness is a political project. Italians / Mediterranean people were only allowed in the whiteness club when it became useful for the Anglo Saxons in the US. (I'm not Italian, not trying to get oppression points)

3 points TopTippityTop

Turns out a menacing asian is a white man.

1 points kaelvinlau

Gonna generate a serious + determined + blank stare and see what results it's going to give me

2 points lazyspock

I've tried some combinations. Most of them gave me nothing different from one of the feelings. Some of them (for example "sad smile") worked as intended.

1 points kaelvinlau

Haha yeah, that's expected. Just joking around to see if the same facial expression will somehow generate something entirely different 😂

1 points Dr_Lurky_Lurkerson

You forgot embarrassed.

2 points ANR2ME

That menacing person doesn't look Asian while the rest of them do 😆

1 points Atomsk73

Some don't work and become "neutral". You could also try "amused", "exhausted", "sour", "disdain", "smug", etc.
