
> DALL·E 2 struggles to generate realistic faces. According to some sources, this may have been a deliberate attempt to avoid generating deepfakes.

That might be true, but after experimenting with DALL·E 2 last week (and spending more than $15), I have a different theory.

My tests focused on how well it could create art works around three common themes: still life, landscape, and portrait. For the first two categories, almost all the results were works that would not have looked out of place in a museum or art gallery. In contrast, with the prompt of “A painting of a young woman sitting in a chair” and variations, while DALL·E 2 produced convincing clothing, furniture, background, etc., the faces were mostly horrible. I started adding “from the rear” and “turned to the side” to the prompt just to get the face out of the picture.

I came to suspect that DALL·E 2 is bad at faces not because the developers made it that way but because human beings are uniquely hardwired to recognize faces. Most people are able to recognize and remember hundreds of faces, and we are very sensitive to minor changes in their configurations (i.e., facial expressions). When we look at a painting of a person sitting in a chair, we don’t care if aspects of the chair, the person’s clothing, etc. are not precisely accurate; a slight distortion of the face, however, can ruin the entire work. DALL·E 2 does not seem to have been trained to have the same sensitivity to faces that humans have.

If anyone is interested, the works that DALL·E 2 created for me are at [1]; video slideshows with musical accompaniment are at [2].

[1] http://www.gally.net/temp/dalleimages/index.html

[2] https://www.youtube.com/playlist?list=PLj4urky_8icRPzgFS_b98...



Only small faces are distorted, and they are often heavily distorted. It's not an "uncanny valley" effect; they look like disfigured pieces of meat and skin. It's the same in dalle-mini.

DALL·E 2 can clearly generate super-realistic faces without any problem, as most of the posts at r/dalle2 show.

The issue with small faces might be architectural if there is context-aware upscaling going on in the network, where a face needs to start larger than some smallest scale or it won't survive that process. That in turn might be an issue of too little training. A small face in a photo in the training data won't generate as much error gradient if it goes wrong as a larger face, but as you suggest we as viewers are much more prone to scrutinize faces even though they are small.
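To put rough numbers on the error-gradient point, here is a minimal sketch. It assumes a plain summed per-pixel squared-error loss and a uniformly wrong square face region; neither is confirmed to match how DALL·E 2 is actually trained, and `face_loss_share` is a made-up helper name.

```python
def face_loss_share(image_side: int, face_side: int) -> float:
    """Share of a summed per-pixel squared error that falls on a square face.

    Assumption (illustrative only): every pixel is equally wrong, so the
    share is simply the face area divided by the image area.
    """
    if not 0 < face_side <= image_side:
        raise ValueError("face_side must be in (0, image_side]")
    per_pixel_error = 1.0  # same squared error at every pixel
    face_error = per_pixel_error * face_side ** 2
    total_error = per_pixel_error * image_side ** 2
    return face_error / total_error


# A 16x16 face versus a 128x128 face in a 256x256 image.
small = face_loss_share(256, 16)
large = face_loss_share(256, 128)
print(small, large, large / small)
```

Under these assumptions, a face that is 8 times smaller on each side contributes 64 times less to the loss, even though a viewer scrutinizes it just as closely.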


That's probably not the reason. Generating faces was one of the first things GANs were ever used for. They can make near perfect faces because the internet is flooded with images of faces, often high quality celebrity shots.

The reason it can't do faces well is very likely the filters being applied to try to stop people making pictures of real people. This is probably also the explanation for the random misses where it paints pictures of something that's not a llama. OpenAI is rewriting queries to make them more "diverse", i.e. acceptable to leftist ideology, and their rewriting logic seems to be completely broken. There have been many reports of people requesting something without any humans in it at all, and discovering black/asian/arab people cropping up in it. At least earlier versions of the filter involved simply stuffing words onto the end, as proven by people requesting "Person holding a sign that says " and getting back signs saying "black female" etc.

Man asks for a cowboy + a cat and gets a portrait of an Asian girl. Gwern comments with an explanation:

https://www.reddit.com/r/dalle2/comments/w7qvgl/comment/ihm6...

"tldr: it's the diversity stuff. Switch "cowboy" to "cowgirl", which would disable the diversity stuff because it's now explicitly asking for a 'girl', and OP's prompt works perfectly."
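For anyone trying to picture the mechanism being described, here is a toy sketch of suffix-style prompt rewriting. It is a guess based only on the behavior reported in this thread, not OpenAI's actual code; `rewrite_prompt`, `EXPLICIT_TERMS`, and the word list are all hypothetical.

```python
# Hypothetical list of words that count as "already explicit".
EXPLICIT_TERMS = {"girl", "woman", "man", "boy", "cowgirl"}


def rewrite_prompt(prompt: str, suffix: str) -> str:
    """Append `suffix` to the prompt unless it already contains an explicit term.

    Assumption: the rewrite is plain string concatenation, which is what
    the sign-text reports suggest.
    """
    words = {w.strip(".,!?\"'").lower() for w in prompt.split()}
    if words & EXPLICIT_TERMS:
        return prompt  # explicit request: leave the prompt untouched
    return f"{prompt.rstrip()} {suffix}"


# The appended words land where the sign text would go.
leaked = rewrite_prompt("Person holding a sign that says ", "black female")
# An explicit term skips the rewrite entirely.
untouched = rewrite_prompt("a cowgirl and a cat", "black female")
print(leaked)
print(untouched)
```

If the rewrite really is concatenation like this, it would explain both the sign trick and why swapping "cowboy" for "cowgirl" changes the result.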

Big discussion thread where people discuss the problem and (of course) the censorship that tries to hide what's happening:

https://www.reddit.com/r/dalle2/comments/w944fa/there_is_evi...

"I once tried some food photography and received a cheese with a guys face for no reason."

"This has been mentioned on this sub multiple times, but those threads have consistently been removed by the mods - as will this one."

"There was a thread about that prompt and, yes, the person did get diverse [sumo wrestlers]"

"Been doing women images and seeing the article decided to try narrowing the results to "caucasian woman". Still gave me diversity. Whether you want it, or not, you're getting diversity"



