How Good is AI Image Generation?




In this post, I’m veering slightly off-topic to explore how effective (or ineffective) AI Image Generation can be. Away from linguistic gymnastics and the anagrams of Thirimuntakoba, I’m treading closer to the Vandaarkuzhali that AI imagines.

While working on a specific post for this blog, I experimented with various AI image generators, tweaking prompts to achieve the right visual representation. Below are some intriguing results.

AI Image Generation of an Indian Auto Driver

I set out to narrate the story of an auto driver—tired, red-eyed from sleepless nights, with disheveled hair and stubble. In my mind, he wore a blue shirt with a loose khaki uniform jacket tossed over it to avoid police scrutiny.

My first attempt was with imagine.art, but the results were disappointing. The auto driver, inexplicably, kept holding a steering wheel, and no matter how much I adjusted the prompts with terms like “auto” and “tuktuk,” it wouldn’t move past this image.

Next, I tried openart.ai.

Here’s the prompt I used, which I modified progressively to achieve more accurate results:

An autorickshaw driver from India with the uniform khaki shirt loosely worn over his blue shirt, eyes red due to lack of sleep, hassled hair and stubbles.

The initial images were far from ideal. The AI generated a central hair parting and a kumkum/kungumapottu on the driver, which didn’t fit the image in my mind. Additionally, the phrase “khaki shirt loosely worn over the blue shirt” was interpreted as khaki patches or dirt smudges on a khaki shirt. The third image of this series, where the inside of the khaki shirt was oddly painted blue, was particularly jarring.

Let me make AI understand!

Since the generated images had a distinct North Indian vibe, I decided to tweak the prompt to specify “from South India.” Surely, this would fix the center-parted hair!

Thankfully, the parting did shift to the sides, but the kumkum/kungumapottu remained. It made me wonder if, to AI, all Indians are born with that mark on their foreheads. Listen up, fellow Indians—you apparently come with it as a manufacturing defect!

Sigh 😮‍💨
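
For anyone who likes to keep such tweaks tidy, here is a minimal Python sketch of the change above: the same prompt with only the region swapped, so it is clear which detail changed between attempts. The function name `build_prompt` and its `region` parameter are made up for illustration and are not part of openart.ai or any other tool. The prompt text is the one quoted earlier.

```python
def build_prompt(region: str = "India") -> str:
    """Assemble the auto-driver prompt, varying only the region (hypothetical helper)."""
    return (
        f"An autorickshaw driver from {region} with the uniform khaki shirt "
        "loosely worn over his blue shirt, eyes red due to lack of sleep, "
        "hassled hair and stubbles."
    )


# First attempt: the original wording.
first_prompt = build_prompt()

# Second attempt: identical except for the region.
second_prompt = build_prompt("South India")

print(first_prompt)
print(second_prompt)
```

Changing one detail at a time this way makes it easier to tell which tweak moved the hair parting and which one the AI simply ignored.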

Why not use another model?

Besides not conforming to my prompt, the images looked so realistic that I thought it might be better to choose another model: DynaVision XL.

That Was Scary!

While I appreciated the disheveled look, the auto driver’s expression became disturbingly intense. His eyes were blazing embers—quite literally. In image D6, the AI ignored the blue shirt entirely, and what was that wiry thing in place of the kumkum? The “stubble” had somehow turned into a short beard.

Back to the OpenArt SDXL

Frightened by these results, I quickly abandoned the model and switched back to the OpenArt SDXL model. Here are two of the last three images generated.

The kumkum/kungumapottu stereotype persisted. I wasn’t thrilled with D8: the face looked more cunning than anxious, the “stubble” was more of a short but groomed beard, and though the eyes were slightly red, the blue shirt was still missing. D9 had better stubble and messy hair, but the bead-like eyes made the driver resemble a toy.

Final Choice

As I went through this process of AI Image Generation, it felt like playing a game of Whac-A-Mole. I was mildly frustrated by the clichés and disappointed that the AI ignored key details from my prompts.

Have you tried AI image generation? What’s your experience been like? For me, it was a mix of fascination and irritation—especially when it skipped over the blue shirt or the anxiety I wanted in the driver’s expression. Share your thoughts!


Discover more from Menaka's Blog: Words and Worlds

Subscribe to get the latest posts sent to your email.


One response to “How Good is AI Image Generation?”

  1. JulianSmith

    Your exploration of AI image generation highlights both its potential and its quirks, especially with the persistent kumkum stereotype and the elusive blue shirt. It’s a fascinating yet frustrating journey, much like teaching a stubborn artist who insists on their own vision!
