The Struggles of Being Helped

I went a little overboard with the previous edition of what’s supposed to be a series of byte-sized posts. This time I’ll keep it shorter.

I am uncertain whether my honeymoon period is coming to an end and the rose-tinted glasses are coming off, or whether the quality of the most popular AI models fluctuates wildly over time. Either way, I have recently been “catching” AI fumbling more and more often.

The very same prompts* that worked a few months ago now lead LLMs to hallucinate facts. Text-to-image generation seems to be back to its old inability to count limbs and digits, or to shake off deeply embedded stylistic choices attached to certain keywords, despite explicit prompting.

Meanwhile, project managers and budget-conscious CEOs worldwide continue eliminating “redundant” positions in hopes of artificial intelligence replacing human employees.

May I suggest it’s not the time yet?

 

Just the other week I was working on a simple storyboard for a small project I’m participating in. Time and budget constraints did not allow me to hire a storyboard illustrator, and my own drawing style is way too cartoonish for the task. No problem – we can just use DALL-E 3 and Midjourney and do the job in no time at all. How silly.

If you ever ended up with a bad apprentice or an intern who’s full of good intentions but really does not belong in your discipline – first of all, sorry, I feel your pain – second, you feel my pain, too.

The AI models tend to be very stubborn – when their idiosyncrasies or straight-up mistakes get in the way, they listen to feedback, apologize, then do the same thing again.

As the days went by I noticed that I was spending less and less time being creative and prompting new things. Instead, the majority of my effort went into trying to trick the models into following my instructions, or into evaluating their work and spotting the mistakes.

There’s a simple trick AI does surprisingly well – and that is creating results that look extremely good at first glance. Because of that, it takes significant cognitive effort to evaluate AI’s output if you want to maintain any level of quality control.

A drawing of a woman drinking tea while looking at her phone sounds like a simple enough ask, yet half of the generations placed a cup of tea in her hand and then another one on the table in front of her. Others had her drink out of her phone. Others yet would give her extra limbs, have her grab the cup in impossible ways, and so on. In the final tally I generated over a hundred images before I arrived at an OK-looking one, and even then I had to compromise on the composition.

 

What’s the point of this rant?

I guess just a casual reminder that we are still in the very early days of AI. It’s dazzling, but it’s not perfect. Often it’s not even good enough.

In some aspects it reminds me of Stanisław Lem’s “Solaris” – a story of an imperfect god that ends up hurting people instead of helping them. Let’s hope we’ll stop short of worshiping tech that’s still vastly immature and unproven.


* I follow and double-check logs whenever possible, though some of these observations are anecdotal, based on my memory only.