One Foot in Tomorrowland

William Gibson’s quote, “The future is already here – it’s just not evenly distributed,” rang true in my head as I looked back at how I spent this week. Working mostly solo to meet an ambitious deadline, I found myself in a full-on collab with a team of AIs.

I was working on a future-vision video centered around cooking. I pitched the idea to the client without thinking twice about my absolute lack of culinary skills. Welp.

I started my day by describing the meal I was hoping to see to a large language model (LLM)¹. Together we worked to refine what it should contain. Once ready, I had the LLM construct a step-by-step recipe as well as a shopping list for the ingredients. Both documents were prepared in two versions – one using metric and one using imperial measurement units.

My project involved a human chef preparing the meal, so next I explained the project’s parameters and the chef’s character to the LLM and asked it to come up with a script for their narration. It was to provide not just the cooking instructions, but also throw in anecdotes and fill in time with jokes and fun facts about ingredients. The AI did me one better, producing a full-on teleplay, incorporating camera takes and shot descriptions alongside the narration.

Next I needed a good logo for the show – one that included delicious-looking food, as well as typography. When it comes to rendering images featuring text, DALL-E 3 is the obvious choice, so I settled on using Microsoft’s new AI-powered Bing search², which lets you query DALL-E 3 for images³.

DALL-E 3 had no trouble with either the imagery or the hand-lettered typography, but my back was getting stiff from sitting in front of the screen, so I decided to go to the gym.

I fired up Beat Saber followed by some boxing on my Valve Index, and 60 minutes later I was ready for a (non-virtual) shower.

For voiceover, what do you know, AI⁴. I designed a new voice actor with a slight Mediterranean accent and had them read out the lines. With a few tweaks and creative use of punctuation, I got it to emphasize the parts I wanted and maintain a fun, energetic rhythm.

Another break. This time I needed some screen-off time. I am lucky enough to live right outside Seattle, surrounded by gorgeous Pacific Northwest nature. No matter the weather, I need to get a minimum of one hour of sky above, trees and (if lucky) some woodland creatures. Autumn is in full swing and the green-to-red gradients everywhere are at their most vibrant these days.

Nice.

Back at work, I needed some imagery to build a few concept art pieces and a partial storyboard. I figured out a while back that I get the best results when I use AIs⁵ as element generators, but leave composition and detailing to the dedicated tools I’ve been using for nearly three decades now. I generated a folder full of elements (and supplemented them with some stock), then AI-upscaled most of the images⁶ to get some extra resolution.

Photoshop and After Effects received their own AI infusions this year. Masking out objects, extending images, removing and replacing elements with generative fill – all these tools have very quickly become incredible timesavers.

With the boards ready, I had to manually put them together in Figma (oh, the humanity!) and then added a few animated flavors in AE. Animation was mostly done by hand (for now), although GPT-4 was able to code some amazing expressions to automate many aspects of the project through smart rigs⁷.


While I have been using all these tools individually for a while now, this week was one of the first times I hopped from one to another in such quick succession over the course of building a single project.

Truth be told, sure it’s fun to live with one foot in Tomorrowland, but it’s nowhere near as fulfilling as working with human artists.

AI is sure great in a pinch – it will produce results that are good enough for a quick demo or a dreaded MVP⁸. Still, a human writer will be able to inject their own life’s perspective into a scenario. A voice actor will create a special persona just for the role, or add their own idiosyncrasies to the performance. An illustrator will work slower, but communicate better – and when they go off-script, it will be so much easier to understand why, how and when. A human rotoscope artist… nah, let the AI do the rotoscoping. Fuck rotoscoping.

My only worry is that with AI being great in a pinch, both smaller clients and large corporations will keep creating these pinches – applying pressure to work faster, more efficiently, even if it sacrifices both the quality of the product and the well-being of artists.


1 I worked with OpenAI’s GPT-4 for this step.

2 If you want to try it yourself, simply visit Bing AI Chat and ask it to “Create an image of a person holding a sign saying ‘text text’”, replacing the text with whatever you want. If the resulting images have readable text, it means you have the beta enabled. If the text looks like unreadable garbage, log out, log back in, or try another browser until it looks right. It may take a few tries, but once you’re in, you’re in.

3 My AI Day took place last Wednesday, and I did not yet have proper access to DALL-E 3 via OpenAI. Two days later, it was opened for me, so next time I would probably forgo Bing and go directly to the source.

4 ElevenLabs is my choice for high-quality voice synthesis these days.

5 I mainly used Midjourney for this leg of the project, though DALL-E 3 helped a bit with a watercolor illustration of a cup of water, which MJ kept filling with tea, juices, or ice.

6 Topaz Photo AI can double image resolution, recover lost details and even denoise and de-artifact older media. You can get a great 2x upscale with the single-click auto settings, or go even larger, tweaking the dials to get the result looking just right.

7 With a background in motion design, I am no stranger to wild rigs and crazy expressions. Whereas coding them would usually consume a day or two, this time the code was ready nearly instantaneously – as soon as I clearly explained my vision. Bugs were there, but I was able to debug the code simply by telling GPT-4 what errors I encountered.

8 Minimum Viable Product