
Working with AI is like supervising a maniac

At Polemos, we use generative AI every day to help create content. Although the shock has worn off, I still find it breathtakingly good. It is also like supervising a maniac.

It began with image generation. I love illustrations, and have worked with newspaper cartoonists, magazine illustrators and graphic designers to make one-off pieces like covers and special reports. In the past this was an expensive exercise, because artists can really only create one or two illustrations a day.

AI image generators have completely changed the equation. Now it is possible to illustrate every article we write with one or two original artworks. Of the generators, I prefer MidJourney: the images it creates are often interesting and beautiful.

Key Characters
An example of Polemos pixel art

But don’t be fooled by what we publish: the images you see are a small fraction of what has been generated. Editorial judgment is more important than ever when dealing with AI.

In MidJourney’s case, one of the things it can’t do reliably is make the likeness of a real person. The technique is to ingest a real photo of a person, then ask MJ to output an image in the style you want: in our case, we use the term “pixel art” to generate a kind of 8-bit or faux 8-bit game style. This means the images are heavily pixelated and use flat color, which helps disguise the fact that MJ is terrible at portraits.

It is demographically uneven in this failing: for example, if you keep trying with young white men, it will eventually give you something close to the original, or close enough. With South-East Asian men, it seems to think any Asian man will do. If there is a wrinkle on anyone’s face, watch out: it will turn out wizened old folk. You can spend hours generating image after image and not getting anything usable.

Other image generators I have pressed into service are distinctly edgier. An AI service called Runway turned out the image at the top of this page for my Key Characters interview this week. The input was a perfectly normal picture of a woman and the prompt was “16-bit game”.

I have turned to Runway because it can generate video, but the video it does generate is unreliable, veering from abstract, to static, to nightmarish. I have incorporated one “perpetual motion machine” into a video we are making, but it’s not cost effective at the moment. Attempting to commission “robot typing on a computer”, I instead received lumps of metal nodding to each other. Sometimes, I get videos with “PREVIEW” watermarked across them (we have a paid membership). Perhaps the machine just ingested so much stock footage it thinks every video should have preview stamped on it.

Note the “PREVIEW”

We had also been experimenting with other services that allow you to create speaking avatars. Polemos designer Danny Goldstein wanted to create a talking frog in a tuxedo for one particularly surreal video we are making (about the Reddit strike). The only problem was that when the video rendered, the shirt started talking, not the frog.

The frog’s shirt talks

The holy grail here would be a stock service that generated high-quality video on demand from a text prompt. That would usher in a whole new content era based around video, I believe. At the moment it’s still extremely time-consuming to make video that is anything but humans talking on camera. Being able to generate good video from text would change that.

Don’t believe anyone who tells you that era is here already. There are some incredible videos on YouTube of fake Wes Anderson movie trailers, for example, all generated by AI. I think you might be surprised how much skill and manual labour is involved.

Then we come to language. For my purposes, ChatGPT and other text generators are of much more limited use than image generators. ChatGPT breaks the two commandments of factual content creation (previously called “journalism”): “tell the truth”, and “be interesting”. I’m not sure which failing is more problematic.

GPT-4 lies less than its predecessors, but in my opinion is more boring. Possibly the two variables are inversely correlated. In any case, a staid contributor who makes big mistakes, with no acknowledgement of uncertainty, is useless to me. I did use GPT-4 to draft our Polemos Editorial policy. It gave me something to rewrite.

The really difficult thing for the generators to pull off will be to be the right kind of interesting: not insane, not wrong, but different enough to be worth paying attention to. At the moment, that requires a lot of human intervention.

The power of nostalgia

Sarah West is a departure from my usual Key Characters, because her game The Nemots is distinctly low-budget. I told you about Sarah last week, but this week the podcast is here, and I encourage you to listen. Topics covered:

  • Why Sarah works for the Army as well as The Nemots
  • The need for more women in blockchain gaming
  • How the web3 sector is poorly served by its own language
  • Why 1980s nostalgia works even for people born after 1989

You can sign up on your podcast app (do a search for “Key Characters”) or listen online here.

We got hacked, and so did Frank

It was Polemos’ turn to be hacked this week, with a breach of Discord security that saw a fake CEO account created, a fake Polemos website set up, and some Polemos community members fall victim to scoundrels. Anyone who connected their crypto wallet to the fake site had their assets drained. Fortunately, some quick-thinking Polemos staff members shut everything down before too much harm was done. Our story here.

God game Apeiron, run by Frank Cheng, also had their Discord hacked this week. Sometimes this industry reminds me of an oil rig I once visited in north-west Western Australia: there, the moment anything hit the water, sharks would bite it.

This is the web version of our weekly newsletter. Subscribe for free here.

Hal Crawford

Hal Crawford is an experienced journalist and newsroom manager, and the head of content at Polemos.