This blog post (which you can watch me talk through on YouTube) rambles through a bunch of related topics and starts with a very shallow dip into neural networks and the breakthroughs in their use.
Here is an image made by Midjourney (a neural network) of a neural network:
(a self portrait?)
I honestly do not get very technical in this next section, but if you are allergic to such content, scroll down and look for the "So ... back to ART!" heading.
***
I spent 20 or so years as a research scientist, looking into topics which all sheltered under the umbrella layman's term Artificial Intelligence. I did brush up against neural networks occasionally, and coded a couple, but it was not a serious involvement. I am not a neural network expert - take what I say with a pinch of salt.
A neural network is a bit like a hammer: a very useful tool for many jobs, but often not the best tool. When you have some understanding of the problem you're working on, and that understanding can be represented mathematically, then there are very often far better ways to make efficient use of that understanding than to use a neural network. And that's what I spent my time on.
However, neural networks are a powerful and very adaptable approach. One reason many scientists have sidestepped them in the past is that often they will work - but we don't know HOW they are doing what they are doing, and so we don't learn anything.
In the late 1800s scientists started to talk about the function of our brains in terms of the interconnected network of neurons.
In 1943 the first computational models of such networks were made (though with no hardware to run such algorithms on, other than paper and pencil).
In 1954 a neural network was first implemented on a computer (a very basic one).
In the 1980s interest (and computer power) surged and neural networks started to be spoken about more generally in scientific circles.
In 1992 I coded my first one.
Neural networks are very simple things. You have an input layer, an output layer, and at least one hidden layer in between. The input is a number at each node - the output is a number at each node. The numbers could be anything - they could be the pixel values of an image. You could put in one image and get out another. Or you could have just two outputs and get your image turned into two values - perhaps a yes or a no.
You can have extra "hidden" layers:
But when I took my first job as a scientist (for the Royal Radar And Signals Engineers), I learned that some clever mathematician had shown that every neural network with more than one hidden layer could be replaced by a network with just one hidden layer. So what was the point of all that extra work? None, was the consensus, and we worked with just a single hidden layer.
Those lines joining everything to everything else - you can imagine them as pipes, the inputs as water. The only thing that changes in the network is the width of those pipes - or more technically: the weighting on those connections.
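For anyone who wants to see the pipes as code, here is a minimal sketch in Python of numbers flowing through a network with one hidden layer. The layer sizes, the weight values and the sigmoid "squashing" function are all assumptions picked for illustration, not taken from any real network.

```python
import math

def sigmoid(x):
    # Squash any number into the range 0..1 (an assumed choice of activation)
    return 1.0 / (1.0 + math.exp(-x))

def forward(inputs, hidden_weights, output_weights):
    """Push the input numbers through one hidden layer to the outputs.

    Each row of weights belongs to one node; each weight in the row is
    the 'width of the pipe' from one node in the previous layer.
    """
    hidden = [sigmoid(sum(w * x for w, x in zip(row, inputs)))
              for row in hidden_weights]
    return [sigmoid(sum(w * h for w, h in zip(row, hidden)))
            for row in output_weights]

# Hypothetical network: 3 inputs -> 2 hidden nodes -> 2 outputs
hidden_weights = [[0.5, -0.2, 0.1],
                  [-0.3, 0.8, 0.4]]
output_weights = [[1.0, -1.0],
                  [-1.0, 1.0]]

outputs = forward([0.9, 0.1, 0.4], hidden_weights, output_weights)
print(outputs)  # two numbers between 0 and 1, e.g. a "yes" score and a "no" score
```

Nothing in `forward` ever changes except the numbers stored in `hidden_weights` and `output_weights`, which is the whole point: the structure is fixed and only the pipe widths vary.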
We train the network (supervised training) by giving it inputs for which we know the desired output - we might feed in many pictures of dogs and cats and have two outputs. We want the "dog" output when there's a dog in the picture. The weights on the connections (bore of the pipes) are adjusted in an attempt to get the correct answer as often as possible.
And that's it. Then, we hope that when the network is given a new picture, of a cat or a dog, one it has never 'seen', it will give the right answer. And if we have been careful about the data we trained it with and the way in which we adjusted those weights ... it will!
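As a rough sketch of what "adjusting the weights" can look like, here is a tiny Python example that trains a one-hidden-layer network by gradient descent (backpropagation). Everything specific in it is an assumption for illustration: the two made-up features per animal (think "snout length" and "size" scaled to 0..1), the single output (1 for dog, 0 for cat) standing in for the two outputs described above, the learning rate and the number of passes over the data. Real image classifiers are vastly bigger, but the idea is the same.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def predict(x, w_hidden, w_out):
    """Return (hidden activations, output) for one example.
    The last weight in every row is a bias, fed by a constant 1.0."""
    xb = x + [1.0]
    hidden = [sigmoid(sum(w * v for w, v in zip(row, xb))) for row in w_hidden]
    out = sigmoid(sum(w * v for w, v in zip(w_out, hidden + [1.0])))
    return hidden, out

def mean_squared_error(data, w_hidden, w_out):
    return sum((predict(x, w_hidden, w_out)[1] - y) ** 2 for x, y in data) / len(data)

def train(data, n_hidden=3, rate=0.5, epochs=2000, seed=0):
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    n_in = len(data[0][0])
    w_hidden = [[rng.uniform(-1, 1) for _ in range(n_in + 1)] for _ in range(n_hidden)]
    w_out = [rng.uniform(-1, 1) for _ in range(n_hidden + 1)]
    for _ in range(epochs):
        for x, y in data:
            hidden, out = predict(x, w_hidden, w_out)
            # How wrong was the output, scaled by the slope of the sigmoid
            d_out = (out - y) * out * (1 - out)
            # Each hidden node's share of the blame (uses the old output weights)
            d_hidden = [d_out * w_out[j] * hidden[j] * (1 - hidden[j])
                        for j in range(n_hidden)]
            # Nudge every weight slightly in the direction that reduces the error
            for j, h in enumerate(hidden + [1.0]):
                w_out[j] -= rate * d_out * h
            for j in range(n_hidden):
                for i, v in enumerate(x + [1.0]):
                    w_hidden[j][i] -= rate * d_hidden[j] * v
    return w_hidden, w_out

# Hypothetical training data: ([feature 1, feature 2], label), 1 = dog, 0 = cat
data = [([0.9, 0.8], 1), ([0.8, 0.9], 1), ([0.7, 0.9], 1), ([0.9, 0.6], 1),
        ([0.1, 0.2], 0), ([0.2, 0.1], 0), ([0.3, 0.2], 0), ([0.1, 0.4], 0)]

w_hidden, w_out = train(data)

# Two examples the network has never 'seen'
dog_score = predict([0.85, 0.75], w_hidden, w_out)[1]
cat_score = predict([0.15, 0.25], w_hidden, w_out)[1]
print(dog_score, cat_score)
```

Note that the trained `w_hidden` and `w_out` are just lists of numbers. Staring at them tells you very little about what separates a dog from a cat, even in a toy this small.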
Understanding the method by which it has done the job is a task that lies somewhere between difficult and impossible. Although we have a dog/cat detector, we haven't really learned anything about HOW to tell dogs from cats.
Anyway - for the next twenty years or so it felt to me (in my small corner of science) that this was the state of play. Computers got more powerful, data became easier to gather and handle, neural networks slowly got better at doing stuff. But they weren't GREAT, they didn't solve every problem.
Two (to me) unexpected things happened.
1: Layers - these days neural networks are like ogres who are like onions who have layers.
Something called DEEP LEARNING came along. That clever mathematician who said you only needed one hidden layer wasn't wrong - but another clever scientist (or scientists) pointed out that when you added in those unnecessary extra hidden layers your neural network suddenly became much easier to train.
Neural networks started to learn from data much more efficiently.
2: Scale
Scientists generally like to be subtle, efficient, and clever. The way to solve a problem is not to throw more and more resources at it. Fermat's Last Theorem was not proved by checking if it held for every possible set of numbers.
However, as computing power got better and better, someone/s noticed that if you just went and built a fuck-off huge neural network, say 1000x bigger than sensible - it didn't just do the same thing faster ... surprisingly it did whole new things, it used that space to solve problems it couldn't solve before.
This was a very surprising result. Maybe not to non-scientists, but trust me, it was unexpected.
Many of the advances in performance in speech recognition, translation, and (probably) the recent art AIs are in large part a product of just making these networks BIG.
The drive to bigness is starting to move the process out of digital computers and into the analogue world of more traditional electronics. It's a brave new world out there, and don't be surprised if someone builds an electronic brain in your lifetime - not a computer but a piece of kit whose workings are obscure but whose results are remarkable.
So ... back to ART!
Recently some AI art applications have been producing remarkable results. These are neural networks that have been trained on vast databases of imagery. These images will have been primarily taken from the internet. The training will have been complicated and clever, but probably boils down to having prompts like "a dragon" "a cat in a hat" "flying dog" etc as the inputs, and having the internal weightings adjusted to make the output 'better' - a judgement that probably required a human to do an awful lot of judging.
This training is still going on, and if you use these AIs you may well be part of it. Midjourney gives 4 images for each prompt and offers the option to add detail or produce variations of any of them. When you choose to 'upscale' one of your four images, or produce variants of one of them, you are telling the AI that that image (number 3, say) was the 'best' answer. And it learns from that.
There is some controversy around this training (a lot of it understandably from artists) as the neural networks will have learned how to do what they do by using the work of artists online. Midjourney can be explicitly asked to copy the style of an artist. It also knows what famous people look like since it has learned their names and their images from the web.
Witness: I typed /imagine "Donald Trump on fire in the style of Renoir" into Midjourney and got this:
Obviously, when framed in a "training your replacement" sense, it would be pretty galling to have your art used by your competitor. I would not appreciate an AI reading all my books then churning out new ones "in the style of Mark Lawrence".
It's a complex question. We all fear change on some level. Change doesn't care and stomps all over us. Ned Ludd inspired the Luddites, who set about breaking the weaving machines that were stealing their livelihoods. Artists were shocked by photography - it seemed to render many of their hard-won skills pointless.
Real artists learned their trade by studying the work of others, by being taught by artists.
Some might say that the training of an AI is the same thing. It's looking and learning.
But when the AI learns some part of your signature along with your style and slaps down a vague approximation next to its next creation ... how to feel?
When I typed /imagine "fat dragon wearing a hat" into Midjourney I got this:
Look more closely:
Is that an actual Japanese signature - or does Midjourney just "know" that Japanese style dragons come with a signature and this is its own unique signature?
#########################################
(update - 12/11/22: the above fat dragon wearing a hat was Version 3. Here is the same prompt for Version 4:)
################################################
I asked /imagine "dragon blueprint"
And got this:
The background writing is gibberish. Maybe the signature was too...
What I can say is that the floodgates have been opened and there's no holding this tide back. These types of application will develop rapidly, becoming more responsive, more sophisticated. Perhaps images will animate in real time, following verbal commands. Perhaps our houses will be wallpapered with screens that can show images created in real time to all manner of cues.
Who knows, we're living in the future, baby. We didn't get jet cars, we got this stuff.
(yup - more AI art I 'created' in seconds)
(and yes, there's a "signature", and if it is recognisably some artist's jet car and some artist's signature then that would adjust my attitude here - but I think (could be wrong) the AI just knows that stuff in a 'painted' style needs a signature and slaps down a pretend one)
So, is Midjourney copying?
In pursuit of this issue I gave Midjourney the prompt /imagine Hopeless Maine, "Tom Brown".
Tom is an artist friend of mine that I've known for a decade and whose work I'm a fan of. He has a VERY distinctive style, involving a dark palette and an abundance of intruding tentacles and floating eyeballs. Check out my review of the first Hopeless, Maine graphic novel.
I was amazed (as a non artist) by how closely Midjourney mimicked Tom's style. I could wholly believe that the images produced came directly from his published work:
For the record, images 2, 3 & 5 are Tom's work. Tom himself was impressed and surprised. His artist's eye could see significant technical differences, but he gave the Midjourney images A to A+ for quality, and said that he could see no signs of stealing - the images are wholly original, just using a similar colour palette and style.
I've canvassed Tom for a broader reaction and will add his thoughts here soon.
***
Am I going to populate my blog with AI art? Absolutely, it makes it better and more interesting. Am I paying $10 a month for Midjourney? I am!
Are artists losing out? They are not. I would never, and have never, gone out to ask an artist to produce art for my blog (except in a general sense when running art contests). It would be too expensive and too slow. So in this particular instance - who loses?
More broadly, yes, I feel bad for the upheaval artists will endure. Will we start to see covers drawn by AI? ...I'm certain of it. I'm not saying it's something I wanted to happen, but publishers always want to save money. If they can get away with stock art instead of paying an artist for something bespoke - they'll do that. If they can save the artist's fee with AI, they'll do that. Water runs downhill.
I feel bad for the artists. I'm sure there will always be a place for them, but there will be change, recalibration will be needed. If/when the AI comes for authors ... well, I think to write a book they would basically have to be intelligent, so I feel I'm safe. I'll die before they arrive. But, to be honest, I would not have said that AI art would come this far any time soon. So what do I know?