So these collages don’t really exist. I generated them over a few days using Artificial Intelligence (AI), and there are thousands of them! There has been a lot of concern recently about this technology taking over, so I wanted to share my experience with it. It definitely had an effect, but not in the way you might think!
I started making daily collages back in 2018 as a way of injecting some intuition and routine into my art practice. I had taken up painting seriously more than ten years earlier, making oil paintings of various landscapes and subjects, but I was feeling stifled by my use of reference photos and frustrated that I always had a sense of how the image I was working on would turn out. I wanted to explore a looser approach to making art where the results weren’t so pre-determined, and the collages became a way to explore colour relationships, composition and mark-making without the pressure of making sellable work. By 2021 I had amassed thousands of these collages and was posting one a day to social media.
Around that same time, there was a lot of buzz about recent developments in computing, with the public introduction of text-to-image generators (like DALL-E in 2021 and Midjourney in 2022) and large language models (like ChatGPT in 2022). One configuration of these neural networks is known as a GAN (Generative Adversarial Network). This is a method of machine learning where two neural networks compete with one another to produce fake images based on real images. A GENERATOR program trains to fool a DISCRIMINATOR program, and the result is an unlimited number of unique new fake images derived from an original dataset of real images. What was interesting to me is that with a GAN you can restrict the training data to a single source of images, unlike the text-prompt systems, whose models are trained on previously existing images scraped from across the internet, made by countless artists, and recombined in new ways.
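For anyone curious about what that tug-of-war looks like in code, here is a heavily simplified sketch in PyTorch. It is illustrative only: the toy networks and names below are placeholders I made up for this post, nothing like the StyleGAN2 code I eventually used.

```python
# A minimal GAN training loop in PyTorch. Toy placeholder networks only,
# not the real StyleGAN2 architecture.
import torch
import torch.nn as nn

latent_dim = 128         # length of the random noise vector the GENERATOR starts from
image_dim = 64 * 64 * 3  # a flattened 64x64 colour image (kept tiny for illustration)

generator = nn.Sequential(
    nn.Linear(latent_dim, 1024), nn.ReLU(),
    nn.Linear(1024, image_dim), nn.Tanh())
discriminator = nn.Sequential(
    nn.Linear(image_dim, 1024), nn.LeakyReLU(0.2),
    nn.Linear(1024, 1))

g_opt = torch.optim.Adam(generator.parameters(), lr=2e-4)
d_opt = torch.optim.Adam(discriminator.parameters(), lr=2e-4)
loss_fn = nn.BCEWithLogitsLoss()

def training_step(real_images):
    """One round of the adversarial game, given a batch of real collage images."""
    batch = real_images.size(0)
    real_labels = torch.ones(batch, 1)
    fake_labels = torch.zeros(batch, 1)

    # DISCRIMINATOR turn: learn to tell real collages from the GENERATOR's fakes.
    fakes = generator(torch.randn(batch, latent_dim)).detach()
    d_loss = (loss_fn(discriminator(real_images), real_labels) +
              loss_fn(discriminator(fakes), fake_labels))
    d_opt.zero_grad()
    d_loss.backward()
    d_opt.step()

    # GENERATOR turn: learn to make fakes the DISCRIMINATOR accepts as real.
    g_loss = loss_fn(discriminator(generator(torch.randn(batch, latent_dim))), real_labels)
    g_opt.zero_grad()
    g_loss.backward()
    g_opt.step()
    return d_loss.item(), g_loss.item()
```

The two optimizers pull in opposite directions, and it is exactly that tension that slowly produces fakes the discriminator can no longer tell apart from the real dataset.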
I first read about an approachable example of this technology in the online art magazine Hyperallergic.com.1 Here the author described the project of artist Matteo Rattini, who had trained a GAN to generate a series of ‘sculptures that didn’t exist’ from a dataset of photos of minimalist sculpture. I’ve always been curious about the inevitable relationship between the tools of technology and their effect on visualization, and I have consistently explored digital graphics tools alongside my traditional painting practice. I thought if this guy could figure it out, maybe I could too.
As a result of my social media posts, my collages already existed as computer files, which were perfect digital assets for GAN training. I was really curious about what would emerge if I trained a GAN using my own collage images as the dataset. Could a GAN function as a digital twin, producing an infinite supply of convincing collage-like images derived from my own work and solving artist’s block forever? Would the series of fake collages show me a map of the inherent patterns in my process, my colour choices and compositions? What would these fake collages even look like?
Below is a rough diagram that attempts to explain this adversarial relationship between the GENERATOR and the DISCRIMINATOR. I confess that I am not a coder and can’t explain the intricacies of all the programming involved here. It’s like the magic of most technology: we learn how to use it, but we can’t really build it ourselves.
Artists have always used tools and learned skills to depict the human experience. This cycle creates a feedback loop where the perception of the world changes with the incorporation of these new tools. David Hockney outlines this concept in his book Secret Knowledge2, where he theorizes that artists were using optical aids like lenses and camera lucidas to depict subjects in space as early as the 1400s. Artists were not simply tracing; the point is that this technology enabled a new way of seeing, which ratchets up new possibilities for how we think and participate in the world. I think about how the scientific revolution followed the establishment of point perspective and a mathematical conception of depicting space.
Like anyone looking to figure out how to do anything these days, I went on the internet and followed a rabbit trail of YouTube videos and GitHub posts to figure out the GAN training process. In the end, Jeff Heaton’s YouTube video3 explained how to navigate this technology, complete with plug-and-play code using NVIDIA’s StyleGAN2-ADA. (Please note that this code no longer works, apparently because of an update to the TensorFlow library.) The only other things I needed were a Google Drive account and a Pro subscription to Google Colab (you can’t realistically train a GAN on a home computer), along with the help of my son, who was taking some coding classes at university.
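Stripped to its essentials, the workflow looked roughly like the notebook cells below. Treat this as a sketch rather than a recipe: the paths are illustrative, and since the TensorFlow-based StyleGAN2-ADA code from that tutorial no longer runs as-is, the exact commands would need updating today.

```python
# Sketch of the Colab workflow (illustrative paths; the 2022-era TensorFlow
# version of StyleGAN2-ADA no longer runs unmodified).
from google.colab import drive
drive.mount('/content/drive')   # collage images and training output live on Google Drive

# Fetch NVIDIA's StyleGAN2-ADA code.
!git clone https://github.com/NVlabs/stylegan2-ada.git
%cd stylegan2-ada

# Pack the folder of square collage images into the TFRecord format the trainer expects.
!python dataset_tool.py create_from_images \
        /content/drive/MyDrive/collage-tfrecords /content/drive/MyDrive/collage-images

# Start training on the single Colab GPU, saving a .pkl snapshot periodically.
!python train.py --outdir=/content/drive/MyDrive/training-runs \
        --data=/content/drive/MyDrive/collage-tfrecords --gpus=1 --snap=10
```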
The series of training runs began over a few weeks during the spring of 2022. Because of the limits on computing time in Google Colab, the code would run for about 24 hours, periodically saving .PKL files known as ‘pickles’ and generating a series of iterative image sets known as ‘seeds’. When the computing time expired, training was resumed by starting up again from the last .PKL file. Below is a movie showing the initial series of seed sets as they were produced.
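Resuming was essentially a single command pointed back at the most recent snapshot, something like this (the --resume flag is documented in the StyleGAN2-ADA repository; the file names and paths here are made up for illustration):

```python
# Pick up training where the previous Colab session left off (illustrative paths).
!python train.py --outdir=/content/drive/MyDrive/training-runs \
        --data=/content/drive/MyDrive/collage-tfrecords --gpus=1 --snap=10 \
        --resume=/content/drive/MyDrive/training-runs/00000-collages/network-snapshot-000080.pkl
```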
Through initial nebulas of chaotic colour, blobby shapes start to emerge, much like watching a time-lapse of cell division. Gradually these amorphous pixel blobs assume more definite shape and begin to resemble collage-like images. An interesting fact is that the GENERATOR can’t produce coherent results without the help of a noise image to stabilize the training procedure. (A noise image is just a randomly pixelated black-and-white image, like the static signal on an old TV set.) It makes me think of creativity theories where you need to work with a bit of resistance or randomness to come up with new, different, stronger ideas: something to work against. The noise image provides a constraint that helps manifest the images.
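For a concrete sense of that raw material, here is a tiny snippet that makes such a noise image. As I understand it, StyleGAN2 injects random noise like this throughout the GENERATOR, on top of the random vector it starts from.

```python
# Generate a 'noise image': random grayscale pixels, like TV static.
import numpy as np
from PIL import Image

noise = np.random.rand(512, 512)  # random values between 0 and 1
Image.fromarray((noise * 255).astype(np.uint8)).save('noise.png')
```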
From a series of gradually improving iterations, something resembling a collage appeared by seed 160. Below is this first GAN image, which was published to Instagram on June 7th, 2022.
Over several days I quickly gathered thousands of these fake collage images. Typically each .PKL file would produce a seed set consisting of 120 images at 512x512 pixels, with similar compositional structure and small variations of colour and shape between images. Each image looked proportionally familiar to me and used shapes and harmonious colour combinations that I found interesting. They were all curiously mesmerizing. Individually each one was technically unique, but collectively they had a relentless sameness and an uncanny artificial slickness, as if they were made out of plastic.
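Those sets came from a generation script run against each saved pickle. As I understand it, each ‘seed’ is just an integer that determines the random noise the GENERATOR starts from, so the same seed and the same .PKL always reproduce the same image. Roughly, again with illustrative paths:

```python
# Render 120 fake collages (seeds 0-119) from one training snapshot (illustrative paths).
!python generate.py --outdir=/content/drive/MyDrive/fake-collages --seeds=0-119 \
        --network=/content/drive/MyDrive/training-runs/00000-collages/network-snapshot-000080.pkl
```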
It reminded me of Iain McGilchrist’s book The Master and His Emissary4. McGilchrist is a psychiatrist, neuroscientist, philosopher and poet who describes the different aspects and roles of the left and right hemispheres of the human brain. According to McGilchrist, the left hemisphere has an attentional mode that is quite rational, reductive and controlling, whereas the right hemisphere intuits a bigger picture of relationships flowing in time and is aware of ephemeral qualities like emotions and beauty. His thesis is that our current culture is increasingly dominated by the over-confident, sociopathic thinking typical of the left-brain “emissary”, and that it is incumbent on our society to restore the balance and the authority of the more subtle and nuanced right-brain “master”. This can be witnessed in the rise of industrialization and the monotonous impact of mass manufacturing on everyday life, where people are seen as units of production instead of as beings embedded in a network of relationships. McGilchrist argues that the emergence of AI is an example of this left-hemisphere overreach and that we should be vigilantly aware of how we are being formed by what we pay attention to. He posits that AI is not intelligent in the true sense but is instead artificial information processing, and that our efforts to expand its scope could reach a tipping point leading to domination by a powerful silicon embodiment of the left hemisphere with all its soul-destroying biases.
There definitely does seem to be a will to digitize all facets of life that is creeping into every corner, and there has been a lot of alarm about AI images and the future for artists. First, I think we need to remember that these systems function as tools, and that we hold the responsibility for setting the parameters of their use. Ultimately our participation in this big and complex world can never be adequately explained or contained by binary systems of logical information processing. Secondly, these machines have a fundamentally parasitic aspect and will always need ‘real’ (human) data to generate comparative ‘fakes’. In fact, when AI uses its own output to train itself, the results just get more and more unintelligible. Thirdly, the focus on the result, the finished piece, misses a huge aspect of why we create art in the first place: the tactile process of the making itself, with all the awkwardness, the mistakes and the detours. And lastly, it ignores the spectator: who these images are for. Humans seem to have a profound need to observe other humans, either through direct mimicry or by indirectly gazing at work made by humans. While the proliferation of images produced by AI can be superficially interesting and enchanting, it can’t really compete with the depth of human-made art, which may be all the more valued as we continually strive to create meaning in the world.
While we should resist blindly submitting our artistic agency to these algorithms, we also can’t ignore the potential of these new tools. Like any tool, this new technology could relieve artists of some of the more mundane aspects of production, but it will not create anything new per se. Personally I’m still fascinated by the questions posed by how these AI images are generated, and I find it interesting that by playing with this technology we are also revealing more about the mystery and complexity of navigating the world and the miracle of consciousness itself. I think there is a lot that we can discover and understand about ourselves as humans by exploring these processes, as long as we remain aware of the limitations of this technology.
At the end of the day, I don’t believe that training a GAN on a dataset of my own images revealed anything truly unique about my collage process, though it was like having a tireless digital twin that produced elevated placeholder images for me. (A placeholder image is a graphic design term for a temporary image used as a substitute until the actual image is placed.) While the series of fake collage image sets did derivatively resemble my hand-made work, something about the relentless prolificacy of the images also clearly illustrated the smothering and domineering aspect of the hyper-logical left-hemisphere mode.
And this is where it gets a bit weird. The suffocating aspect of the quick, frictionless production of thousands of my own collage-like images paradoxically rekindled a desire to revisit slower, more traditional artistic practices and to engage with messy art materials. I started to regularly attend open life drawing sessions and to maintain consistent sketchbook habits. I came to further appreciate the awkwardness of my own drawing, and issues of optical exactness became less important than the pleasure of the process and the poetry of the arrangement of colours, shapes and lines on a page.
We do live in a confusing and unprecedented time for art and artists. There have never been so many people making art with the ability to share it with a global audience via the internet. And there is now a constant pressure to produce a steady supply of work for the consumption of social media. Although I believe that this technology will never replace art or artists, it can function as a possible defence against the insatiable demands of these social media algorithms. The quick production of GAN-generated images (trained from the unique dataset of an individual artist: a digital twin) could be utilized here as a sword and shield (attack and protection), reserving a space for the artist in the new disembodied digital landscape, just as a placeholder image keeps space on a printed page for future use. As culture continues to roil along, it has always been the artists who feel compelled to respond, and in utilizing these new tools we must remember that technology is only a means to an end and should never become the end itself.
My daily hand-made art is posted on Instagram @reb.ott and on X @reb_ott. The GAN collage images are now bot-posted to a separate account on X @reb_ott_robot.