Here I am at age 61, by the measure of someone under 40, an old person, and for me, that means I’m part of the group of people who generally suck. The people who are averse to change, stuck in routines, creatures of habit, and detached from the zeitgeist. This is mostly true of those I encounter who have pushed into the mid to upper 50s and beyond, but obviously, not all.
Today, I became the 810,701st person to join the Pika Discord server, and I’m the 75,180th member on Runway’s. What this means is that I’m late to the game. What game is that? It’s not the Olympics, baseball, football, or any other sport the masses obsess about. Pika and Runway are early leaders in artistic text-to-video and image-to-video creation driven by generative artificial intelligence. These tools require me to refamiliarize myself with the processing side of things, looking at the next generation of GPUs from Nvidia, NPUs from Intel and Qualcomm, and LPUs from Groq, all promising to accelerate our race into AI. Then there’s the terminology, such as LoRA (Low-Rank Adaptation), Diffusion Model, VAE (Variational Autoencoder), Checkpoints, GAN (Generative Adversarial Network), ControlNet, NeRF (Neural Radiance Field), and GPU Clouds, with new terms emerging all the time.
Inherent complexity is certainly at the center of this emergent field of creativity, with pundits on the sidelines questioning whether machine-generated art, writing, music, or video even qualifies as such. That part of the conversation is a non-starter, as it ignores the fact that every iteration of our tools holds the potential to disrupt the comfort of those who’d prefer to maintain the status quo. I believe the alarmist side of the story is most appealing to the elderly, who fear change and are reluctant to experiment. They fear embarrassment if they prove less than adept at picking up new skills.
Along the way, I contend with those who insist that AI is a zero-sum game for humanity and will either fail or enslave and destroy us, instead of simply being another tool that enhances how and what we people do. I suppose if one looks on from the observer’s point of view and listens to the talking heads trying to whip one’s senses into hysteria (since that’s what pulls the masses in), one might easily believe there is nothing of any lasting value in the scary futuristic world of human irrelevance. But if those same people could peer into an intricate ComfyUI node network harnessing the community-driven tools of image manipulation, or try asking Anthropic’s Claude Sonnet, Meta’s Llama 3.1-405B, or Mistral Large 2 about the intersection of ideas between Thomas Pynchon, C.S. Lewis, and Oswald Spengler to see what thoughts these AIs might inspire them to consider, they might see that humanity is opening a window to a deeper knowledge that could move culture forward in profound ways.
Almost daily, there are advancements in tool evolution regarding video, music, writing, research, vision systems, medical diagnosis, and other areas of augmenting intelligence and pattern recognition that benefit from deeper thinking, just as the mass of humans would be doing if they, too, were exploring complicated systems instead of banal entertainments that absolve them of stepping into the minefield of potential failure of comprehension.
But why am I picking on the elderly? Because I want co-conspirators in this exercise to fight against intellectual lethargy by turning over these brain cells in an attempt to maintain a semblance of plasticity in this aging gray matter in my head. Then, maybe instead of hearing their stupidity put on display in public as they speak of the dumbest shit imaginable, I’d be able to dip into their conversations and have them drop knowledge into my hungry mind. I have to thank social media and all of its ills for creating connections with those who are at the frontier of discovering and playing with things that are the furthest away from simple and easy. I’m not saying I always want to be mired in the trenches of difficulty, but the Marvel Universe, various television series, celebrity relationships, and political shenanigans are nothing more than distractions absolving the populace from advancing themselves.
At this moment, I’m in the discovery process of learning ComfyUI. The basics are starting to make sense, but only slightly, and now I can decipher what the image notes at Civitai mean when they reference which Checkpoint version the artist is using, the prompts, LoRA trigger words, the sampler used, how the seed lends variation, and just what kind of time and broad thinking is being invested by these artists. Demons, fire, nymphs, buxom anime girls, cyborgs, and tons of fantasy stuff are abundant and grab the attention of many, but some incredibly intricate and seductively beautiful works of art start to shake off the obvious AI influences. When I watched my first tutorial about ComfyUI, I thought it was complex. Now I recognize that the basics were just that, and there is a universe beyond those starting points that boggles my mind. On the one hand, I’m overwhelmed, while on the other, I know that as the pieces come together, these times when infinity entices me to go further will leave me wondering why I ever thought any of this was as difficult as I wanted to imagine.
I’m trying to say that the excitement is palpable, but the coherence of the objective must be kept in focus. With so many moving pieces in the intellectual process that’s driving activity, it’s often difficult to balance my interests. The initial thrill driving these explorations will fade, though I hope that should I acquire any new skills, I can utilize them to complement my current output. Regarding the prompt I utilized to have ComfyUI help me create this image, I’m at a loss as to how the mind of AI used its skills. I should also point out that all these images, except the last one, were created during the first week that I began learning to work with Stable Diffusion via the ComfyUI software; the last image, of the blue mountains, was made with Krita tied to ComfyUI.
And what about my original premise that old people suck? It’s not because they are old. I’m also old. It’s that, by and large, they are incredibly boring, stuck in routines I cannot understand. Someone recently asked me, “Do you have any friends?” Without skipping a beat, I told this person, “No, not really in the way I’d call someone a friend.” I explained the difficulty of being an outlier who can’t share small talk about television, movies, sports, cars, guns, the gym, my children, or investments. Many of the older people I talk with are retired or are working because they don’t know what to do with their time. I’d guess that they are bored with television, movies, sports, cars, guns, and the gym, but it’s the only life they’ve known aside from being parents or being a reflection of their careers. The problem is that vanity and pride stop them from attempting to learn things where they’d risk showing themselves as amateurs. Instead, they’d rather remain in their lanes of superficial knowledge, where they’ve gathered friends stuck in the same rut.
This post kept growing as the days went by because I hadn’t prepped the images for it due to various distractions, mostly AI stuff. In the time since I began writing, I’ve begun to understand how to paint in Krita with the help of ComfyUI and how to work with ComfyUI in Photoshop instead of being restricted to Adobe’s Firefly implementation. This is significant because a couple of weeks ago, I felt I wouldn’t be able to run these tools on my current laptop with its Nvidia RTX 3050 Ti and 4GB of VRAM. While it’s slow, it works, and it has opened my explorations of the technology until I can acquire an RTX 5090 at the end of this year or early next year, which will truly allow the capability of these complex interconnections to take flight.
I’ve lived through many milestone moments in the evolution of the personal computer industry, starting when the very first computers for consumers were sold. Then, in the 1980s, the first ideas of how these devices would harness multimedia gathered steam, such as when DPaint and Imagine 3D were released for the Amiga along with Desktop Publishing software called Quark on the Apple. Then, in the early 1990s, ProTools, Windows 3.0, Photoshop, and 3D Studio were catapults, but before we could leave the 20th century, Windows 95 and the internet browser would change the world. Things stagnated for a bit, but with Windows 7, Adobe’s Creative Cloud, and smartphones, we were again being launched into a new world of the digital arts, with social media making its mark. A blip of virtual reality that went nowhere, along with blockchain technologies that are still widely misunderstood, came onto the scene. Today, AI is controversially evolving. Once again, we are on the cusp of a monumental shift as entire subcultures still outside the mainstream adopt new technologies and language that will drift into common usage in the coming years. Still, for now, it is the bane of those who’ve heard the fear-mongering on the edges of this incredible technology.
Having lived through these multi-generational changes since the 1970s, I’ve listened to the frightened yammerings of those afraid of great change, but here I am, fortunate enough to be alive to witness yet another sea change regarding the tools humanity has brought to bear. Not only do I get to watch the shift, but I’m also able to dabble with it all, maybe because I’m not too old and my level of suck hasn’t yet reached its zenith.
Addendum: Between the 21st and the 23rd of August, I learned more about the creation of LoRAs, but I’m leaving for vacation and won’t be able to focus on learning the process yet. When I return, I’ll have to find time to train a few: one on old family photos, another using the images we created when we were living in Germany making record and CD covers, one focusing on our travel photos, and maybe one from the concert videos we shot back in the 1980s for that 4:3 grainy look of old TV.