AI & The Music Video

Mark Dixon
Wednesday, June 28, 2023

‘Til Death do us Part: AI and the Music Video

It’s 2023 and everyone is worrying about artificial intelligence (AI) taking over all aspects of human existence and destroying civilization as we know it. As of this writing, the robot apocalypse has yet to materialize, but there are indications that dramatic changes are right around the corner. The music video is no exception, with producers using animation created by a variety of software programs such as Deforum, Midjourney, Stable Diffusion, and Kaiber. The results have been mixed. One thing is certain: everything is changing very quickly.

In 2007, David Byrne wrote an article, originally for Wired magazine, titled “David Byrne’s Survival Strategies for Emerging Artists and Megastars,” in which he analyzed changes in the music industry that left new artists to fend for themselves. No longer did record companies fund recording sessions or manufacture, distribute, and market the product. One of the requirements for new musical artists to promote their music was to appeal to MTV. (For those too young to remember, MTV was a channel that played nothing but music videos.)

Although this article is ancient by Internet standards, the advice is still sound. Ask any independent musician, and they will tell you the advice is old news. For most artists fending for themselves in the modern music industry, this advice is a way of life. As Byrne said, the support services were no longer needed. Recording costs dropped to nearly zero with the proliferation of computerized recording and editing software. The Internet became the only manufacturing and distribution tool needed to get an artist in front of the public. Online streaming services proliferated, allowing the public to hear new music. Now artists need to ensure that their music is on every streaming service, and there are more than just a couple of services. Marketing their music is ongoing and just another part of being a musician in the 21st century.

Regarding the music video, Byrne quotes Merge Records cofounder Mac McCaughan, saying, “The bands we work with, we never recommend that they make videos. I like videos, but they don’t sell a lot of records.” Does this really reflect the world of modern music marketing? So many factors need to fall into place to sell a lot of records. (Also, hardly anyone sells “records” anymore.)

Advance to the present day: as with recording the music, making a video became easier and cheaper. With YouTube, a musician could get their own channel and upload a video to accompany their latest song. Cost was minimal or nonexistent, especially with a smartphone. But for the most part, it isn’t the bands making music videos these days. It is the fans making the unofficial music video.

When Byrne wrote his article, the fan-made, unofficial music video was already pervasive on YouTube. (Note: Don’t try doing this with old, top-selling groups from the 60s or 70s. The record labels guard their ownership of this history fiercely.) Mostly, these videos slipped under the DMCA copyright provisions by being student projects made for “educational purposes.” As copyright structure became more forgiving, the unofficial video was recognized as a way for new musicians to market themselves, even if the musicians themselves don’t get involved. The fan-made video was born.

Making an unofficial music video requires some degree of knowledge, such as how to use a camera and editing software. The smartphone allowed everyone to have this capability in their pockets. This led to untold thousands of unofficial and fan-made music videos being uploaded to YouTube.

TikiKiti was created from this realization.

Now it's all about to change — again. Brought to you by artificial intelligence, or AI. For little or no money, a musician can make their own music video, upload it to YouTube, share it on all the social media sites, and have their music become more recognizable. At least, that’s the promise. The reality is a bit hazy.

Earlier this year, when AI became the hottest news item, the art community was not sure what to think. Ownership and copyright issues were all that news outlets focused on. However, it seems that musicians and video producers ignored all the bad press and moved forward to experiment with different applications.

Once was enough (for now)

In November 2022, Anssi Puisi (aka Kinomo) from Finland produced an AI video to “We Don't Know What Tomorrow Brings,” by The Smile. He entered this video in the TikiKiti International Film Festival last year. I decided to contact him and discuss this production. Early on in our interview, he admitted that he wouldn’t be much help. “My enthusiasm for AI animation pretty much stopped after making that video, at least for the time being,” he said. “Maybe one day when the technology improves, I might get excited again, but for now, fighting with AI video feels mainly frustrating.”

This is an opinion we found prevalent among the producers we have spoken to. AI is another tool in the toolbox, and they recommend that everyone try an AI application and see for themselves. “But the experience of art requires that the artist has the vision and inner fire to make that kind of work,” Anssi says. “It requires experiencing human connection. Buying AI-made art would be a bit like having a romantic relationship with a robot. You'd probably be left feeling a bit hollow.”

TikiKiti sees many AI music videos, and we can attest that they all start to look alike.

Making videos with AI is very time-consuming, Anssi says. “Also, the limitations of AI come into play very quickly if you try to do more than just the most obvious images that AI can do well — and that the world is already full of.”

But he says that it’s important to learn about it, if only in self-defense. “If you're in danger of losing your job to AI, the best solution is to start using AI yourself, alongside other tools.”

He acknowledges that AI already has at least one unique feature: it can create never-before-seen images and new unprecedented effects. “However, the novelty wears off, and in the future AI glitches will be curiosities like the video echo effect in 80s music videos.” (Note: TikiKiti still sees many “infinite zoom” videos that we hardly bother with.) “I don’t mean to say AI is not a revolutionary and game changing step,” he says. “It is. Any objection is completely pointless. It is inevitable that AI will take its place in all fields of production.”

Creativity magnified

Producer Matt Balshaw (aka Prompt Engineer) revealed similar sentiments in our interview. He says that the reaction AI is getting is like the reaction when Photoshop was first released. “The tools we have now for AI art are roughly similar to photography in the 90s. Anyone can buy an instant camera and take photos that are not bad.

“We are seeing extremely rapid development of the AI tools though, and I think by the end of 2023 it will be easier than ever to get beautiful images from prompts, like how it’s easy to get a great photo with an iPhone.”

Matt is a fan of AI.  “Personally, I really love what these AI tools are allowing us to do. I've never done much drawing or art, only a little bit of photo and video editing. With these AI tools, I've had the opportunity to be extremely creative with my music videos and create content that never would have existed without them.

“I would imagine that this phenomenon is not unique to me, and that we will see an explosion of artistic expression in the coming months from people who might otherwise have never attempted it.”

Matt notes that there are myriad legal issues that will need to be addressed while allowing the technology to advance. Copyright laws are at the forefront of most AI challenges. As Matt says, “I think that there are legal issues with training AI models, especially on stock images that are not meant to be publicly available. Watermarks often appear in Stable Diffusion generations, and this is a clear sign that watermarked images were used in the training data. Overall, I think that if you have made an image public to be viewed by anyone on the internet, it is also public to be used as training data in a large image model. The courts will certainly be deciding if AI training is fair use, but I don't know when that will happen.”

AI has simplified creating animations. However, if you want to create something that stands out from the crowd, there will be a steep learning curve, Matt says.  “It took decades to develop photography to the point where anyone could take a photo, but it only took months to go from only experts creating AI images to anyone being able to create AI images. What has not changed, though, is that there are still professional photographers who are able to take much better photos. The same will always be true of AI-generated images, because even though it may only take seconds to create an image, it takes creativity and vision to know what image is ‘good’ and then skill to get that particular image.”

Putting in the time

Here at TikiKiti we have been learning as much about the AI music video as we can, as fast as we can. Everyone keeps asking, “How do we rate these videos? How were they produced?” The definitions we had used since the beginning were no longer obvious. Production quality all looked good. How creative was a video really, especially if it was AI generated? And what about editing: does it even play a role with AI videos?

It was our discussion with Patrick Hanser, of the Brazilian grunge-rock group Bicará, that really helped put all this new technology into perspective. Patrick has worked for a video production company — Spray Media — for about seven years.  At the time of our interview, he was working in Buenos Aires on a project for Netflix Brazil.

Patrick designed and produced the AI music video for his latest song, “Cores.” He confirmed one thing every producer we spoke to said — that creating a quality music video using AI tools will likely take a long time.

He had never worked with animation before, but AI made it possible. He tried inputting some existing video and prompting the application. It ended up looking much like everything else he had seen produced with AI technology. “I feel my method was more crafted because I made and edited a full music video before the actual animation,” he says.

He says the editing must come before the AI is prompted. All the production elements in his video, such as camera movement, were planned out before ever going to the AI application. AI is the final processing piece.

This is not to say an AI music video cannot be made without the control Patrick had over his project. YouTube is full of AI-prompted music videos that took only a day to produce. It was Patrick’s vision and preproduction planning that created a quality video. The actual video only had 14 shots. As with any production, he knew just where to put each shot and had to make sure it worked with the AI.

After realizing how easily AI created photo-realistic images, he knew he had to try something different. People who are new to production, and AI in particular, tend to use realistic prompts, he says. We have all seen the images of political figures and other photos done by AI. We have all seen where AI gets these images wrong (such as hands with too many fingers). This, along with making these realistic images do crazy things, is just the fascination with new technology that will wear off in due course.

Patrick finds this photo-realistic esthetic boring. Fond of works by Picasso and Kandinsky, he was able to create something more timeless. He knew he wanted to have a look and feel similar to their styles. This he accomplished with his very specific prompting. “My biggest goal was to make something that did not look like AI,” he says.  To get the result he was looking for took a lot of trial and error. Starting with over 10,000 frames, he knew he had to pare it down to around 3,000. His method required watching and rewatching each of his edits, getting closer each time to his complete vision — getting better at prompting and selecting frames, always going back and re-editing, making everything uniform and cohesive.

Experienced in video production, Patrick has been part of many productions.

The video took him about six months, while he worked at his job, travelled, played in his band, and lived his life. He says, “If I were working 8 hours a day, in one go, it would have probably taken a whole month.”

“As artists, first we need to know what’s being done in whatever medium — and all the references I found were for creating the photo-realistic esthetic. So, my task was to figure out how I could do something different.”

Just before we signed off from our interview, Patrick said something that defines any art: “If it’s good — it’s good,” he says. “Whether you spend ten minutes or ten years, if it touches me somehow, it doesn’t matter how it was done.”

But he does have concerns about the technology. “What worries me as a creator, regarding Stable Diffusion and Next Gen, is that it’s really easy to make. You literally press a button and upload a video.” It is this ease that makes so many videos look the same.

Prompting is art, he says. It’s hard to get the results you want. “I did this frame by frame. Just the fact I chose each frame makes it more human. If you really put time into it — and have a vision — you can craft something that is amazing.”

The complete interview is also on YouTube here.

Why the music video?

What Mac McCaughan of Merge Records wasn’t aware of back in 2007 is what the music video would become — a mature art form unto itself. Unlike the old MTV music videos, which were primarily a marketing tool, they are now an expression of the artist or the video producer — or both. The music video format has become the assignment of choice for many high school and college students in media production courses all over the world. They are short stories in which the student has a chance to express themselves in a way not usually given to them. Music has become so personal to the listener, and the music video has followed suit.  These music-video assignments have introduced many would-be professionals to the world of visual storytelling.

The unofficial video was, at first, a video that a fan would produce for a band they loved to listen to. Now, with the advent of AI, musicians can produce their own videos. Lately, the unofficial video is produced by the musicians without approval of a manager or record label. These videos are being done because the musicians want to take as much control over their musical career as possible.

TikiKiti doesn’t see the “unofficial video” label as much as we used to. Bands such as Bicará will produce their videos on their own. They will be just as “official” as any produced by a record label. AI helps eliminate the expense of hiring a team to produce them.

What is there to fear?

Fear: the motivation many need to enable personal growth.

When I first began thinking of writing this article, I recognized how bleak many of the AI video images were (landscapes from alien worlds; people morphing into machines). When AI started to flood our video rating queue, there were few memorable videos. Then we saw one of the first AI videos that was also a narrative. It was the video by Rob Level called “Dear Mom,” which tells the story of the abuse he suffered from his mother as a child. “I took days choosing the perfect keywords to generate literally over 1,000 videos and images for the 'Dear Mom' project,” he wrote in his comments in the video description. “Because you watching this means so much to me, I made sure the imagery would give you the most emotion evoking experience so you can feel the music and lyrics to connect with the pain of the song.”

I recently read an article in The Guardian by Matthew Cantor where he highlighted several graduation speeches by famous personalities. The one that stuck with me was from Patton Oswalt, where he says, “You are about to enter a hellscape where you will have to fight for every scrap of your humanity and dignity. You do not have a choice to be anything but extraordinary. Those are the times you’re living in right now.”

I realized that most of the producers whose videos I am seeing already know this.

I’ve discussed the personal vision of three music video producers from different parts of the world. They each produced something unique, showing how using AI for the music video requires a great deal of skill. In the process, each portrayed their vision of the world.

Matt Balshaw shows us a bleak, post-apocalyptic world with his AI-produced video to Green Day’s “Boulevard of Broken Dreams.”

Anssi Puisi sees the world as it is, in contrast to how he wants it to be. His video for The Smile and their song “We Don't Know What Tomorrow Brings” shows images taken from the modern world. AI brings them together for us to see and ask ourselves, “If this is the world we have now, what kind of world will we have in the future?” Overcrowded. Polluted. The list goes on.

Patrick Hanser doesn’t bother with the future, or even the present. With his video to his song “Cores” (“Colors”), his artistic statement shines through. Patrick does not see the world through rose-colored glasses. He just chooses to show how it can be, with his art. Through hard work and dedication, and, above all, a vision of how to use this tool, AI can help bring about a renaissance of art and technology. The future is in the hands of those with the dedication and skill to use this new tool. Each of the artists highlighted in this article had a very clear artistic vision of what they wanted to create with AI.

It’s easy to imagine that this generation will be the one to lead us away from the edge of the cliff.

Disclaimer: The article was written and edited entirely by humans.

We would like to thank the following for their help with this article: our editor at TikiKiti, Rosemary Camozzi (she makes me sound like a grown-up); Matthieu Delattre; Director’s Notes (for introducing us to Patrick Hanser); and AI music video artists extraordinaire Anssi Puisi from Kinomo, Matt Balshaw, and Rob Level.
