Three years ago, electro-punk band YACHT entered the recording studio with machine learning models trained on 14 years of their own music, painstakingly stitching the results into an album called Chain Tripping.
"I'm not interested in being reactionary," YACHT member and tech writer Claire L. Evans said in a documentary about the album. "I don't want to go back to my roots and play acoustic guitar because I'm scared of the coming robot apocalypse, but I also don't want to jump in headfirst and welcome our new robot overlords."
But those robot overlords have been making great strides in AI-generated music. Although the Grammy-nominated Chain Tripping came out only in 2019, the technology underpinning it is already dated. Now Stability AI, the startup behind the open source image generator Stable Diffusion, is funding the next step: AI that makes music.
Striking a chord
Harmonai is an organization with financial backing from Stability AI, the London-based startup behind Stable Diffusion. In late September, Harmonai released Dance Diffusion, an algorithm and set of tools that can generate clips of music by training on hundreds of hours of existing songs.
"When I first joined Stability AI, I started working on audio diffusion," Zach Evans, who heads development of Dance Diffusion, told TechCrunch in an email. "The company hired me to work on [the image-generation algorithm] Stable Diffusion, and I quickly decided to move into audio research, which led to the creation of Harmonai."
Dance Diffusion remains in the testing stage; at present, the system can only generate clips a few seconds long. But the early results raise questions about what the future of music-making will look like, and what effect it will have on artists.
Dance Diffusion follows in the footsteps of Jukebox, an ambitious music-generation experiment designed by OpenAI, the San Francisco lab behind DALL-E 2. Given a genre, an artist and a snippet of lyrics, Jukebox can produce music that is reasonably coherent in sound. But Jukebox's songs lack larger musical structures like choruses that repeat, and their lyrics are often nonsensical.
Google's AudioLM, detailed for the first time earlier this week, shows more promise, with an impressive ability to generate piano music from a short snippet of playing. But it hasn't been open sourced.
Dance Diffusion aims to push past the limits of earlier open source tools by borrowing technology from image generators such as Stable Diffusion. The system is what's known as a diffusion model, which creates new data (such as songs) by learning how to destroy and then recover many existing samples of data. Once trained on a corpus of existing examples, say, the entire Smashing Pumpkins discography, the model applies what it has learned about recovering destroyed data to compose entirely new works.
Kyle Worrall, a Ph.D. student at the University of York in the U.K. who studies the musical applications of machine learning, explained the finer points of diffusion systems in an interview with TechCrunch:
"In training a diffusion model, training data, such as the MAESTRO dataset of piano performances, is 'destroyed' and 'reconstructed,' and the model gradually gets better at reconstructing the data to resemble the training set (in MAESTRO's case, piano performances)," he said in an email. "Users can then use the trained model to perform one of three tasks: generate novel audio, regenerate an existing audio file chosen by the user, or interpolate between two input tracks."
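The destroy-and-reconstruct loop Worrall describes can be sketched in a few lines. This toy example (the function names, the noise schedule values and the sine-wave "song" are all illustrative assumptions, not Harmonai's actual code) shows the key property diffusion models exploit: if a model can predict the noise that was added, the original signal can be recovered exactly.

```python
import numpy as np

# A minimal sketch of the "destroy and recover" idea behind diffusion
# models, using a toy 1-D signal in place of real audio waveforms.

rng = np.random.default_rng(0)

def make_noise_schedule(num_steps=100, beta_start=1e-4, beta_end=0.02):
    """Linear variance schedule; alpha_bar[t] is the surviving signal fraction."""
    betas = np.linspace(beta_start, beta_end, num_steps)
    return np.cumprod(1.0 - betas)

def destroy(x0, t, alpha_bar, noise):
    """Forward process: blend the clean sample with Gaussian noise at step t."""
    return np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * noise

def recover(xt, t, alpha_bar, predicted_noise):
    """Reverse step: a correct noise prediction lets us invert the blend."""
    return (xt - np.sqrt(1.0 - alpha_bar[t]) * predicted_noise) / np.sqrt(alpha_bar[t])

# Toy "song": a 440 Hz sine wave sampled at 8 kHz.
x0 = np.sin(2 * np.pi * 440 * np.arange(800) / 8000)
alpha_bar = make_noise_schedule()
noise = rng.standard_normal(x0.shape)

xt = destroy(x0, t=99, alpha_bar=alpha_bar, noise=noise)      # heavily corrupted
x0_hat = recover(xt, t=99, alpha_bar=alpha_bar, predicted_noise=noise)

print(np.allclose(x0, x0_hat))  # True: a perfect noise prediction recovers the signal
```

In a real system the `predicted_noise` comes from a neural network trained on millions of corrupted samples, which is why the reconstructions only resemble, rather than reproduce, the training data.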
It's not a very intuitive idea. But as DALL-E 2, Stable Diffusion and other similar systems have shown, the results can be surprisingly realistic.
For example, listen to this Dance Diffusion remix of a Daft Punk song:

Or this style transfer of the Pirates of the Caribbean theme into flute music:

Or this surprisingly good (yes, really) Smash Mouth-style take on the Tetris theme:

Or these clips of generated dance music:
YACHT's Jona Bechtolt is impressed by what Dance Diffusion can do.
"Our first reaction was, 'OK, that's a step forward in terms of raw sound,'" Bechtolt told TechCrunch.
Unlike popular image-generation systems, Dance Diffusion is limited in what it can do, at least for now. While it can be fine-tuned on a specific artist, genre or even instrument, the system isn't general-purpose the way Jukebox is. The Dance Diffusion models available today (trained by Harmonai and early adopters on its official Discord server on pieces from Billy Joel, The Beatles, Daft Punk and Song A Day creator Jonathan Mann) each stay in their lane; the Jonathan Mann model, for example, will only ever produce songs in the style of Mann's music.
And the music Dance Diffusion creates won't fool anyone just yet. While the system can perform style transfer, applying one artist's sound to another artist's song to create a cover, it can't generate clips longer than a few seconds, or vocals with intelligible lyrics (listen to the clip below). That's a result of technical hurdles Harmonai has yet to overcome, says Nicolas Martel, a self-taught game developer and member of the Harmonai Discord.
"Because the model is only trained on 1.5-second samples, it can't learn or infer any long-term structure," Martel told TechCrunch. "The authors seem to be saying this isn't a problem, but in my experience, and at least logically, that doesn't seem entirely true."
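Martel's point can be made concrete with a hypothetical chunking routine. Everything here is an assumption for illustration (including the 48 kHz sample rate and the function name); the takeaway is simply that no single training example spans more than 1.5 seconds, so choruses, verses and other long-range structure never fit inside the model's view.

```python
# Illustrative sketch: slicing a song into fixed 1.5-second training windows.
SAMPLE_RATE = 48000      # assumed sample rate, for illustration only
WINDOW_SECONDS = 1.5

def chunk(samples, sample_rate=SAMPLE_RATE, window_seconds=WINDOW_SECONDS):
    """Split a waveform (a sequence of samples) into fixed-length windows."""
    window = int(sample_rate * window_seconds)
    return [samples[i:i + window] for i in range(0, len(samples) - window + 1, window)]

# A three-minute "song" yields 120 windows of 72,000 samples each; no window
# ever sees more than 1.5 s of music, so structure beyond that horizon is
# invisible during training.
song = [0.0] * (SAMPLE_RATE * 180)
windows = chunk(song)
print(len(windows), len(windows[0]))  # 120 72000
```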
YACHT's Evans and Bechtolt are concerned about the ethical implications of AI (they're working musicians, after all) but note that such "style transfers" are already part of the natural creative process.
"It's something that artists already do in the studio, but in a very casual, informal way," Evans said. "You sit down to write a song and say, 'I want an autumnal bassline and a B-52s melody, and I want it to sound like London in 1977.'"
But Evans isn't interested in writing a gritty post-punk version of "Love Shack." Rather, she believes interesting music comes out of experimentation in the studio; even a song inspired by the B-52s may bear no trace of that influence in the final product.
"If you aim straight for that, you fail," Evans told TechCrunch. "Part of what drew us to machine learning tools and AI is the ways these models fail, because they aren't perfect. They only predict what we ask them to."
Evans describes artists as "the best beta testers," using tools in unintended ways to create something new.
"A lot of times the results can be really weird and confusing and frustrating, or they can feel really weird and new, and that failure is fun," Evans said.
Assuming Dance Diffusion one day reaches the point where it can generate entire songs, serious ethical and legal issues seem inevitable. Some have already cropped up around simpler AI systems. In 2020, Jay-Z's record label filed copyright strikes against a YouTube channel that used AI to create Jay-Z covers of songs like Billy Joel's "We Didn't Start the Fire." After initially removing the videos, YouTube reinstated them, finding the takedown requests "incomplete." But deepfaked music still rests on murky legal ground.
Perhaps to avoid legal trouble, OpenAI released Jukebox under a non-commercial license that prevents users from selling music generated by the system.
"Little work has been done to prove the originality of the outputs of generative algorithms, so using generated music in advertisements and other projects carries a risk of unintentional copyright infringement and the damages that come with it," Worrall said. "This area needs more research."
A scholarly article by Eric Sunray, now a legal intern at the Music Publishers Association, argues that AI music generators like Dance Diffusion violate music copyright by "creating tapestries of coherent audio from the works they ingest," thereby infringing United States copyright law's reproduction right. Following the release of Jukebox, critics have likewise questioned whether training AI models on copyrighted musical material constitutes fair use. The training data behind image-, code- and text-generating AI systems is routinely scraped from the web without the knowledge or consent of its creators.
Technologists Mat Dryhurst and Holly Herndon founded Spawning AI, which builds AI tools by artists, for artists. One of their projects, Have I Been Trained?, lets users search for their work and see whether it was included in an AI training dataset without their consent.
"We are first showing people what's in the popular datasets used to train AI image systems, and giving them tools to opt out of or opt into training," Herndon told TechCrunch in an email. "We're in conversation with many large research organizations to ensure that consent data is honored by everyone."
But such standards are, and will likely remain, voluntary. Harmonai hasn't said whether it will adopt them.
"To be clear, Dance Diffusion is not a product, and it's just research at this point," said Stability AI's Zach Evans. "All of the models that are officially released as part of Dance Diffusion are trained on publicly available data, data licensed under Creative Commons, and data contributed by artists in the community. Only a consent-based approach is used here, and we look forward to working with artists to grow our datasets through further voluntary contributions. I admire the work that Holly Herndon and Mat Dryhurst are doing with their new company, Spawning."
YACHT's Evans and Bechtolt see parallels between AI and other emerging technologies.
"It's really troubling to see the same patterns repeat, especially with deepfakes," Evans told TechCrunch. "Something happens when tools and platforms are developed by people who aren't thinking about the long-term social impact of their work."
Jonathan Mann, whose music was used to train one of the first Dance Diffusion models, told TechCrunch he has mixed feelings about AI-generation systems. While Harmonai has been "conscientious" about the data it uses for training, he says, others like OpenAI have been less diligent.
"Jukebox was trained on the music of thousands of artists without their permission, which is terrible," Mann said. "It feels strange to use Jukebox knowing that so many people's music was used without their consent. We're in uncharted territory."
From a listener's perspective, Waxy's Andy Baio has speculated on his blog that new music created by an AI system would be considered a derivative work, in which case only the original elements would be protected by copyright. Of course, it's unclear what counts as "original" in such music; anyone using this music commercially is entering uncharted waters. It's a simpler matter if the generated music is used for purposes protected under fair use, like parody and commentary, but Baio expects courts would have to make case-by-case judgments.
According to Herndon, copyright law doesn't adequately govern AI-generated art. Evans further points out that the music industry has historically been far more litigious than the fine art world, which may explain why Dance Diffusion ships without a model trained on an indiscriminate, scraped dataset, unlike, say, DALL-E mini, which spits out Pikachu the moment someone types the word "Pokémon."
"I don't think it's because they thought it was the most ethical choice," Evans said. "It's because music copyright is enforced much more strictly and aggressively."
The ability to create
Gordon Tuomikoski, an art major at the University of Nebraska-Lincoln who moderates the main Stable Diffusion Discord, believes Dance Diffusion holds great artistic potential. He notes that some members of the Harmonai server have created models trained on dubstep kicks, snare drums and vocal chops, and incorporated the output into original tracks.
"As a musician, I could see myself using something like Dance Diffusion for samples and loops," Tuomikoski told TechCrunch via email.
Martel predicts that Dance Diffusion could one day replace VST plugins, the industry-standard software instruments and effects that plug into recording and audio-editing software. For example, he says, a model trained on jazz rock and '70s Canterbury-scene music produces drum "textures," such as the subtle dynamics and "ghost notes" of drummers like John Marshall, that would otherwise demand serious engineering skill to program.
Take, for example, this Dance Diffusion model trained on Senegalese drumming:

Or this model trained on strings:

Or this model of a male choir singing in the key of D across three octaves:

And here's an example of one of Mann's songs reworked by Dance Diffusion:
"Normally, you'd write the notes in a MIDI file and the result sounds very stiff. Getting a human sound takes not only time but a deep understanding of the instrument you're programming," Martel said. "With models trained on '70s orchestral rock à la Pink Floyd, Soft Machine and Genesis, I can't wait for people to churn out a billion new records in styles from Aphex Twin to vaporwave, all at the peak of human creativity, all tailored to your personal taste."
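The "stiff MIDI" problem Martel describes is traditionally tackled by hand with humanization: small random offsets to note timing and velocity, plus the occasional quiet ghost note. Here is a hedged sketch of that manual process; the `(start_ticks, velocity)` note format and all parameter values are invented for illustration, not taken from any real MIDI library.

```python
import random

# Illustrative "humanization" of a quantized drum pattern: loosen timing,
# vary loudness, and sprinkle in quiet ghost notes before some hits.

random.seed(7)

def humanize(notes, timing_jitter=10, velocity_jitter=12, ghost_prob=0.15):
    """notes: list of (start_ticks, velocity) pairs; returns a loosened copy."""
    out = []
    for start, velocity in notes:
        start += random.randint(-timing_jitter, timing_jitter)
        velocity = max(1, min(127, velocity + random.randint(-velocity_jitter, velocity_jitter)))
        out.append((max(0, start), velocity))
        if random.random() < ghost_prob:
            # a quiet grace hit just before the main note
            out.append((max(0, start - 30), random.randint(10, 25)))
    return sorted(out)

grid = [(i * 480, 100) for i in range(8)]   # eight rigidly quantized quarter notes
print(humanize(grid))
```

A generative model trained on real performances learns these micro-variations implicitly, which is Martel's point about why such models could supplant hand-tuned plugins.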
Mann has big ambitions. Today, he uses a combination of Jukebox and Dance Diffusion to experiment with generated music, and he plans to release a tool that lets others do the same. But he hopes to one day create a "digital version" of himself that can carry on his Song A Day project after his death, perhaps powered by systems like these.
"What exactly that will look like isn't clear yet... [but] thanks to the folks at Harmonai and the other people I've met on the Jukebox Discord, it feels like we've made more progress in the last few months than ever before," Mann said. "With more than 5,000 Song A Day songs, their lyrics and rich metadata covering mood, genre, time, location and even beard (whether or not I had a beard when I wrote the song), we could conceivably build a model that reliably generates new songs in my style: a song a day, forever."
If AI can successfully create new music, then what about musicians?
YACHT's Evans and Bechtolt point out that new technologies have upended the music scene before, and the results weren't as dire as predicted. In the 1980s, the British Musicians' Union tried to ban the use of synthesizers, arguing the instruments would replace musicians and put them out of work.
"Instead of rejecting synthesizers, many artists embraced the novelty and invented whole genres: techno, hip-hop, post-punk, new wave," Evans said. "The difference now is that the evolution is happening so fast that we don't have time to digest these tools, to sit with them and understand their impact."
Still, YACHT worries that AI could eventually threaten musicians' bread-and-butter work, such as writing music for commercials. But like Herndon, they believe AI can't fully reproduce the creative process.
"The idea that AI tools could replace human self-expression reflects a misalignment and a fundamental misunderstanding of the function of art," Herndon said. "I'm hopeful that automated systems will prompt important questions about how little we as a society value art and journalism online. Rather than speculating about doomsday scenarios, I prefer to see this as a new opportunity for people to reassess."