- Google has announced that it has developed an AI-powered bot that generates music based on text descriptions.
- The technology will not be released due to existing technical issues and risks.
In the artificial intelligence race, Google has announced that it has developed a bot that turns text prompts into music, but it does not expect to make the tool available anytime soon.
In research released Thursday, Google researchers described MusicLM as "a model generating high-fidelity music from text descriptions such as 'a calming violin melody backed by a distorted guitar riff.'"
“We demonstrate that MusicLM can be conditioned on both text and a melody in that it can transform whistled and hummed melodies according to the style described in a text caption,” the paper reads.
According to the study, users can enter descriptions such as "catchy jazz track with unforgettable saxophone solo and lead singer" or "90s Berlin techno with powerful bass and kicks" and receive matching results. Examples shared on Google's GitHub page pair similar prompts with the audio generated from them.
MusicLM's debut comes amid the rapid rise of OpenAI's ChatGPT chatbot, which prompted Google to declare a "code red," a move The New York Times in December described as akin to pulling the fire alarm for the tech giant.
To stay competitive, the company is accelerating the release of 20 new products, as well as a version of Google search with AI chatbot capabilities, according to the Times.
However, Google has said it has no plans to make MusicLM available to the public, citing a number of risks, including biases in the training data that could lead to a lack of cultural representation and to cultural appropriation, technical flaws, and the "potential misappropriation of creative content."
According to the study, identifiable existing songs were found in about 1% of the samples, pointing to possible copyright infringement.
“We strongly emphasize the need for more future work in tackling these risks associated to music generation,” the study reads, adding that the researchers “have no plans to release models at this point.”
The study also points out the existing limitations of the technology, including how it handles negations and temporal ordering in text prompts, as well as vocal quality. Next, the researchers said they want to work on "the modeling of high-level song structure like introduction, verse, and chorus."