Episode 4: “AI and Music”
Episode description:
In this episode I ask how artificial intelligence could continue to push the boundaries of how we enjoy music. I also review some of the generative AI music tools currently available.
Beatoven
https://www.beatoven.ai
Soundful
https://soundful.com
Soundraw
https://soundraw.io
Musico
https://www.musi-co.com
Riffusion
https://www.riffusion.com
Emergent Drums by Audialab
https://audialab.com
Chord AI
https://www.chordai.net
Staccato
https://staccato.ai
Maroofy
https://maroofy.com
Playlist AI
https://www.playlistai.app
Episode script:
Hello everyone. My name is Scot and I’ve been sharing what I’m learning about generative artificial intelligence. I’ve been taking a deep plunge into these revolutionary new tools, and trying to distill down the practical information for others. I hope you’ve found it helpful.
Music, both playing it and experiencing it, is an ancient human ritual. And though technology has affected music through the ages, at its core, there is a timeless magic. The full power isn’t necessarily unleashed with each musical encounter, but it’s there, ready to help create some of the most powerful memories and pleasurable moments of a lifetime.
Thomas Edison invented the first practical sound recording and reproduction device, the phonograph, in 1877. The audio quality was crude by today’s standards, and for over 100 years technologists would continue to improve recorded audio by addressing each stage of the process: from the recording equipment and microphones, to the physical storage media, and the speakers and playback devices. By the 1960s, with enough money invested in a high-fidelity system in the right room, you could sit with your eyes closed and experience music at a quality that could finally compete with live performance.
In the 1980s, advancements began in categories other than quality: portability, as with the Walkman, and increased selection. These changes were again revolutionary, and affected human behavior on a massive scale. So where can artificial intelligence take us from here, now that practically all known recorded music is at our fingertips?
Before touching that question, let’s take a moment to consider a different one. Why did vinyl records never die, as other physical formats perished one after another? On paper they were set for extinction along with 8-tracks, cassettes, and compact discs. They’re clunky. They’re not very portable. You can’t listen to them in your car. What’s the deal? Why is it that there is simply nothing like relaxing on a Sunday morning, while flipping through and playing records? Or any time of day really? It’s an absolutely amazing feeling you can’t touch with a smartphone. How is it that exploring an extremely limited local catalog, and even having to get up every 12-15 minutes to flip over the record, is often a more enjoyable experience than unlimited choice and portability?
And this isn’t just a nostalgia thing; there are kids that are into vinyl. Nor is it just an infatuation with the warm sound that playing records offers. Plenty of vinyl enthusiasts don’t have an elaborate hi-fi setup; they just plain dig playing records.
I was born in the 1970s and basically grew up in the ’80s. For a long time, it was common to walk into someone’s home and behold shelving full of plastic physical media. VHS tapes. Compact discs. DVDs. It was… not particularly attractive. And though I suspect vinyl has survived for several reasons, perhaps one of them is that records are relatively easy to store openly without destroying the aesthetic of a room. The same is true of books, the undisputed heavyweight champion of surviving physical media.
In 2019 I wrote a blog post on Medium and LinkedIn that I titled “Wanted: Hitchhiker’s Guide to the Music Galaxy” in which I pleaded with companies like Apple, Google, and Amazon to improve assistants like Alexa and Siri with respect to how they handled music requests. I was looking for an AI that would act like a docent in a museum and guide humans on their musical day-to-day journey, improving our quality of life.
Things have improved a bit since then. If I ask Siri something like, “Play something with jazz guitar on Spotify” – unlike in 2019, when I would get a response like, “I’m sorry I can’t do that” – Spotify now opens and plays a jazz piece that contains guitar. But if I ask Siri, “Play something with killer drums on Spotify,” it opens the app and plays a track from an artist’s album that has the exact title “Killer Drums.” We still have a long way to go.
And I still want that music concierge. AI is finally at a point where a high-quality one could be created, with a little effort.
I recently started a session with ChatGPT like this:
Pretend you are a music service. What track or tracks might you offer me if I ask something like "please play something with jazzy guitar"?
Its response was:
If you're looking for a song with jazzy guitar, here are a few popular tracks you might enjoy:
"Take the 'A' Train" by Duke Ellington and his Orchestra
"Mister Magic" by Grover Washington Jr.
"The Girl from Ipanema" by Antonio Carlos Jobim
"Round Midnight" by Thelonious Monk
"Body and Soul" by Django Reinhardt
"St. Thomas" by Sonny Rollins
"All of Me" by Wes Montgomery
"Misty" by Johnny Burke and Erroll Garner
These are just a few examples of the many great jazz tracks featuring guitar that are available.
Now, right away I do need to mention the fact that four of those eight recommendations literally do not contain someone playing the guitar, but I have to say, this feels like we are getting somewhere.
Furthermore, when I responded with:
"St. Thomas" by Sonny Rollins doesn't seem to contain anyone playing the guitar.
ChatGPT followed up with:
My apologies, you are correct. "St. Thomas" by Sonny Rollins is a saxophone-led jazz piece and does not feature the guitar. Thank you for bringing this to my attention.
And then when I typed:
"Take the 'A' Train" by Duke Ellington and his Orchestra also does not contain guitar.
I expected a similar response. However, ChatGPT answered:
I apologize for that mistake. "Take the 'A' Train" by Duke Ellington and his Orchestra is a big band jazz composition and does not feature the guitar. Thank you for correcting me. Here's a corrected list of tracks with jazzy guitar:
And for a moment, I was prepared to be blown away yet again by ChatGPT – was it really going to course-correct and give me some accurate recommendations? And then I got the exact same list of tracks, minus the two I called out for not containing guitar. Now, I could get hung up on these inaccuracies; as many people have highlighted, incorrect information is an issue with ChatGPT. But instead I find this incredibly exciting. Finally, I am having a conversation with a computer about music. It seems like with the right push, we could have real transformative change here.
Listen, I’ve questioned my own sanity before, most recently when I decided to start a podcast, but I know I’m not out to lunch here. Yes, I love my AirPods Pro and I stream music on the go all of the time, and it works, and I do have some magical moments this way. But I know we can do so much better.
Incidentally, there actually is one AI-powered music tool that launched over 20 years ago, and when it was eventually integrated with the smartphone, it was a massive improvement on human music consumption. It’s called Shazam and it changed my life, because not knowing what song is playing drives me absolutely bananas. I’m impressed to this day, when I lay my iPhone down on a table at a loud restaurant, and Shazam still identifies a song playing under the noise in seconds. Apple wisely purchased the company in 2018.
Now let’s get into some of the new generative AI tools and products currently available that deal with music. This is an extremely crowded space, and seems to change by the day. There’s no way I can mention everything. But the tools I go over should give a decent overview. Most of the tools do what the word “generative” implies: they create music or sounds. They usually focus on one of two potential customers: people who need music for their video content, or people who want to create music.
Here’s a sampling of what’s out there right now for generative music:
Beetoven – or perhaps it is pronounced “Beethoven” – though it is spelled b-e-a-t-o-v-e-n – I can appreciate what they are going for there – is a tool that creates customizable royalty-free music intended for video. It looks like you sort of mix and match from a wide variety of presets, genres, and moods, and then music is generated that you can customize even further. There are other tools called Soundful, Soundraw, and Musico which appear to work in a similar way. In fact, there seem to be new companies and tools of this nature popping up every minute.
There are also text-to-music generative tools, like Riffusion, that accept prompts like “bongos on a street in Havana,” “grunge but avant garde,” or “angry rap” and the results are... interesting. (play music)
And there are tools that create just one instrument, like Emergent Drums, which generates unique drum samples, all royalty-free. The product is actually a plugin that works with a ton of different Digital Audio Workstations, or DAWs, and right now it is invite-only. The cost is $149 but this does get you lifetime access.
There’s some good stuff out there for musicians too. Chord AI listens to a song, and in real time breaks down what key the piece is in, and the exact chords used. They even have mobile apps.
LALAL.AI takes audio or video that you upload, and then isolates and extracts vocals, or basically any other accompaniment or instrument you like. Perfect for learning how to play a song along with the official version, without having to compete with the part you are learning.
And of course someone made an AI music “co-writer” for musicians and lyricists – it’s called Staccato and it’s meant to reduce writer’s block and keep the composer in a constant state of creative flow. The musician inputs whatever they are currently composing, and Staccato suggests additional music or lyrics based on that content.
For the music consumer, the interestingly named Maroofy takes one song you enter, and then suggests dozens of similar sounding songs. I guess if you are the type of person that wears out a new track you like by listening to it over and over and over again, this could be helpful. And here’s something that I think many people will find useful. I can’t be the only one that is underwhelmed with the various playlists Apple and Spotify surface and suggest to me. Well, Playlist AI is a free tool and mobile app that creates playlists from text prompts, images, videos, and your most-listened-to music. You can say, "Early 2000's pop music" or "Playing board games on a rainy day" and it will create a playlist for you. Not bad!
Again, this is just a representative sample of the tools currently on the market. And a bunch more probably launched after I recorded this. But you get the idea: lots of “generate royalty-free music for your videos” tools, some cool tools for musicians, and even some things to improve the life of the music consumer.
Thank you as always for listening. If you like what you heard, please consider subscribing, leaving a review, or forwarding to a friend. I’m going to go back to chatting with ChatGPT about music. If I can just teach it a few things and figure out how to hook it into a music service, I’ll finally have the AI musical companion I’ve been dreaming about for years. Bye for now!