“The basic point is that [the AI companies’] model requires a vast corpus of sound recordings in order to output synthetic music files that are convincing imitations of human music,” the suits alleged. “Because of their sheer popularity and exposure, the Copyrighted Recordings had to be included within Suno’s training data for Suno’s model to be successful at creating the desired human-sounding outputs.”
Nope, there are plenty of other ways for an AI to have arrived at similar notes. Say you have Song A, written by Steve. Steve grew up listening to a lot of John, who wrote songs B through Z. Steve spent his childhood listening to and being influenced by John, so when Steve eventually grows up to write Song A, it's entirely possible for it to contain elements of songs B through Z. So if an AI trains on Steve's music, it's going to consequently pick up whatever habits Steve learned from John.
Just like how you picked up some habits from your parents, which they picked up from their parents... etc. You could develop a habit that started with an ancestor you've never met; who are you copying?
Of course there are other ways to create similar notes.
But now the AI developers will have to testify under oath that they did not use "Johnny B. Goode," and identify the soundalike song they did use that is not among the millions of other recordings whose rights are held by RIAA members.
I feel that this logic follows a common misconception about generative AI. Its output isn't assembled from the training data. It takes inspiration from it, but it doesn't just mix and match samples from the training material. Generative AI uses a set of learned parameters that it builds from that training data, but the data itself isn't directly referenced during generation.
The way AI generates content isn't like when Vanilla Ice sampled "Under Pressure"; it would be more like if Vanilla Ice had talent and could actually write music, and had accidentally written the same bass line without ever hearing Queen. While unlikely, it's still possible, and I'm sure we've all experienced a similar situation: e.g. you open a comment thread to post a joke based on the headline and see the top comment is already the exact same joke you were going to make. You didn't copy the other user, and they didn't copy you, but you both likely share a similar experience that triggers the same associations.
For the same reasons that two different writers can accidentally tell the same story, or two different comedians can write the same joke, two different musicians can write the same melodies if they have shared inspirations. In all of those instances, both parties can create entirely original material of their own accord, even if it isn't meaningfully distinct from the other's. The way generative AI works isn't significantly different, which is why this is such a legally murky situation. If generative AI were more rudimentary and were actually sampling the training data, it would be an open-and-shut copyright infringement case. But, because the material the AI produces is an original creation of its own, we get into this situation where we have to argue over where to draw the line between "inspiration" and "replication".
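A toy illustration of that distinction (this is a deliberately crude sketch, not how Suno or any real music model actually works): even a first-order Markov chain stores only transition statistics learned from its training melodies. The melodies themselves are never kept or replayed, yet the output can still resemble them.

```python
import random
from collections import defaultdict

# Toy sketch only: a first-order Markov chain over note names.
# After training, the model holds transition counts -- the training
# melodies themselves are never stored or sampled from directly.

def train(melodies):
    transitions = defaultdict(lambda: defaultdict(int))
    for melody in melodies:
        for a, b in zip(melody, melody[1:]):
            transitions[a][b] += 1
    return transitions

def generate(transitions, start, length, rng=None):
    rng = rng or random.Random(0)
    note = start
    out = [note]
    for _ in range(length - 1):
        nxt = transitions.get(note)
        if not nxt:
            break
        notes, counts = zip(*nxt.items())
        note = rng.choices(notes, weights=counts)[0]
        out.append(note)
    return out

# Two hypothetical training melodies (note names are arbitrary).
melodies = [["C", "E", "G", "E", "C"], ["C", "E", "G", "A", "G"]]
model = train(melodies)
print(generate(model, "C", 6))
```

The generated sequence is built purely from the learned statistics, which is the (very loose) analogue of the "inspiration, not sampling" point above.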
IMO music copyright has gone too far. Have a look at the chord progression of Radiohead's "Creep": I, III, IV, iv.
There have been a few lawsuits over using the same chord progression, but with music theory there are only so many permutations before you end up arriving at the same logical places. The same goes for melodies laid over this particular progression.
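The "only so many permutations" point can be made with back-of-envelope arithmetic: staying within the seven diatonic chords of a single key, there are only 7^4 = 2,401 possible four-chord progressions, and far fewer that sound pleasant, so collisions between independent songwriters are inevitable.

```python
# Back-of-envelope count: four-chord progressions drawn from the
# seven diatonic chords of one key (ignoring inversions, borrowed
# chords, rhythm, etc. -- a deliberate simplification).
diatonic_chords = 7
progression_length = 4
total = diatonic_chords ** progression_length
print(total)  # 2401
```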
You’re equating a human listening, learning, and getting inspired by other sources with what an LLM does as part of model building. I don’t think the two are the same. Look at copyright law as it stands right now: an AI can’t even hold a copyright.
We can’t treat AI with the same legal protections as humans.
I don't even know why you'd have to sample other songs to generate good music from an algorithm. Just have it randomly play "the four chords" and it is guaranteed to make something that sounds good.