the authors said the takedown reflects Nvidia's having "admitted" it trained NeMo on the dataset
Oooh so they're fucked.
THOUGH... let's say I bought a book. Why am I not allowed to learn from that book and write in a similar style to that book? I am? Well, why can't I train an AI to use that book and have it write in a similar style? I'm not sold on "I must give you permission to use my book to train an AI." Maybe if I agreed to those terms BEFORE buying the book, but it seems odd that someone can bar me from doing that AFTER buying the book. And "we never thought about that" isn't really a good excuse to change the rights of someone who bought the book.
Though if anything, this basically proves the old adage: "Don't tell anyone what's in your AI's training data."
Because if a human does it, it's a huge, labour-intensive task, and that difficulty serves as a very effective filter to stop the world being flooded by copied crap derived from the hard work of a person.
If we let AIs do it, we'd be able to churn out so much of it. Just a never ending torrent of machine-generated copycat garbage, constantly being spewed out, flooding the market, stamping out the actual people that wrote the content in the first place.
Funny enough, the printing press also led to changes for a similar reason - there was very little protection in place for writers before the printing press, because copying their work was such a painstaking task that few people bothered and it wasn't much of an issue. Once there was the technology to quickly and trivially rip off their work and print it mass-scale, IP protections were granted to authors.
While I agree that AI-generated content is annoying and of poor quality, those are hardly reasons to disallow the creation of AI-generated content. If we want to expand IP protections to protect against AI plagiarism, we need to draw a better line than just calling it garbage.
The rulers of old were brutal, theocratic tyrants. Forget all that Disney shit. Think witch burnings. Think medieval torture. Of course, they sought to control the printing press. They cracked down on blasphemy and regime criticism. Maybe they only allowed trusted individuals to operate presses. That also funneled money to cronies. Or maybe they even forbade the printing of anything that had not been approved.
Freedom of the press originally meant that this is not done anymore.
The first copyright was created over a quarter millennium after the printing press came into use in Europe. It lasted 14 years.
Here's a question: If we were to ban free, open source software, would that protect programmers? Of course not. They have the choice to make their products a gift or not. In the same sense, copyright does not protect authors. Without copyright, there would be only the public domain. People would have a choice to make a gift to the public or not to publish.
Copyright does not protect authors. It was supposed to give the public a tool to support authors. The US Constitution gives that as the only acceptable purpose of copyrights and patents. Unfortunately, current copyright derives from a different source. It was created by the tyrannical empires of Europe in the 19th century. Americans like to blame Disney, but all they did was lobby for the US to adopt it.
Once there was the technology to quickly and trivially rip off their work and print it mass-scale, IP protections were granted to authors.
The Statute of Anne had nothing to do with protecting authors. It was about which London printers were allowed to print Shakespeare's work long after he was dead.
Humans are not generally allowed to do what AI is doing! You talk about copying someone else's "style" because you know that "style" is not protected by copyright, but that is a false equivalence. An AI is not copying "style", but rather every discernible pattern of its input. It is just as likely to copy Walt Disney's drawing style as it is to copy the design of Mickey Mouse. We've seen countless examples of AIs copying characters, verbatim passages of text and snippets of code. Imagine if a person copied Mickey Mouse's character design and got sued for copyright infringement. Then they go to court and their defense is that they downloaded copies of the original works without permission and studied them for the sole purpose of imitating them. They would be admitting that every perceived similarity is intentional. Do you think they would not be found guilty of copyright infringement? And AI is this example taken to the extreme. It's not just creating something similar, it is by design trying to maximize the similarity of its output to its training data. It is being the least creative that is mathematically possible. The AI's only trick is that it threw so much stuff into its mixer of training data that you can't generally trace the output to a specific input. But the math is clear. And while it's obvious that no sane person will use a copy of Mickey Mouse just because an AI produced it, the same cannot be said for characters from lesser-known works, passages from obscure books, and code snippets from small free software projects.
In addition to the above, we allow humans to engage in potentially harmful behavior for various reasons that do not apply to AIs.
"Innocent until proven guilty" is fundamental to our justice systems. The same does not apply to inanimate objects. E.g., a firearm is restricted because of the danger it poses even if it has not been used to shoot someone. A person is only liable for the damage they have caused, never their potential to cause it.
We care about people's well-being. We would not ban people from enjoying art just because they might copy it, because that would be sacrificing too much. However, no harm is done to an AI when it is prevented from being trained, because an AI is not a person with feelings.
Human behavior is complex and hard to control. A person might unintentionally copy protected elements of works when being influenced by them, but that's hard to tell in most cases. An AI has the sole purpose of copying patterns with no other input.
For all of the above reasons, we choose to err on the side of caution when restricting human behavior, but we have no reason to do the same for AIs, or anything inanimate.
In summary, we do not allow humans to do what AIs are doing now and even if we did, that would not be a good argument against AI regulation.
Only the very early version, "Steamboat Willie". And only insofar as it is not a trademark. You can, however, perfectly legally use copyrighted characters. E.g., the copyrighted Mickey Mouse appears multiple times in South Park. Copyright ideologues always try to make people forget that. They want every human thought and every idea to be owned as "property" and used to extract rent money.