Ok.. so from now on .. when I see a "repackaged" Microsoft product that for some reason.. which I don't care to know... doesn't ask for a payment.. I can use it without restrictions ?!! that's really nice of you Microsoft ... thank you.
Which is why I boycott as hard as I can every service this evil corporation provides (migrate your MS GitHub project away now so I can delete this account too)
Even my coworkers who are complete idiots with technology, who actively sabotage themselves every time they touch any piece of hardware and software, have soured entirely on nearly every Microsoft product across the board.
It's funny how quickly people change their minds when they don't understand the technology on a deeper level. It's just: "this is frustrating now I hate it" and no further thought.
No officer, this is not a pirated movie. It's generated by an AI model I created and trained with data from the internet and the fact that it's 99% identical to an existing movie is irrelevant.
Also, in 2022, several unidentified developers sued OpenAI and GitHub based on claims that the organizations used publicly posted programming code to train generative models in violation of software licensing terms.
They can argue about it not being a copy all they want. If there is a single GPL licensed line of code scraped then anything they produce is a derivative work & must be licensed GPL.
And if so, I assume “derivative” will still be subjective to some degree. Where do we draw the line between derivative and non-derivative?
I’m torn about my personal opinion about copyrights and software licensing in general. I think the main problem is the huge power imbalance between people and corporations, not so much the fact a company analyzed a bunch of available data to solve programming problems.
They don’t copy the data and sell it verbatim to others which would be a legal issue and in my mind also a moral issue, as they don’t add any additional value.
2: Normally derivative works are patched or modified versions of the original. I think the common English meaning would apply & ChatGPT et al are fucked. I doubt there is a precedent for this yet.
The only way I can see them weaseling out of this is by keeping the program running the model made in-house and proprietary while releasing the model in a format unusable without the base (proprietary) program. But maybe the GPL forbids such obfuscation efforts (I don't know, I haven't studied it in detail).
Yeah, but anything you create automatically has a copyright, so for example this comment is not in the public domain. Its use is limited to the context I am using it in; that is, I expect it to be copied for federation purposes, but I wouldn't say that AI is covered in this context, just genuine readership, moderation, and bots that are 'part of the community'.
At least that's the EU stance afaik. Like if I saw this comment on a billboard somewhere I'd see that as a clear breach of copyright and even privacy.
I'm fine with that, but let's put some rules against this.
Any AI models should be able to determine the source of their data to a defined level of accuracy.
There should be a well-defined way to block data from being used by AI. If one of these ways (e.g. robots.txt) has been breached, the model has to be rebuilt without the data, and reparations made to the content owners.
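For what it's worth, the robots.txt part of this can already be checked mechanically. A minimal sketch using Python's standard `urllib.robotparser`; the crawler name `ExampleAIBot`, the URLs, and the `may_collect` helper are made up for illustration, and nothing here says any real crawler works this way:

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: blocks one (made-up) AI crawler, allows everyone else.
ROBOTS_TXT = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Allow: /
"""


def may_collect(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network access is needed here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    url = "https://example.org/post/1"
    print("ExampleAIBot allowed:", may_collect(ROBOTS_TXT, "ExampleAIBot", url))
    print("SomeBrowser allowed:", may_collect(ROBOTS_TXT, "SomeBrowser", url))
```

This only shows the check is cheap to run before collecting a page. robots.txt is a voluntary convention, so it does nothing by itself against a crawler that ignores it, which is why the rule above would need enforcement behind it.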
A neural network is basically nothing more than a set of weights. If one word makes a weight go up by 0.0001 and then another word makes it go down by 0.0001, and you do that billions of times for billions of weights, how do you determine what in the data created those weights? Every single thing that's in the training data had some kind of effect on everything else.
It's like combining billions of buckets of water together in a pool and then taking out 1 cup from that and trying to figure out which buckets contributed to that cup. It doesn't make any sense.
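To make the "everything affects everything" point concrete, here is a toy sketch: a two-weight linear model trained with plain per-example gradient descent. The data, learning rate, and function name are invented for illustration, and a real network has billions of weights rather than two:

```python
def train(examples, epochs=500, lr=0.05):
    """Fit y ~ w[0]*x[0] + w[1]*x[1] with per-example gradient steps."""
    w = [0.0, 0.0]
    for _ in range(epochs):
        for x, y in examples:
            err = (w[0] * x[0] + w[1] * x[1]) - y
            # Every example nudges both of the shared weights.
            w[0] -= lr * err * x[0]
            w[1] -= lr * err * x[1]
    return w


# Made-up training set; the third example is slightly "off" from the others.
DATA = [
    ((1.0, 2.0), 5.0),
    ((2.0, 1.0), 4.0),
    ((1.0, 1.0), 3.5),
    ((3.0, 2.0), 7.0),
]

full = train(DATA)
without_third = train(DATA[:2] + DATA[3:])

if __name__ == "__main__":
    print("all examples:        ", full)
    print("third example removed:", without_third)
```

Dropping a single example shifts both weights, and nothing stored in the weights says which example caused which part of the shift. Note that the comparison here works by retraining from scratch without the example, which is cheap for a toy and very expensive for a large model.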
Respectfully, I worked for Alexa AI on compositional ML, and we were largely able to do exactly this with customer utterances, so to say it is impossible is simply not true. Many companies have to have some degree of ability to remove troublesome data, and while tracing data inside a model is rather difficult (historically it would be done during the building of datasets or measured at evaluation time) it's definitely something that most big tech companies will do.
It’s not impossible lol. All a company would need to do is keep track of where they were getting content. If I use a script to download as much of the internet as possible and end up with a bunch of copyrighted content I could still get in trouble, hell there was even a guy arrested for downloading JSTOR without authorization. Stop letting these guys get away with crimes just because you like the idea of the end product.
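"Keep track of where they were getting content" could be as simple as storing the source next to each item at collection time. A rough sketch; the `Document` fields, the `drop_sources` helper, and the example URLs are all hypothetical, not a description of how any company actually does it:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Document:
    text: str
    source_url: str  # recorded when the item is collected
    license: str     # as stated by the source, e.g. "GPL-3.0" or "unknown"


def drop_sources(corpus, blocked_prefixes):
    """Return the corpus without documents whose URL starts with a blocked prefix."""
    blocked = tuple(blocked_prefixes)
    return [doc for doc in corpus if not doc.source_url.startswith(blocked)]


# Made-up corpus for illustration.
CORPUS = [
    Document("some forum comment", "https://forum.example/post/1", "unknown"),
    Document("a code snippet", "https://code.example/repo/file.py", "GPL-3.0"),
    Document("a blog post", "https://blog.example/entry", "CC-BY-4.0"),
]

if __name__ == "__main__":
    cleaned = drop_sources(CORPUS, ["https://forum.example/"])
    for doc in cleaned:
        print(doc.source_url, doc.license)
```

This handles the dataset side only: it tells you what to leave out of the next training run, not how to remove the influence of data from a model that was already trained on it.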
Sure thing...now GPL/Creative Commons all your code involved in any way for your models, documentation, parameters, data sets, and allow full unlimited integration and modification by any parties to any portion of it.
Man it's crazy how these fuckers basically get to ignore copyright law whenever it's inconvenient to them but if you have one too many Windows machines provisioned they'll send the Spanish Inquisition after you.
The social contract? Tf. The social contract still required attribution in almost all cases for creative work unless explicitly stated otherwise, especially in the case of commercial products like ChatGPT, so I don’t know where this joker is getting his ideas.
I went into a smidge more detail over on my Mastodon last night, but my response is summed up as “WTAF? No! Freeware is an explicit license, as anyone from the BBS days will recall.”
Would you mind sharing a link to it here if it's not any trouble? (Or your handle if that's easier for you) I'm always looking for new stuff to check out and new people to follow on Mastodon