Any word on the final legislation's treatment of free and open source models? At the drafting stage, there were warnings that the requirements would basically shut out FOSS projects, thereby entrenching proprietary models from tech giants. Later on, there was talk about possibly adding carve-outs to protect FOSS, but I couldn't find the details.
There are some carve-outs for FOSS, but they only go so far. The biggest problem is that the copyright lobby got in a body blow. It won't be enough to make the IP fanatics of Lemmy happy, but it'll make some people money.
Model makers must have a policy in place to ensure compliance with the EU copyright directive. In practice, that means websites can use a machine-readable opt-out from AI training, and model makers have to respect it. I think this is a big reason why we're now hearing about these deals with Reddit and other copyright holders.
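To make the opt-out a bit more concrete: as far as I know, the directive doesn't name a format, but robots.txt is one common way sites express it. Here's a minimal sketch, assuming robots.txt is the mechanism. The crawler name `ExampleAIBot` is made up, and `may_train_on` is just an illustrative helper, not anything defined by the act.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: opts out of one (made-up) AI crawler,
# but leaves the site open to every other user agent.
ROBOTS_TXT = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Allow: /
"""


def may_train_on(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt lets user_agent fetch url.

    Assumption: a robots.txt disallow is treated as the site's
    machine-readable opt-out. Whether that is legally sufficient
    is up to the AI office and the courts, not this function.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    page = "https://example.com/post/1"
    print(may_train_on(ROBOTS_TXT, "ExampleAIBot", page))
    print(may_train_on(ROBOTS_TXT, "SomeSearchBot", page))
```

A compliance policy would presumably run a check like this before adding a page to a training set, and keep a record of the result.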
Also, model makers must create a summary of the copyrighted training data used. The AI office, which is supposed to enforce this act, is to provide a template for a sufficiently detailed summary. A lot will depend on the AI office and the courts.
A couple of problems are obvious. Many enthusiasts will not bother with the paperwork. What will that mean for Hugging Face? Or the AI Horde? The EU is certainly the wrong place to build a business around hosting AI models.
The current open models will likely become "illegal". The makers would have to retroactively provide the necessary documentation, but why would they bother? Even if they did, there is still the question of the opt-out policy. Mind you, that doesn't outlaw possession or use. It simply means that businesses may be fined for providing downloads or inference to EU residents.
I think this will likely have a chilling effect on open models. For example, Meta could simply say that "Llama 4", if it were to come, is off-limits in the EU. But that might not be enough to indemnify them. Or they could try to comply, which would cost them money for no clear gain. And/or they'd have to leave out data without regard for quality.
Research institutions are not bound by the opt-out. They might become a source of open models.
The carve-outs also do not apply to so-called high-risk AI systems. That would make sense if the act made sense.
LLMs, image diffusion models, and the like are termed GPAI (general-purpose AI). By default, they are not considered high-risk; they become high-risk only once they are adapted to a high-risk purpose. Of course, the line isn't that clear. Say a teacher uses an LLM to grade tests: regulators might cause problems there.
Thanks for the information! It's pretty distressing that the EU, in its zeal to do the right thing, seems to be protecting the big AI companies from FOSS competition.
This is one of those cases where no one agrees what "the right thing" is. Owners think it's right that they collect rent from their property. Me, I think the wider interests of society take precedence.
When the copyright directive was passed in 2019, there was a lot of opposition to it. The guy who had a lot of say as a (sorta) committee chair back then is the same one who has now overseen the AI act. Few people at the time cared that it regulated AI training. I think the lobbying came mainly from academics who understood that the oppressive IP laws in many EU countries made ML all but illegal. I'm sure that if the copyright industry had foreseen the importance of the AI training provisions, the situation would be much worse for the EU now.
Unfortunately, the people who might argue for the wider interests of society don't have the wherewithal to meaningfully contribute here. Few people know what AI is, and no one knows what it will be in a few years. There is a lot of rubbish in the act that will do more harm than good, in the name of protecting society. But because it is so ill thought out, I doubt it will do much either way. The copyright fanatics were the real damage dealers.