Digesting the "Child Safety on Federated Social Media" Report

An in-depth report reveals an ugly truth about isolated, unmoderated parts of the Fediverse. It's a solvable problem, with challenges.

11 comments
  • Wait until the people who wrote the report learn about the 4chan random board

    This isn't a problem with the Fediverse. It's a problem with people who are OK with this stuff hosting their own servers. Real CSAM is a quick Torch (or even Google, since this stuff is on the clearnet too) search away, and was even before the Fediverse.

    Anyway, just join an instance that blocks this.

    • I agree that the problem isn't with the Fediverse itself, any more than it is with email, Usenet, encrypted messengers, etc.

      The thing is, it's a problem that affects the network. While "block and move on" is a reasonable strategy for getting that crap out of your own instance's feeds, the real meat and potatoes of the issue has to do with legal and legislative repercussions. In most jurisdictions, if an admin comes across this stuff, they have a legal obligation to report it. In fact, the EARN IT and STOP CSAM acts that politicians are trying to push through Congress are likely to make companies overreact to any potential penalty that could come from accidental cross-pollination of CSAM between servers.

      Unfortunately, this becomes a whole lot messier when an instance discovers cached CSAM after the fact. A Mastodon instance was recently taken down without the admin being given any turnaround time to look into it; the hosting company was simply ordered to comply with a CSAM report that basically said, "This server has child porn on it."

      Also, regardless of whether you report it or block it and pretend you never saw anything, that doesn't change the fact that it's still happening. At the very least, having tooling to make the reporting easier would probably be a big boon to knocking those servers off the network.

  • I wonder what kind of computing resources that Microsoft service needs. Isn't it essentially just a set of hashes? My point being that centralization does not have to be an issue.

    • It's a bit of an unknown, since the service is a proprietary black box. With that being said, my guess:

      • A database with perceptual hash data for volumes and volumes of CSAM
      • Means to generate new hashes from media
      • Infrastructure for adding and auditing more of it
      • A REST API for hash comparisons and reporting
      • Integration for pushing reports to NCMEC and law enforcement

      None of those things are impossible or out of reach... but collecting a new database of hashes is challenging. Where do you get it from? How is it stored? Do you allow the public to access the hash data directly, or do you keep it secret like all the other solutions do?
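
      Just to make the comparison step concrete, here's a rough sketch of what fuzzy hash matching could look like. This is my own illustration, not how PhotoDNA actually works internally (that part is proprietary); it assumes 64-bit perceptual hashes stored as hex strings and leaves the hashing algorithm itself out of scope.

      # Sketch: fuzzy-matching a 64-bit perceptual hash against a known-hash set.
      # The hashing algorithm and the hash database are placeholders.
      def hamming_distance(hash_a: str, hash_b: str) -> int:
          """Number of differing bits between two hex-encoded hashes."""
          return bin(int(hash_a, 16) ^ int(hash_b, 16)).count("1")

      def find_matches(candidate: str, known_hashes: set[str], threshold: int = 10) -> list[str]:
          """Known hashes within `threshold` bits of the candidate.
          Perceptual hashes tolerate small edits, so matching is fuzzy,
          not exact equality like a cryptographic hash."""
          return [h for h in known_hashes if hamming_distance(candidate, h) <= threshold]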

      I'm imagining a solution where servers aggregate all of this data up to a dispatch platform like the one described above, possibly run by a non-profit or NGO, which then dispatches the data to NCMEC directly.
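
      Purely hypothetical, but the record an instance pushes up to a dispatch platform like that could be as small as this (every field name here is invented for illustration, not taken from any real API):

      import json
      from datetime import datetime, timezone

      # Hypothetical report record an instance might push to a dispatch platform.
      report = {
          "reporting_instance": "example.social",
          "perceptual_hash": "a3f1c9e2b4d07781",  # hash only, never the media itself
          "hash_algorithm": "phash-64",
          "detected_at": datetime.now(timezone.utc).isoformat(),
          "action_taken": "removed_and_quarantined",
      }
      print(json.dumps(report, indent=2))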

      The other thing to keep in mind is that solutions like PhotoDNA operate at a HUGE scale. I'm talking hundreds of thousands of pieces of reported media per year. It's something that would require a lot of uptime and the ability to handle a very high volume of requests every day.

      • Thanks for the thought you put into your answer.

        I've been thinking: CSAM is just one of the many problems communities face. For example, YouTube is unable to moderate transphobia properly, which has significant consequences as well.

        Let's say we had an ideal federated copy of the existing system. It would still not detect many other types of antisocial behavior. All I'm saying is that the existing approach by M$ feels a bit like it's based on moral tunnel vision, trying to solve complex human social issues with some kind of silver bullet. It lacks nuance, when this is really a community management issue.

        Honestly, I feel it's really a matter of having manageable communities with strong moderation, plus the ability to report anonymously in case someone becomes involved in something bad and wants out.

        Thoughts?

    • IMO the hardest part is the legal side, and frankly I'm not sure how MS skirted that issue other than through lax US enforcement on corporations. To build a DB like this, one must store material that is ordinarily illegal to store. Because perceptual hashes are imperfect, and because the algorithm may be updated, I don't think you can get away with storing only the hash of each file. Some kind of computer vision/AI solution might work out, but I wouldn't want to be the person compiling that training set...
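
      To illustrate why storing exact file hashes alone won't cut it (part of what makes the database question so thorny), here's a quick standard-library-only sketch:

      import hashlib

      # Two byte strings standing in for an image and a trivially altered copy.
      original = bytes(range(256)) * 64
      recompressed = bytearray(original)
      recompressed[0] ^= 0x01  # flip a single bit

      # Cryptographic hashes change completely on any edit, so an exact-match
      # database misses near-duplicates entirely.
      print(hashlib.sha256(original).hexdigest())
      print(hashlib.sha256(bytes(recompressed)).hexdigest())

      # A perceptual hash would shift by only a few bits here, but that means
      # matching needs a distance threshold -- and if the hashing algorithm is
      # ever updated, old hashes can't be regenerated without the original files.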

      • Perhaps the manual reporting tool is enough? That content could then be forwarded to the central MS service. I wonder if that API can report back whether it's a positive match.

        Can you elaborate on the hash problem?

        Personally, I was thinking of generating a federated hash set based on user reporting, perhaps enhanced by checking against the central service mentioned above. This DB could then be synced with trusted instances.
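
        To make that concrete, a synced entry probably wouldn't need to carry much more than something like this (all field names hypothetical):

        from dataclasses import dataclass

        # Hypothetical shape of one entry in a user-report-driven, federated
        # hash set that trusted instances sync between themselves.
        @dataclass(frozen=True)
        class SharedHashEntry:
            perceptual_hash: str        # hex-encoded hash, never the media itself
            hash_algorithm: str         # e.g. "phash-64"
            origin_instance: str        # instance that first reported it
            report_count: int           # independent user reports backing it
            confirmed_by_central: bool  # whether the central service agreed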

    • @xilliah @deadsuperhero Jeddah... dears... you will learn this one day 🥴
