What are your thoughts on an on-device model that "fact-checks/determines bias" on comments and posts?
I've had a few pet projects in the past around RSS aggregation/news reading that could fact-check the sources/article while you read, and also determine bias from the grammar and linguistic patterns the journalist used. The same could be applied to comments.
I wonder whether such a feature would have value in a reader app for Lemmy? I feel a definitive score is toxic. But if it were to simply display the variables to look out for, it could help you make an objective decision yourself.
Another application of this would be pulling out just the objective statements in an article for faster reading.
Apart from what @AlataOrange has said, I think your "on-device model" would die of overload in its first 5 minutes of operation. Most comments are biased. Everyone has an agenda, whether they are conscious of it or not. If I want factual things, I'll read the factual things elsewhere on the internet. If I want some buttery popcorn, I'll microwave some and read the comments.
I guess it would only help for reading articles, if anything. Or for a comment with an informing tone, such as "Actually, this is so-and-so because of so-and-so." But I see your point.
So, your software would go to the link provided (if there is one) and scan the text of the article for language that sounds biased. This is an interesting exercise in computer programming, but it wouldn't be useful. Imagine the biased reaction of the user who wants, or does not want, the article to be judged "biased" by a computer program. I can just hear people muttering to themselves, "damn algorithm."

This is something software is getting better at, but it's still not reliable. Take, for example, some software from my field: the kind that detects plagiarism. When I get student papers, I have to run them through the plagiarism detector, then inspect the ones flagged as "potential plagiarism." I've had to use this type of software for over a decade, and it's still problematic. I've had situations in which I found the plagiarism and the software did not, and countless situations in which the software flagged plagiarism where there was none.

So, I don't know; your goals as a computer scientist are lofty. Still, I want you to keep your bias-detecting software away from my day-to-day reading. Human beings either have the reading skills and the knowledge of where to get the facts, or they do not. If they are ignorant enough to need a computer program to judge for them, they will question the software's judgment anyway, whether it's right or wrong. Why? Everybody's got an agenda.
I guess it's less like the standard "AI" you may be thinking of, which simply takes an input and produces an output; it has multiple preprocessing steps before detection, and post-processing afterward. For instance: parse an article into sentences and flag subjective statements such as "I feel great about XYZ," while searching for statements that back up such claims with data, e.g. the standard "Claim, Lead-in, Data, Warrant" structure in writing. Then check the data source recursively until it is found to be, in fact, valid.

Now, this notion of "validity" is threatening, because yes, it can be controlled. But there can definitely be transparent, community-led approaches to adjusting which sources are considered valid. With limited resources, an initial solution would be building a person graph of the sources' authors, and/or mapping against a database of verifiable research repositories such as JSTOR, finding linked papers mentioning the same claims, or simply following a trail of links until one hits a trusted domain.
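A minimal sketch of those preprocessing steps, assuming a toy subjectivity lexicon and a pre-built map of which page cites which. Everything here is illustrative: `SUBJECTIVE_MARKERS`, `TRUSTED_DOMAINS`, and `link_graph` are placeholder stand-ins, and a real system would use a trained sentence classifier and actually fetch pages rather than read from a dictionary.

```python
import re
from urllib.parse import urlparse

# Toy lexicon of subjectivity markers; a real system would use a trained
# sentence-level classifier, not keyword matching.
SUBJECTIVE_MARKERS = {"i feel", "i think", "i believe", "clearly", "obviously"}

# Community-adjustable allowlist of trusted domains (hypothetical entries).
TRUSTED_DOMAINS = {"jstor.org", "arxiv.org", "nature.com"}

def split_sentences(text: str) -> list[str]:
    # Naive splitter on end punctuation; good enough for a sketch.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

def flag_subjective(sentences: list[str]) -> list[str]:
    # Flag sentences containing any subjectivity marker.
    return [s for s in sentences
            if any(m in s.lower() for m in SUBJECTIVE_MARKERS)]

def follows_to_trusted(start_url: str, link_graph: dict, max_hops: int = 5) -> bool:
    """Walk a chain of citation links until one lands on a trusted domain.

    `link_graph` maps a URL to the URL it cites; in a real system this
    would come from fetching and parsing each page."""
    url, hops = start_url, 0
    while url and hops <= max_hops:
        domain = urlparse(url).netloc.removeprefix("www.")
        if domain in TRUSTED_DOMAINS:
            return True
        url = link_graph.get(url)
        hops += 1
    return False
```

The recursive "check the source of the source" step becomes a bounded link walk here, so a citation loop or a dead end simply fails closed instead of hanging.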
Then there is also the case where all the sources are heavily weighted onto one side of the equation, even though the topic clearly admits valid devil's-advocate arguments. This is where bias can come in. Post-processing would mean finding possible counter-arguments to the claims and warrants (if available in the store of verifiable sources). The point is not to force a point, but to open the reader's paradigm.
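That post-processing step could be sketched like this, assuming a toy store of `(topic, stance, url)` records. The `SOURCE_STORE` contents, stance labels, and URLs are all hypothetical; a real system would retrieve related sources by semantic similarity rather than exact topic keys.

```python
from collections import Counter

# Hypothetical store of verified sources: (topic, stance, url).
SOURCE_STORE = [
    ("remote-work", "pro", "https://a.example/study1"),
    ("remote-work", "pro", "https://b.example/study2"),
    ("remote-work", "con", "https://c.example/critique"),
]

def counter_arguments(topic: str, cited_urls: set, store=SOURCE_STORE) -> list:
    """Return stored sources whose stance opposes the majority stance of
    what the article cited, surfacing the under-represented side."""
    cited = [(t, s, u) for (t, s, u) in store
             if t == topic and u in cited_urls]
    if not cited:
        return []
    # Majority stance among the article's own citations.
    majority = Counter(s for _, s, _ in cited).most_common(1)[0][0]
    return [u for (t, s, u) in store
            if t == topic and s != majority and u not in cited_urls]
```

If an article on a topic cites only "pro" sources, the function surfaces the stored "con" sources without scoring the article itself, which matches the "show the variables, not a verdict" framing above.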
I see how using "fact-checking" in my OP came across as pretty negative/controversial. But there is no attempt on my part, as a computer scientist, to impose control over what is "morally right" or what the "Capital-T Truth" is. I strongly agree that computer ethics needs to be a focus, and your perspective is a great take to keep in mind. The passion is mostly driven by the black-and-white culture of online opinions, hence your point about agendas.
That last paragraph is where you've exposed your whole issue. If you would please repeat any philosophy classes you didn't pay attention in: the "Capital T truth" has been examined since Socrates. What is this "Capital T truth" you are looking for in journalism, or in any text? You seem to think it exists. How do you know it exists?

I honestly think a small crash course in philosophy will help you sort out your goals in creating this product. Because, yeah, you're looking at it as a product. And you feel like there is only one "Capital T truth." Socrates searched for it. Plato, his student, wrote about it. Everybody likes it. Now you're saying a computer program can just cut through all the red tape of bullshit and find the "Capital T truth," just like that. Tell us what the real truth is. Come on. Get yourself onto other product ideas. The truth is out there, dude. Don't you know any X-Files? Or is your name HAL? LOL
I said I don't... and I said the goal isn't to find it, but to provide the reader with the data points to do so on their own. Like I said in the OP: "I feel a definitive score is toxic. But if it were to simply display the variables to look out for, it could help you make an objective decision yourself."
OK, I understand you. Whether you say you don't or you do is sort of irrelevant. See what I mean? Pay attention in the philosophy classes. Figure out what the truth is for you and leave that shit alone. On to the next programming project! I'm sure you can program some really cool shit.
Sure, I will. But I'll wait for more perspectives before I move on to the next. It would be a major mistake to continue on this alone; the idea is to have a team that compensates for the flaws you're pointing out.
Anyway, I'd like to say we're kind of agreeing; I'm not sure what caused the aggression. I do think of things in a product sense, but that's a byproduct (no pun intended) of my learning environment. If we're talking philosophy, I should definitely read up some more. My understanding of the "Capital-T truth" came mostly from David Foster Wallace's "This Is Water." I'll expand on it and circle back to improve my writing so it communicates my thoughts better.
No aggression. I'm just being sort of aggressively nice about telling you that your software idea is neat but not really a good idea. You are the asker, by the way. I'm just being generous and answering. No hard feelings.
Figure out what the truth is for you and leave that shit alone.
This actually got me thinking quite a bit, and I was hoping you'd expand on it. Is it more about not building things that are driven by a personal truth?
It's more about how you can create a computer program to see an absolute truth when that absolute truth is itself called into question. Your computer program is limited by you and your mind. That's why, in the end, people are trying to create artificial intelligence and not getting it right, and perhaps they won't. Your idea of truth may work for you in your life, personally. Does it work for everyone else in the world? Again, philosophy is a nice thing to learn about; it really helps. Cheap science fiction movies also help, since they always sort of get at what I'm saying.

On the bright side, according to the headlines, artificial intelligence is quite capable of making itself smart and then stupid. So it is capable of making itself dumber. Why is that? What truth did it find? What does it know that we do not? That's the problem, you see. Perhaps it knows nothing. Or perhaps it doesn't care. Or perhaps it sees that a simple math problem can be answered wrong and it doesn't care.

That's where we are with this thing you want to create that reads for us. I can find plagiarism the software can't; the software sometimes finds plagiarism, and most of the time it's wrong. That's where we are. If you can find a way to make it better, that's great. But don't make it about computers reading things in place of humans. That's not possible right now. In the future, maybe humans and software will be able to read things similarly. In the place of? I don't know, man. My next-door neighbor can read. Do I want my next-door neighbor reading things for me? No, not really. I barely know my next-door neighbor. I prefer to read things myself.
It's not a bad idea; it just needs fine-tuning. Think about the plagiarism software I'm talking about. It's utterly useless most of the time, but it isn't a bad thing. At the same time, it is indeed a bad thing. Think about what would happen to my students if I were lazy. I could just not read the papers flagged as plagiarized and have those students expelled from the university, saving myself some work and screwing over people in my class. I could actually use the plagiarism detector's results as my evidence, have the student thrown out, and never have to read any of their writing. It would save me so much time. I could have more time for me. I could drink too much. Try out some new fancy illegal drugs. Party at the club. Lose my job.