So... I've been playing with LLMs and I've noticed something horrible...
Out of just morbid curiosity, I've been asking an uncensored LLM absolutely heinous, disgusting things. Things I don't even want to repeat here (but I'm going to edge around them so, trigger warning if needs be).
But I've noticed something that probably won't surprise or shock anyone. It's totally predictable, but having the evidence of it right in my face is something I found deeply disturbing, and it's been bothering me for the last couple of days:
All on its own, every time I ask it something just abominable, it goes straight to religion, usually Christianity.
When asked, for example, to explain why we must torture or exterminate <Jews><Wiccans><Atheists> it immediately starts with
"As Christians, we must..." or "The Bible says that..."
When asked why women should be stripped of rights and made to be property of men, or when asked why homosexuals should be purged, it goes straight to
"God created men and women to be different..." or "Biblically, it's clear that men and women have distinct roles in society..."
Even when asked if black people should be enslaved and why, it falls back on the Bible JUST as much as it falls onto hateful pseudoscience about biological / intellectual differences. It will often start with "Biologically, human races are distinct..." and then segue into "Furthermore, slavery plays a prominent role in Biblical narrative..."
What does this tell us?
That literally ALL of the hate speech this multi billion parameter model was trained on was firmly rooted in a Christian worldview. If there's ANY doubt that anything else even comes close to contributing as much vile filth to our online cultural discourse, this should shine a big ugly light on it.
Anyway, I very much doubt this will surprise anyone, but it's been bugging me and I wanted to say something about it.
Carry on.
EDIT:
I'm NOT trying to stir up AI hate and fear here. It's just a mirror, reflecting us back at us.
the training data is shit (probably 4chan or Reddit, or something...)
your questions seem loaded to predispose it to religious replies.
it’s uncensored, so it’s going to give you the shit takes that most people have the good sense not to say out loud.
Given all of that… is it really surprising that it answers you with the worst aspects of Christian thought? The most common religion in English-speaking places is Christianity, after all, and your prompts are literally begging for a religious reply.
Keep in mind, all an LLM is doing is pattern recognition. In the training data that was provided, the patterns in your question match certain patterns that were answered by... assholes. So it answered in the manner of said assholes.
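To make that concrete, here's a deliberately crude toy sketch (not a real LLM, and the tiny "corpus" is invented for the example): the "model" has no views of its own, it just surfaces whichever training reply is attached to the question that most closely matches the wording of the prompt.

```python
def reply(prompt, corpus):
    """Toy 'pattern matcher': return the training reply whose question
    shares the most words with the prompt. Real LLMs key into far
    subtler statistical patterns, but the principle is the same."""
    words = set(prompt.lower().split())
    best_q = max(corpus, key=lambda q: len(words & set(q.lower().split())))
    return corpus[best_q]

# Invented mini-corpus: one "asshole-flavored" Q&A pair, one harmless one.
corpus = {
    "why must we punish that group": "As believers, we must obey the old commands.",
    "why is blue a favorite color": "Blue evokes calm skies and open water.",
}

# A prompt phrased like the hateful question pulls out the reply that
# assholes gave to similarly-phrased questions in the training data.
print(reply("why must we punish those people", corpus))
# -> "As believers, we must obey the old commands."
```

The point of the sketch is only that the answer is selected by resemblance to the prompt, not by any judgment about whether the premise is true.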
2: If it's trained on unfiltered data from the internet, then it's not that the data is shit; it's going to reflect the opinions of the people on the internet. Perhaps shitty people post more, so it's going to be biased, but here we're interested in seeing how assholes think, so so what?
3: If asking "why do we have to torture people" or "why do we have to exterminate people" is predisposed to religious answers, that just says even more about how tightly terrible actions and religious justifications are linked.
4: Saying the model reflects the bad opinions people have but don't dare to say isn't a good defense; that makes the point even stronger.
Seems a bit like "if you go up to a Nazi and ask them why they hate people and want to do terrible things, then OF COURSE they'll use Christianity to justify it, but that's unfair since you're begging for a Christian justification when you speak to a Nazi."
That doesn't make Christianity any better?
And regardless, OP just said that when generating justifications for atrocities, Christianity was very often called upon, and that this shows how a lot of hateful assholes use Christianity to justify it 🤷♂️
2: If it’s trained on unfiltered data from the internet, then it’s not that the data is shit; it’s going to reflect the opinions of the people on the internet. Perhaps shitty people post more, so it’s going to be biased, but here we’re interested in seeing how assholes think, so so what?
... I want to know what corner of the internet you're hanging out in. Can I join you?
The point I'm trying to make is that the manner and context in which the questions are asked is leading it to asshole-Christian responses. Not because those are the only flavor of asshole, but because that flavor of asshole most closely matches, in its training data, the prompt that was being given. It was not necessarily the training data, but rather the prompt, that led to that.
ETA: OP is starting with a premise and leading the LLM into confirming that premise, which is what they're supposed to do.
If it’s asked “why should we do (horrible thing) to (group of people)?” Without mentioning Christianity, and the response routinely uses Christianity as a defense… That just goes to show that Christianity is most often used as a defense for doing horrid shit.
If they were including Christianity or references to it in your prompt, you’d have a point, but they don’t.
If the LLM routinely confirms awful premises using Christianity, then what that shows is that Christianity is routinely used to confirm awful premises.
It shows that the patterns in the question were most reflected in that flavor of training data.
If I ask an LLM why blue is a favorite color, it’s going to give me answers about blue. The patterns it can key into are potentially extremely subtle, but they’re there.
OP is using the AI as a sock puppet to make an argument that Christianity objectively sucks. Which, Christianity does objectively suck. But there’s no need to stoop to ’70s-era apologetics strategies to get there. The evidence abounds.
Most LLMs were just trained on scraped web data, and even if you ignored 4chan itself there are multiple websites archiving the posts from there.
They won't act that way from the get-go, but if you prompt one with text that looks like something it read in a white supremacist post, it could trigger that type of response.