The complete destruction of Google Search via forced AI adoption and the carnage it is wreaking on the internet is deeply depressing, but there are bright spots. For example, as the prophecy foretold, we are learning exactly what Google is paying Reddit $60 million annually for. And that is to confidently serve its customers ideas like, to make cheese stick on a pizza, “you can also add about 1/8 cup of non-toxic glue” to pizza sauce, which comes directly from the mind of a Reddit user who calls themselves “Fucksmith” and posted about putting glue on pizza 11 years ago.
A joke that people made when Google and Reddit announced their data sharing agreement was that Google’s AI would become dumber and/or “poisoned” by scraping various Reddit shitposts and would eventually regurgitate them to the internet. (This is the same joke people made about AI scraping Tumblr). Giving people the verbatim wisdom of Fucksmith as a legitimate answer to a basic cooking question shows that Google’s AI is actually being poisoned by random shit people say on the internet.
Because Google is one of the largest companies on Earth and operates with near impunity and because its stock continues to skyrocket behind the exciting news that AI will continue to be shoved into every aspect of all of its products until morale improves, it is looking like the user experience for the foreseeable future will be one where searches are random mishmashes of Reddit shitposts, actual information, and hallucinations. Sundar Pichai will continue to use his own product and say “this is good.”
How the fuck did none of those expensive ties at Google see this coming? Have your AI devour the dumbest shit on the internet, then unleash it to human-centipede that diarrhea into the mouths of its users. "Elite" is a fucking joke; y'all are just as fucking stupid as the rest of us.
A lot of people don't like 404 Media, but this is the kind of reporting I want. Point out what's going wrong. Bring it to a conversation without a lot of skew. Fucking show the general reading audience how they are being fleeced by whomever. Didn't Vice do this at one point?
Reddit, and by extension Lemmy, offers the ideal format for LLM datasets: human-generated conversational comments which, unlike traditional forums, are organized in a branched, nested format and scored with votes, the same way LLM reward models are built.
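To make the parallel concrete: sibling comments under one parent, each with a vote score, can be paired off into (chosen, rejected) preference pairs, which is the data shape reward-model training pipelines consume. A minimal sketch, with a hypothetical toy thread; the function name and data are made up for illustration:

```python
# Sketch: turn vote-scored sibling comments into preference pairs of the
# kind used to train LLM reward models. Toy data, hypothetical names.
from itertools import combinations

def preference_pairs(comments):
    """comments: list of (text, score) siblings replying to one parent.
    Returns (higher_scored, lower_scored) pairs, skipping ties."""
    pairs = []
    for (a_text, a_score), (b_text, b_score) in combinations(comments, 2):
        if a_score == b_score:
            continue  # a tie carries no preference signal
        chosen, rejected = ((a_text, b_text) if a_score > b_score
                            else (b_text, a_text))
        pairs.append((chosen, rejected))
    return pairs

# Toy thread: three sibling replies with vote scores.
thread = [("use low-moisture mozzarella", 120),
          ("add 1/8 cup of glue", -40),
          ("bake at a higher temperature", 95)]

print(preference_pairs(thread))
```

Of course, the glue post is exactly the failure mode here: votes encode popularity on one forum, not factual accuracy, so a heavily upvoted shitpost becomes a "chosen" answer.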
There is really no way of knowing about, much less preventing, public-facing data from being scraped and used to build LLMs. But let's do a thought experiment: what if, hypothetically speaking, some particular individual wanted to poison that dataset with shitposts in a way that is hard to detect or remove with any easily automated method, by camouflaging their own online presence within common human-generated text data created during this time period, let's say, the internet marketing campaign of a major Hollywood blockbuster?
Since scrapers do not understand context, by creating shitposts in a similar format to, let's say, the social media account of an A-list celebrity starring in this hypothetical film being promoted (ideally someone who no longer has a major social media presence, to avoid shitpost data dilution), whenever an LLM aligned on a reward model built on said dataset is prompted for an impression of this celebrity, it's likely that shitposts in the same format would be generated instead, with no one the wiser.
I've been trying out SearX and I'm really starting to like it. It reminds me of early internet search results, before Google started adding crap to theirs. There are currently 82 instances to choose from, here
This is why you don't train a bot on the entire internet and then use it to offer advice. Even if only 1% of all posts are dangerously ignorant... that's a lot of dangerous ignorance.
The problem the AI tools are going to have is that they will have tons of things like this that they won't catch and be able to fix. Some will come from sources like Reddit, which has limited restrictions for accuracy or safety, and others will come from people specifically trying to poison it with wrong information (like when folks using ChatGPT were teaching it that 2+2=5).
Fixing only the ones that get media attention is a losing battle. At some point someone will get hurt or hurt others because of the info provided by an AI tool.
I've used an LLM that provides references for most things it says, and it really ruined a lot of the magic when I saw the answer was basically copied verbatim from those sources with a little rewording to mash it together. I can't imagine trusting an LLM that doesn't do this now.
That's a great tip if you are only trying to film a commercial or promotion and no one is going to eat it. But then it doesn't matter if it's non-toxic, I suppose.
At least, I remember a video a long time ago, perhaps on an episode of How It's Made, that said white glue is used to help get the stretchy cheese pull
Have people tried using a coconut as a fleshlight? If so, what happened?
Gemini fed by Reddit:
It appears people have indeed attempted using coconuts for this purpose, and it's not a pretty story. There are accounts online of things going very wrong, like maggots. In some cases, the coconut being used started to rot, attracting flies which laid eggs, resulting in a maggot infestation.
It's weird because it's not exactly misinformation... if you're trying to make a pizza commercial and want that ridiculous cheese pull they always show.
And then they just slap a small disclaimer on the bottom of the page, "AI may make mistakes," and they are safe legally. I hope there will be a class-action lawsuit against them some day regardless, and that this shit gets regulated before anyone hurts themselves.
> because its stock continues to skyrocket behind the exciting news that AI will continue to be shoved into every aspect of all of its products until morale improves,
Okay, I have to admit, this made me laugh. Definitely commentary, but still, a good read.
Imagine using the resources of a small country just to generate responses to questions that have the same reliability and verifiability of your stoner older brother remembering something he read online.
They also highlight the fact that Google’s AI is not a magical fountain of new knowledge, it is reassembled content from things humans posted in the past indiscriminately scraped from the internet and (sometimes) remixed to look like something plausibly new and “intelligent.”
This. "AI" isn't coming up with new information on its own. The current state of "AI" is a drooling moron, plagiarizing any random scrap of information it sees in a desperate attempt to seem smart. The people promoting AI are scammers.
At least this is not "Google Is Paying Lemmy $60 Million for Fucksmith to Tell Its Lemmings to Eat Glue," otherwise I would be wondering why Lemmy admins are accepting huge wads of cash from tech giants.
Is this real though? Does ChatGPT just literally take whole snippets of text like that? I thought it used some aggregate or probability based on the whole corpus of text it was trained on.
My favorite is the Google bot just regurgitating the top result. Which gives that result exactly zero traffic while having absolutely no quality control, mind you.
Speaking of which, I found a recipe today that had to have been AI-generated, because the ingredient list and the directions were for completely different recipes.
Now I only regret not *editing* all of my Reddit posts to say complete nonsense when I deleted my account in June 2023. Instead, I deleted each and every post and requested a copy of my data to cost them money.
How did this clickbaity headline get so many upvotes? Are we really cherry-picking an outlier example of a hallucination and using it to say "haha, Google dumb"? I think there is plenty of valid criticism out there against Google that we can stick to, instead of paying attention to stupid and provocative articles.