Large language models such as GPT-4 were able to identify people’s personal information by analysing their posts on social media
Large language models (LLMs) like GPT-4 can identify a person’s age, location, gender and income with up to 85 per cent accuracy simply by analysing their posts on social media.
But the AIs also picked up on subtler cues, like location-specific slang, and could estimate a salary range from a user’s profession and location.
It sounds like the reason they used reddit was so they could easily find users who had expressly revealed the information in question, and use it to verify that the AI was accurately deducing the same info from style alone.
They used reddit because it has corralled dumb users. Those users are no longer around anywhere else on the Internet, just here on social media. And yes, what better place to find dumb users than on reddit!
Yeah, even if I didn't belong to a local community and a bunch of communities surrounding my profession, the amount of intrigue and fascination emanating from my comments would cause anyone to guess that I'm the Dos Equis guy.
Same. I'm sure I've posted about my location, my job, my race, my history, my real first name, general details of my family makeup, etc. I also have a pretty unique name, so searching just my first and last name will find stuff about me anyway. I'm even listed by name in books (I was young and dumb and answered some questions about work life).