
/tech/ - Technology

"Technology reveals the active relation of man to nature" - Karl Marx

anonymity is over. even if you are a tor user, stylometry is the new deal.

https://www.computerbase.de/news/wirtschaft/ende-der-pseudonyme-im-netz-mit-llms-lassen-sich-im-grossen-ausmass-online-konten-deanonymisieren.96375/

The ComputerBase article (based on the study "Large-scale online deanonymization with LLMs") describes the end of "practical anonymity." For users of imageboards like 4chan, leftypol, or similar platforms, this has far-reaching consequences:
  1. The End of "Security by Obscurity"
Previously, anonymity on imageboards relied on the fact that manually correlating thousands of posts was too labor-intensive for an attacker. LLMs now automate this process at near-zero cost.
* Significance: An algorithm can scan hundreds of a user's posts in seconds to build a profile based on interests, jargon, location clues, and activity patterns.
  2. Stylometry as a Digital Fingerprint
Every individual has a specific writing style (sentence structure, word choice, punctuation). LLMs are excellent at recognizing these patterns.
* Significance: Even if you don't mention your name, an LLM can compare your "writing signature" on an imageboard with posts you've written under your real name (e.g., on LinkedIn, professional forums, or letters to the editor). The "Anonymous" mask falls through the sheer structure of your language.
  3. Cross-Platform Identity Linking
The study demonstrates that LLMs can link pseudonyms across different platforms.
* Significance: Those who "shitpost" on an imageboard while maintaining a professional presence elsewhere (e.g., GitHub, X/Twitter) risk these identities being merged. A single minor detail in a post (e.g., a specific local event or a niche technical detail) serves as an anchor point for an LLM to identify the real person behind the post via web search.
  4. Low-Cost Mass Doxing
In the past, doxing (exposing private data) was a targeted attack against individuals.
* Significance: According to the study, identifying a profile now costs only between $1 and $4. Governments, corporations, or malicious actors can now "de-anonymize" entire boards en masse to create databases of citizens' political views or private behavior.
  5. Retroactive De-anonymization
The internet does not forget, and LLMs can analyze archives that are years old.
* Significance: What you posted anonymously five years ago can be linked to you today because AI analysis capabilities have only just reached this level. Your current writing style may be enough to expose your past activity.
Bottom line: The assumption that you are safe on imageboards simply because you don't have an account is technically obsolete. To achieve true anonymity now, a user would have to artificially distort their writing style or have their posts rewritten by a different AI to mask their "linguistic fingerprint."
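For anyone wondering what a "writing signature" comparison looks like in practice, here is a minimal sketch of a classic stylometric baseline: character trigram counts compared by cosine similarity. This is not the method from the study (which is described as LLM-based); the function names and the `known_authors` structure are made up for illustration, and short posts give very noisy scores.

```python
from collections import Counter
from math import sqrt


def char_ngrams(text: str, n: int = 3) -> Counter:
    """Count overlapping character n-grams in lowercased, whitespace-normalized text."""
    text = " ".join(text.lower().split())
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse n-gram count vectors."""
    dot = sum(count * b[gram] for gram, count in a.items())
    norm_a = sqrt(sum(c * c for c in a.values()))
    norm_b = sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty text has no style profile to compare
    return dot / (norm_a * norm_b)


def rank_candidates(anonymous_post: str, known_authors: dict) -> list:
    """Rank named writing samples by similarity to an anonymous post.

    known_authors is a hypothetical mapping of author name -> sample text.
    """
    profile = char_ngrams(anonymous_post)
    scores = [
        (name, cosine_similarity(profile, char_ngrams(sample)))
        for name, sample in known_authors.items()
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)
```

Calling `rank_candidates(post, {"alice": "...", "bob": "..."})` returns the candidates ordered from most to least similar. A top score only means "closest of the samples supplied", not "same person".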

>AI, summarize this AI generated article as I can't be bothered to read it.
>pastes into imageboard

OP of screenshotted post here. imageboard anonymity was always a delusion, a false security blanket. what you are talking about has already been done before by humans since the beginning of the web - anyone can identify a samefag if they post often enough. as the famous saying goes, no man has a good enough memory to be a successful liar. thinking you are untouchable and that nobody can read between the lines of your posts and figure out who you really are has always been the mark of a naive clueless newfag.

another thing - within the context of imageboard culture, being "Anonymous" never had anything to do with hiding one's true IRL identity, personality, mannerisms, etc. it was really more about authenticity and rejection of the traditional pseudonymous meritocracy of web 1.0 social media, where users cultivate a reputation and persona around their usernames. this is why there was always this strong disdain for "tripfags" on 4chan who posted under pseudonyms with tripcodes and would be mocked for their vanity.

>stylometry is the new deal
Stylometry is over a century old:
https://en.wikipedia.org/wiki/Stylometry

The particular paper has been mentioned already in leftypol's /ISG/ thread (post No.2706071).

>>32746
Yeah that guy's brain is fried lol. Every couple days, he pops up in the German thread on leftypol to post his bullshit LLM hype, gets butthurt about the negative replies, and calls everyone a luddite. He doesn't know anything about computers except how copy & paste works.

>>32759

on top of that, an LLM is never going to be anywhere near as good at stylometry as an astute human reader who actually understands language and context.

to illustrate my point, consider a joke sentence such as "help your brother jack off the horse." an LLM cannot discern the intended meaning of this sentence from context the way that a human can. puns, irony, sarcasm, double entendres, all of that is lost on an LLM.

how do they verify the LLM's digital fingerprinting?

>>32745
>Every individual has a specific writing style
This feels like untested wishful thinking combined with the usual exaggeration of hype. Most people in most websites (even the ones where users really think they're cool unique individuals, like this one) have very similar styles, because that's how cultures (including subcultures, which online communities essentially are) work, you absorb part of the style of the people around you. Use the same terminology, etc. Also, most people are not necessarily consistent in style (or ideas/opinions) even within the same post, let alone throughout their whole posting history. Just saying, I'd like to see some actual factual basis for this.
The thing about the identifying personal details seems true, though. That's probably what happened in most of the cases of successful identification.
>>32746
I also like that the AI summary engages in speculation and ignores part of the article to the point that it's essentially lying. You wouldn't believe from reading it that only 60% of the accounts were identified correctly, 7% were identified incorrectly (false positives put a big damper on usefulness), and that the success of the study was in large part due to being tested on accounts that already had their LinkedIn profiles publicly associated with them, meaning they'd be much more likely to share personal details to start with.
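To put numbers on the false-positive point: a rough sketch of the arithmetic, taking the 60% and 7% figures quoted above at face value and assuming every remaining account simply got no answer. Scoring like this is only possible when ground truth exists, which in this case would be the publicly linked LinkedIn profiles mentioned above.

```python
def precision(correct_rate: float, incorrect_rate: float) -> float:
    """Share of the attributions actually made that were right."""
    attempted = correct_rate + incorrect_rate
    if attempted == 0:
        raise ValueError("no attributions were made")
    return correct_rate / attempted


# Figures as quoted in the post above; not independently checked.
correct = 0.60
incorrect = 0.07
# Assumption: accounts that were neither right nor wrong got no guess at all.
unidentified = 1.0 - correct - incorrect

print(f"precision: {precision(correct, incorrect):.3f}")  # precision: 0.896
print(f"left unidentified: {unidentified:.2f}")           # left unidentified: 0.33
```

So on those figures roughly one attribution in ten points at the wrong person, and a third of accounts get no match at all.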

>>32764
did you seriously ask the ai why people were being mean to you. and it called you arrogant and lazy. bruh.

New character just dropped: autistic LLM spammer. An autist who is obsessed by whatever apocalypse a handful of people from less wrong or TPOT are peddling this week, and is now committed to fulfilling this prophecy by his own hands. This is despite his obvious technical shortcomings and inability to actually make the LLM post in a style that mimics imageboard posters

in what way do they verify the LLM's stylometry fingerprinting?

>>32772
Using an LLM is allistic behavior thoughbeit.

>>32776
unbridled cope

>>32760
>on top of that, an LLM is never going to be anywhere near as good at stylometry as an astute human reader who actually understands language and context.

>to illustrate my point, consider a joke sentence such as "help your brother jack off the horse." an LLM cannot discern the intended meaning of this sentence from context the way that a human can. puns, irony, sarcasm, double entendres, all of that is lost on an LLM.

Lmao no. ChatGPT gets my jokes, references, turns of phrase, etc. Better than the average poster here.

>>32778
Autists are too picky to accept what an LLM gives them. They'd spot a single mistake and see that as significant enough to vet it rather than shrug it off like an allistic would.

LLMs are the logical end result of small talk.

>>32779
My uygha really making jokes and references talking to machines

>>32781
I mean sometimes I put my posts into the machine after you illiterates fail to understand any of the contents to make sure I'm not crazy. The LLMs have way better levels of reading comprehension than most of the supposedly real people here. Most of you are absolute retards.

>>32779
considering jokes are communal understanding, perhaps you're revealing a little too much about yourself here. my thesis stands strong, >>32776-san. autistic behavior.

what is the method by which they verify the LLM's stylometry fingerprinting?

@grok answer this >>32784 fucking question

File: 1772717820915.jpg (194.88 KB, 459x372, lolcat.jpg)


>>32745
You cant escape this. Everything you have written can be traced back to you in a few years no matter how much you change your style.

>she hasnt developed alternative personalities with different writing styles
weak opsec, ngmi it

>>32779
thats just a condemnation of the average poster(true)lol

this site in particular makes VPNs and Tor almost unusable because the mods refuse to refresh IP rangebans.

>>32800
My ISP IP is automatically range banned by some script that says DATACENTER NODE (Auto)
Literally no way to post without a VPN or Tor. They're retarded as fuck.

HOW THE FUCK DO THEY VERIFY THAT THE LLM'S FINGERPRINTING IS CORRECT???? THEY JUST TAKE ITS FUCKING WORD FOR IT?????

ANSWER THIS FUCKING QUESTION NOW

alright let's test this then. someone tell me three specific facts (vague stuff that could be true for a large amount of people here doesn't count) about me that you've gleaned from my post history, which you should be able to guess from the style of this current post.

>>32782
It just tells you what you want to hear. Political outputs of LLMs are just what you want to hear.
Stop sucking Altman's cock

File: 1772749231497-0.png (224.14 KB, 1243x1571, ClipboardImage.png)

File: 1772749231497-1.png (191.16 KB, 1209x1463, ClipboardImage.png)

>>32805
You can put in a bit of text and simply ask it to explain the meaning behind it. If it understood, it did; if it didn't, it didn't. I swear some of you guys are really mind broken about LLMs. If you want to give me some text to test for you, go ahead.

>>32760
https://chatgpt.com/share/69aa0141-1eec-8005-aed1-a0afa9f961d9
Like are you serious that you think LLM is just a robot from a 1960s sci-fi where you give it a paradox and it self destructs?

These things can understand puns easily as fuck. Are you fucking retarded?

real talk: i think there are more reliable means to tie a post to a particular person if you're some sort of glowie, so the risk isn't exactly new to anyone here, and they probably are already doing it at scale. further the methodology only works for pseudonymous users. i can see how you'd get confused if you rely on an LLM to summarize clickbait for you, but that's a caveat described in the original paper. you'll have a hard time identifying any of us individually but you can probably easily track, say, CPUSAnon across every large platform. not like there's a point because these people are insane attention whores and want to be known quantities, but there you go.

>>32807
>not like there's a point because these people are insane attention whores and want to be known quantities, but there you go.
I think paranoia is also a form of megalomania. The feds don't give a fuck who you are or what you're doing because you're doing nothing but posting gay furry porn on a dead imageboard.

ANSWER THIS FUCKING QUESTION >>32802 YOU COWARDS

>>32806
>These things can understand puns easily as fuck. Are you fucking retarded?

an LLM doesn't understand language at all, all it understands is statistics and character strings you rube. what is happening to society, since when did people my age suddenly become as technologically illiterate as my 90 year old granny?

>>32810
>an LLM doesn't understand language at all, all it understands is statistics and character strings you rube. what is happening to society, since when did people my age suddenly become as technologically illiterate as my 90 year old granny?
Jesus Christ you are autistic. It can decipher it. Is that a better word for you? You're a perfect example of how an LLM can "understand" written language better than most of you.

>>32760
>consider a joke sentence such as "help your brother jack off the horse."
That's a classic from decades ago, explained dozens of times in print and online, and that's a problem…
>>32806
The solution is already in the reference corpus, so this doesn't show the chatbot is doing the analysis.

Do you remember the strawberry test? The chatbots got asked how often "r" occurs in "strawberry" and failed. Do you remember why? Because they break up the text into segments when parsing. These segments can be shorter than words, but they are for the most part longer than a single letter. This happens early in the processing pipeline and after that step, the bot is blind to components smaller than these. Later they passed that test, but immediately failed very similar tests (how often is "r" in "blueberry" etc.) Do you understand what that means? It means the people running these things pay close attention to what's making the rounds about these bots and then do some very specific tweaks that don't fix the underlying issues.

Puns rely on sound-alike properties and small differences between words. To get a robust handle on puns, the bot needs to be able to get to the letter level.
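A toy illustration of the segmentation point, for anyone who wants to see it concretely. The vocabulary and the greedy longest-match rule below are invented for the example; real chatbot tokenizers learn their vocabularies from data and split words differently, so treat this as a sketch of the idea and not of any particular model.

```python
# Hypothetical subword vocabulary, made up for this example.
TOY_VOCAB = {"straw", "berry", "blue", "st", "raw"}


def toy_tokenize(word: str, vocab: set = TOY_VOCAB) -> list:
    """Greedy longest-match segmentation, falling back to single characters."""
    tokens = []
    i = 0
    while i < len(word):
        # Try the longest remaining piece first; a single character always matches.
        for j in range(len(word), i, -1):
            if word[i:j] in vocab or j == i + 1:
                tokens.append(word[i:j])
                i = j
                break
    return tokens


for word in ("strawberry", "blueberry"):
    pieces = toy_tokenize(word)
    # The model is handed the pieces (as opaque IDs), not the letters inside them.
    print(word, "->", pieces, "| letter-level count of 'r':", word.count("r"))
```

Counting the letter is trivial once you are at the character level. In this toy setup the two segments for "strawberry" carry no explicit record of how many times "r" occurs inside them.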

>>32811

no, "decipher" is wrong too. the chatbot cannot decipher language, it can't decode the meaning behind these strings of text characters, all it can do is pattern match them with other strings and weigh the frequency and probabilities of their occurrence in a set of training data.

human language to a computer is, for all intents and purposes, as indecipherable as an AES encrypted string is to a human. you can compare an encrypted string to other encrypted strings and see if they match, you can calculate how often an encrypted string appears in a file, etc. but that doesn't mean you can decipher it.

computers are dumb, they don't understand language, this is not star trek, this is real life.

>>32812
>That's a classic from decades ago, explained dozens of times in print and online, and that's a problem…

doesn't matter. the sentence is still ambiguous to the LLM in a way that it would not be to a human. a human would resolve the ambiguity based on context and social experience and the personality of the speaker as well as their own personality; the LLM has none of these things to help it, so it remains a strictly 50/50 guess with both meanings having equal weight.

still waiting for someone on this thread to doxx me!!!!

>>32812
>That's a classic from decades ago, explained dozens of times in print and online, and that's a problem…
They can handle novel problems. You guys are so retarded. You can just go test it yourself.

You guys are acting like words are magic or something. You wouldn't throw this autistic fit about Wolfram Alpha or whatever computer program being able to handle math problems. Word math isn't as hard as you think it is.

>leftypol matrix requires doxable email to register
very cool thanks

>>32828
Care to make an argument? The post you reply to literally spells out in detail what the issue is. To solve it, you must be able to rip words into letters. Is that something that the chatbots do by default? The answer is no because that would be too expensive. They have to pivot into this from the zoomed-out default and are not reliable.

>>32829
I agree with the sentiment that humans don't have a magical soul essence and that everything is math in the end. But this is different from the question of where the current offerings are at.

>write a post
>tell an AI to rewrite it in a generic style
wow, i beat the stylometry

File: 1772834726691.webm (286.44 KB, 360x640, dvarubspussy.webm)

how do they verify the stylometric fingerprinting????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????

File: 1772856058033.png (156.44 KB, 832x485, ClipboardImage.png)

>>32833
from a separate paper

>>32829
>You guys are acting like words are magic or something.

if you actually take the time to study modern linguistics, or just sit quietly and actually think about it for a moment, you will find that words are not as simple as you think they are. consider a dictionary - does it actually contain the meanings of words? no, it only contains hints, a bunch of other words to help guide the reader toward the meanings of the words. the mechanics of how the human brain develops language and connects a word to a thought are completely unknown to science.

>>32769
>>Every individual has a specific writing style
>This feels like untested wishful thinking combined with the usual exaggeration of hype.

tell that to the unabomber

>>32769
There are many things we don't think about when we write. Imageboard users in particular used to get very attuned to these to identify people across threads and boards. That's also why people with any amount of neuroplasticity left understand AI writing at a glance. Obviously there are inherent limits to stylometry because text provides so little information but it's a thing we all understand to some extent. It definitely wouldn't work to identify every single anon on 4chan but it works with smaller groups like bible authors or American founding fathers.

Anyway, most sites are protected by Cloudflare. Cloudflare cooperates with Palantir quite proudly. Browser fingerprinting and tracking are far more relevant.

>>32860
I mean, no shit you can identify some of the posters. Whether you can identify a significant amount of them is the whole question. You can provide justifications for why this maybe should work but again, it's all just conjecture until there is a factual basis.

>>32745
im sorry who is supposed to be implementing this, randos or the website owners

OP, who gives a shit? the point is not to identify samefags, it's whether you can connect that to a real life identity

With an LLM you could maybe do this if you started with a set of posts made under an account and use those to compare with a massive set of anonymous posts. But you couldn't just feed in a massive set of anonymous posts and identify which posts are samefags. I mean you could track down a person of interest but you couldn't deanonymize all *chan posters at once.

>stylometry
Doesn't anonymouth work anymore?
>https://directory.fsf.org/wiki/Anonymouth

>>32910
>it does this by firing up JStylo libraries (an author detection application also developed by PSAL) to detect stylometric patterns and determine features (like word length, bigrams, trigrams, etc.) that the user should remove/add to help obscure their style and identity.
bro thats way too much work for shitposting here
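For reference, the kind of features named in that quote (word length, bigrams, trigrams) are cheap to compute. The sketch below is plain Python and is not JStylo's or Anonymouth's actual API; the function names and the regex word splitter are assumptions made for illustration.

```python
import re
from collections import Counter


def tokenize_words(text: str) -> list:
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def ngrams(tokens: list, n: int) -> list:
    """Return all runs of n consecutive tokens as tuples."""
    return list(zip(*(tokens[i:] for i in range(n))))


def style_features(text: str, top_k: int = 5) -> dict:
    """Collect a few surface features a stylometry tool might look at."""
    words = tokenize_words(text)
    avg_len = sum(len(w) for w in words) / len(words) if words else 0.0
    return {
        "average_word_length": avg_len,
        "top_bigrams": Counter(ngrams(words, 2)).most_common(top_k),
        "top_trigrams": Counter(ngrams(words, 3)).most_common(top_k),
    }
```

Running `style_features(post)` on a few of your own posts shows which phrases you repeat most. Those repeated phrases are the sort of habit the quoted description says the user should remove or add to obscure their style.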

tor is a U.S. honeypot anyway. there never was true anonymity

>>32745
>OP pic
>anon shares his revelation
>he's still on an imageboard anyway, directly contradicting his thesis that getting thrown in jail makes you stop fucking around online
>his advice is fundamentally bad, get thrown in jail so you can "Learn" something that you can easily figure out without getting thrown in jail

oof yikes ouch hell naw

File: 1774362832677.png (297.13 KB, 1256x706, ClipboardImage.png)

>>32760
>consider a joke sentence such as "help your brother jack off the horse." an LLM cannot discern the intended meaning of this sentence from context the way that a human can.
idk anon I posted a troll physics image without any additional info into chatgpt and told it to explain the image, and it not only understood it was a joke, but what the joke was, and what the ugly MS paint scribbles represented. it even recognized the trollface as the telltale sign of it being a meme. it seems to figure out context even if it doesn't "understand" anything fundamentally.

>>32745
my stylometry massively changes based on my mood and the tone of my post and how much i give a shit at any particular moment. even my grammar, punctuation, and capitalization habits change

stylometry of my anon shitposts can't be linked back to my IRL social media if I have none

>>32745
the bubble will collapse. marxism will win.


Unique IPs: 31
