163

Being anonymous is getting harder and harder (tilvids.com)

submitted 5 months ago by mesamunefire@lemmy.world to c/videos@lemmy.world

16 comments

[-] lvxferre@mander.xyz 48 points 5 months ago

It's actually worse.

The video focuses on how you're leaking personal info all the time through the software that you use and the connections that you make, and ways to mitigate it.

However, have you guys heard about forensic linguistics? That's how the Unabomber was caught. The way that you use your language(s) is pretty unique to yourself, and can be used to uncover your identity. This was done manually by two guys, Fitzgerald and Shuy; they were basically identifying patterns in how the Unabomber wrote to narrow down the suspects further and further, until they hit the right guy.

Now, let's talk about large "language" models, like Gemini or ChatGPT. Frankly, I believe that people who think that LLMs are "intelligent" or "comprehend language" themselves lack intelligence and language comprehension. But they were made to find and match patterns in written text, and they're rather good at it.

Are you getting the picture? What Fitzgerald and Shuy did manually 30 years ago can be automated now. And it gets worse: note how those LLMs "happen" to be developed by companies that you can't trust to die properly (Google, Amazon, Facebook, Apple, Microsoft and its vassal OpenAI).
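To make "can be automated" a bit more concrete, here is a minimal sketch of the general idea, not what Fitzgerald and Shuy did and not how an LLM works internally. It assumes a tiny, hypothetical list of function words as the only style feature and compares texts by cosine similarity; real stylometry uses far richer features and far more text. All function names are made up for illustration.

```python
import math
import re
from collections import Counter

# Hypothetical, deliberately tiny feature set: common English function words.
FUNCTION_WORDS = ["the", "a", "of", "and", "to", "in", "that", "it", "is", "but", "too", "which"]


def profile(text):
    """Return the relative frequency of each function word in the text."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words)
    total = len(words) or 1  # avoid division by zero on empty input
    return [counts[w] / total for w in FUNCTION_WORDS]


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def rank_candidates(unknown_text, known_samples):
    """Sort candidate authors by how closely their known writing matches the unknown text."""
    target = profile(unknown_text)
    scores = {name: cosine(target, profile(sample)) for name, sample in known_samples.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)
```

Calling `rank_candidates(anonymous_text, {"account_a": "...", "account_b": "..."})` returns the candidates ordered from most to least similar. With short samples the ranking is mostly noise, which is part of why the amount of known text matters so much.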

So, while the video offers some solid advice regarding privacy, sadly it is not enough. If you're in some deep shit, and privacy is a life-or-death matter for you, I strongly advise you to always be mindful of what and how you write.

And, for the rest of us: fighting individually for our right to privacy is not enough. We need to assemble and organise ourselves, to fight on legal grounds against those who are trying to kill it. You either fight for your rights or you lose them.

Just my two cents. I apologise as this is only tangentially related to the video, but I couldn't help it.

[-] VeganCheesecake@lemmy.blahaj.zone 25 points 5 months ago

Conversely, you can now have your manifesto written by a locally run LLM.

[-] brbposting@sh.itjust.works 3 points 5 months ago

“Revise generically: [manifesto]” (certainly not the best prompt)

Folks seem to like Ollama, per Hacker News threads. Here's one comment, in a coding context:

Not using Codestral (yet) but check out Continue.dev[1] with Ollama[2] running llama3:latest and starcoder2:3b. It gives you a locally running chat and edit via llama3 and autocomplete via starcoder2.

It's not perfect but it's getting better and better.

[1] https://www.continue.dev/ [2] https://ollama.com/
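For the "Revise generically" idea above, a minimal sketch of sending that prompt to a locally running Ollama. It assumes Ollama is listening on its default local port and that the `llama3` model has already been pulled; the helper names `build_request` and `revise` are made up for illustration.

```python
import json
import urllib.request

# Assumption: Ollama is running locally on its default port.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(text, model="llama3"):
    """Build the HTTP request that asks the local model to rewrite the text."""
    payload = {
        "model": model,
        "prompt": f"Revise generically: {text}",
        "stream": False,  # ask for one JSON object instead of a token stream
    }
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )


def revise(text, model="llama3"):
    """Send the text to the local model and return its rewritten version."""
    with urllib.request.urlopen(build_request(text, model)) as resp:
        return json.loads(resp.read())["response"]
```

Nothing leaves your machine this way, but the prompt wording and any edits you make to the output afterwards are still yours.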

Please no unabombing though

Oh wow he wrote a 35k word manifesto… feel like that’s so rare you’d still stand a solid chance at being identified somehow.

[-] lvxferre@mander.xyz 1 points 5 months ago

You could, but even then you need to put some thought into how to prompt and review/edit the output.

I've noticed from usage that LLMs are extremely prone to repeat verbatim words and expressions from the prompt. So if you ask something like "explain why civilisation is bad from the point of view of a cool-headed logician", you're likely outing yourself already.
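A rough way to check for this before publishing anything: list the word sequences that appear in both your prompt and the model's output. This is only a sketch under simple assumptions (lowercased word bigrams, no stemming, no synonym detection), and `leaked_phrases` is a made-up helper name.

```python
import re


def ngrams(text, n):
    """Return the set of lowercased word n-grams in the text."""
    words = re.findall(r"[a-z'-]+", text.lower())
    return {" ".join(words[i:i + n]) for i in range(len(words) - n + 1)}


def leaked_phrases(prompt, output, n=2):
    """Return the n-grams shared by prompt and output, sorted alphabetically."""
    return sorted(ngrams(prompt, n) & ngrams(output, n))


prompt = "explain why civilisation is bad from the point of view of a cool-headed logician"
output = "A cool-headed logician would argue that civilisation is bad."
print(leaked_phrases(prompt, output))
# ['a cool-headed', 'civilisation is', 'cool-headed logician', 'is bad']
```

Anything on that list is wording you supplied yourself, so it deserves a second look before the text goes anywhere public.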

A lot of the time the output will have "good enough" synonyms that you could replace with more accurate words... and then you're outing yourself again. The same goes for how you fix it so it sounds like a person instead of a chatbot: we all have writing quirks, and you might end up leaking them into the review.

And more importantly, you need to be aware that it is an issue, and that you can be tracked based on how and what you write.

[-] UmeU@lemmy.world 13 points 5 months ago

While forensic linguistics is pretty cool, the Unabomber was caught because they released his manifesto and his brother’s wife and brother recognized the unusual phrasing such as ‘Eat your cake and have it too’.

If an author has a large amount of known works then it’s not too difficult to identify other writings by that same author. But if the author does not have a large body of writing that is known to come from that individual, then the best we can do is determine an approximate age and geographic location where the individual grew up, and that’s only when the unidentified writing is large enough, like in the case of the Unabomber, where his manifesto was roughly 35k words.

[-] lvxferre@mander.xyz 1 points 5 months ago

I did simplify the whole thing, as you noticed. But note that his SIL and brother identifying him is another example of the same process: David knew that expressions Ted used, like "cool-headed logicians", were highly unusual. That's not too unlike what the socio- and forensic linguists did there.

But if the author does not have a large body of writing that is known to come from that individual

Such as a Lemmy or Facebook account? Or any other online account associated with your writing, really; we produce far more text on the internet than we ourselves realise.

And while, a priori, your different accounts across different websites might look completely disconnected, once you connect two of them as coming from the same person, connecting a third one is easier. And a fourth. And so on.

A small caveat is that while the corpus is bigger, so is the noise introduced by people from the other side of the world that happen to use the same patterns as the person whom you want to identify. Even then, I believe that the ability to bulk process text to find authorship grew considerably faster than the number of potential matches.

[-] intensely_human@lemm.ee 0 points 5 months ago

Additional signal is not noise

[-] LodeMike@lemmy.today 6 points 5 months ago

Is there a best of on Lemmy?

[-] StaticFalconar@lemmy.world 4 points 5 months ago

Best of misinformation. The Unabomber was caught because the people who recognized the writing were his own family members. What the hell is this revisionist BS?

[-] kionite231@lemmy.ca 3 points 5 months ago

https://lemmy.ca/c/bestoflemmy@lemmy.world

[-] CommunityLinkFixer@lemmings.world 4 points 5 months ago

Hi there! Looks like you linked to a Lemmy community using a URL instead of its name, which doesn't work well for people on different instances. Try fixing it like this: !bestoflemmy@lemmy.world

[-] AtariDump@lemmy.world 3 points 5 months ago* (last edited 5 months ago)

You can’t eat your cake and have it too.

[-] Pantoffel@feddit.de 2 points 5 months ago* (last edited 5 months ago)

Let's see how well this quote ages as subscription services become ubiquitous.

[-] AtariDump@lemmy.world 2 points 5 months ago

[-] lvxferre@mander.xyz 0 points 5 months ago

You can’t eat your cake and have it too.

And I can't "magically" know what you're referring to, either - given that you're replying to a rather long comment but providing exactly zero context on what specifically you're replying to.

Quotes, use them.

this post was submitted on 30 May 2024

163 points (97.7% liked)
