This post knows where you're viewing it from (Lemmy doesn't proxy external images) [ARCHIVED] (lemmy.ml)

submitted 2 years ago* (last edited 2 years ago) by TriLinder@lemmy.ml to c/privacy@lemmy.ml

142 comments

Note: This post is now archived and, as such, no longer works.

An external image showing your user-agent and the total "hit count"

[-] TriLinder@lemmy.ml 199 points 2 years ago

This is possible because Lemmy doesn't proxy external images but instead loads them directly. While that's not all that bad on its own, it could be used for spy pixels by nefarious posters and commenters.

Note that the only thing I intentionally log is the "hit count" visible in the image, and I have no intention of misusing the data.

[-] targetx@programming.dev 59 points 2 years ago

Nice example!

I think proxying everything through Lemmy would have a pretty big bandwidth/scalability impact. I expect the Lemmy clients don't send any unique user info on these image requests, so I'm not sure how useful it would be as a spy pixel? Maybe I'm missing something :-)

[-] goddard_guryon@sopuli.xyz 17 points 2 years ago* (last edited 2 years ago)

It would be interesting to see just how much info is shared when Lemmy requests the image. If there is [potentially] sensitive info being shared, the devs might be interested in working on it too. (I have no idea how to check such a thing; this comment is just so I can find the post later when more people have shared their wisdom on it.)

[-] muddybulldog@mylemmy.win 36 points 2 years ago* (last edited 2 years ago)

None (by Lemmy), as Lemmy doesn't actually request the image (that would be proxying). Your browser requests the image directly by URL. Lemmy, technically, doesn't even know an image exists. It just provides the HTML and lets your browser do the work.

[-] A_A@lemmy.world 17 points 2 years ago* (last edited 2 years ago)

Exactly. The text of this post is simply:

![An external image showing your user-agent and the total "hit count"](https://trilinder.pythonanywhere.com/image.jpg)

I get the same result when I browse directly to the link.

So, if OP links a malicious website, we have a problem...?

[-] goddard_guryon@sopuli.xyz 10 points 2 years ago

Oh dangit, it's simpler than I thought. So the only data being sent is...just whatever is sent in your average GET request.

[-] newIdentity@sh.itjust.works 13 points 2 years ago

Yes. It's also a pretty standard way of serving images. A lot of email clients do that too.

That's also how the services that show you when an email has been read work.

[-] newIdentity@sh.itjust.works 7 points 2 years ago* (last edited 2 years ago)

Not really that huge of a problem. When making requests, you also usually send a header which includes the user agent.

The program just logs how many times the image has been requested and reads the user agent data. No JavaScript is actually executed.

Well, it might be possible to have an XSS somehow, but I haven't really done much research into this possibility.

In general, it's a pretty standard way of handling embedded images. Email does this too. That's how you get these services that can check whether someone has read an email.

[-] CoderKat@lemm.ee 4 points 2 years ago

Yup. And to add, your browser will send things like:

- Your IP address. Technically this is sent by the OS doing networking and is unavoidable. At best, a VPN can hide this, because the VPN sits in the middle.
- Various basic request headers, which most notably contain the user agent (identifies the browser) and language headers, both of which you can fake if you want to.
- Cookies for that domain (if you have any). Those can track you across multiple requests and thus build up a profile of you.
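
To make that list concrete, here is a minimal sketch of what an image host could record from a single request. It uses only Python's standard library; the `ImageHost` handler, the `seen` list, the `demo` helper and the header values are all made up for illustration, and the response body is a placeholder rather than a real image.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

seen = []  # one dict per request; a real tracker would persist this somewhere


class ImageHost(BaseHTTPRequestHandler):
    """Hypothetical image host that records what each request reveals."""

    def do_GET(self):
        seen.append({
            "ip": self.client_address[0],            # from the TCP connection itself
            "path": self.path,
            "user_agent": self.headers.get("User-Agent"),
            "language": self.headers.get("Accept-Language"),
            "cookie": self.headers.get("Cookie"),    # only present if one was set earlier
            "referer": self.headers.get("Referer"),  # None when the client withholds it
        })
        body = b"placeholder, not a real image"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # silence the default per-request stderr logging


def demo():
    """Start the host on a free local port, fetch one 'image', return the record."""
    server = HTTPServer(("127.0.0.1", 0), ImageHost)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    conn = http.client.HTTPConnection("127.0.0.1", server.server_port)
    conn.request("GET", "/image.jpg", headers={
        "User-Agent": "ExampleBrowser/1.0",
        "Accept-Language": "en-CA",
        "Cookie": "visitor=abc123",
    })
    conn.getresponse().read()
    conn.close()
    server.shutdown()
    server.server_close()
    return seen[-1]
```

Calling `demo()` returns the record for the one request it makes. Note that the IP address comes from the connection rather than from any header, which is why faking headers does not hide it.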

[-] odbol@lemmy.world 1 points 2 years ago

That's why you should use a native app, which won't send any of that identifying info (except for your IP address, but there's nothing you can do about that).

[-] ono@lemmy.ca 24 points 2 years ago* (last edited 2 years ago)

Notably, this allows remote parties to associate your IP address with your interests, as revealed by the Lemmy communities that you browse.

One way is for the image host to use the HTTP Referer field. (Standards-respecting web browsers pass the URL of the web page being viewed to the server hosting the image.)

Another way is by posting an image with a unique URL.

Even if Referer is withheld and the image is not unique, the image host can still do basic fingerprinting of your client's request header and your OS's TCP quirks, and associate that fingerprint with your IP address.

An option for Lemmy to proxy media would be very helpful. Small instances could perhaps disable it, although they might not need to, since the additional load would scale with the number of users on that instance.
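
As a rough sketch of what such an option might do, an instance could rewrite external image links so that readers only ever contact the instance, which then fetches the image on their behalf. The host name, the `image_proxy` endpoint and the `proxy_images` function below are hypothetical, not Lemmy's actual API, and the regular expression only handles simple `![alt](url)` links.

```python
import re
from urllib.parse import quote, urlsplit

# Hypothetical values, for illustration only.
INSTANCE_HOST = "lemmy.example"
PROXY_ENDPOINT = "https://lemmy.example/image_proxy?url="

# Matches simple markdown images of the form ![alt](url).
IMAGE_MD = re.compile(r"!\[(?P<alt>[^\]]*)\]\((?P<url>[^)\s]+)\)")


def proxy_images(markdown: str) -> str:
    """Point every external image link at the instance's proxy endpoint."""
    def rewrite(match):
        url = match.group("url")
        if urlsplit(url).hostname == INSTANCE_HOST:
            return match.group(0)  # already served by the instance itself
        proxied = PROXY_ENDPOINT + quote(url, safe="")
        return f"![{match.group('alt')}]({proxied})"

    return IMAGE_MD.sub(rewrite, markdown)
```

With a rewrite like this, the image host would see the instance's IP address and headers instead of each reader's, at the cost of the extra bandwidth discussed elsewhere in this thread.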

[-] PoliticalAgitator@lemm.ee 7 points 2 years ago* (last edited 2 years ago)

> Notably, this allows remote parties to associate your IP address with your interests, as revealed by the Lemmy communities that you browse.

I suspect with a coordinated pool of posts or multiple comments on the same post, you could narrow that IP address down to an actual user account.

When a new comment is posted by a user, store, against their username, all IP addresses that visited since the last comment in that thread (by anyone). When a second comment is posted by a user, remove any IP addresses that don't appear in both lists.

I suspect you would have a very short list after two comments, and a single address after three. It would also be extremely easy to both lure someone into viewing an image and bait them into multiple replies. Geolocate that IP and you now know roughly where that user lives.
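
The narrowing procedure described above can be sketched in a few lines. The event format (a chronological list of `("view", ip)` and `("comment", username)` tuples) and the function name are assumptions made for this example, not anything a real tracker is known to use.

```python
def narrow_candidates(events):
    """Map each commenter to the set of IPs that could still be theirs.

    `events` is a chronological list of ("view", ip_address) and
    ("comment", username) tuples for a single thread.
    """
    candidates = {}     # username -> IPs consistent with every comment so far
    recent_ips = set()  # IPs that loaded the image since the last comment by anyone

    for kind, value in events:
        if kind == "view":
            recent_ips.add(value)
        elif kind == "comment":
            if value in candidates:
                # Later comment: keep only IPs that appear in both lists.
                candidates[value] &= recent_ips
            else:
                # First comment: every recent viewer is a candidate.
                candidates[value] = set(recent_ips)
            recent_ips = set()

    return candidates
```

For instance, if three addresses view the image before a user's first comment, two of those before the second, and only one of them before the third, that user's candidate set shrinks from three addresses to one.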

Time to make sure you're always on a VPN I guess.

[-] TriLinder@lemmy.ml 4 points 2 years ago

You could also send the image through a DM if you want to find a particular user
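
A minimal sketch of that idea, assuming the sender gives each recipient an image URL that nobody else will ever load. The host name, the `registry` dictionary and both function names are hypothetical.

```python
import secrets

BASE_URL = "https://images.example/pixel/"  # hypothetical image host


def make_tracked_url(recipient, registry):
    """Create an image URL unique to one recipient and remember who got it."""
    token = secrets.token_urlsafe(8)
    registry[token] = recipient
    return f"{BASE_URL}{token}.jpg"


def identify(request_path, registry):
    """Given the path of an incoming image request, return the recipient, if known."""
    filename = request_path.rsplit("/", 1)[-1]
    token, _, _ = filename.partition(".")
    return registry.get(token)
```

Because only one person was ever sent that URL, any request for it ties the requesting IP address and headers to that recipient, with no Referer or fingerprinting needed.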

[-] PoliticalAgitator@lemm.ee 1 points 2 years ago

Oh yeah, that'd be much less effort.

[-] ono@lemmy.ca 3 points 2 years ago

Even without that, once your Lemmy interests are sold/shared by IP address, they can be associated with your real identity as soon as you log in to a service that knows who you are.

[-] lazylion_ca@lemmy.ca 17 points 2 years ago

Were you expecting otherwise? Loading an external image is no different from loading an external website with images. Lemmy and Reddit are link aggregators, not proxies. Having to proxy everything would incur significant bandwidth costs for instance admins, who are often paying out of pocket for hosting.

[-] Seraph@kbin.social 6 points 2 years ago* (last edited 2 years ago)

Any chance that's why this account is posting the same image and gibberish? @googa

[-] Erika2rsis@lemmy.blahaj.zone 6 points 2 years ago

From what I remember, that image was hosted on hexbear.net, so I don't think so.

[-] Anticorp@lemmy.ml 2 points 2 years ago* (last edited 2 years ago)

How do you get an image to run code? I guess I somehow missed something important in website development.

Edit: I saw that you said you're using Pillow to actually render the image from code. That's neat! ...and scary

[-] roon@lemmy.ml 0 points 2 years ago

Share source code? I'm curious

[-] TriLinder@lemmy.ml 2 points 2 years ago

It's just a simple Flask server. I parse the user-agent using the user_agents Python library, apply some conditionals upon the result, render the image using Pillow and send it to the user.
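
The actual source isn't shown here, so the following is only a rough reconstruction of the pipeline described above, with everything swapped for the standard library: crude substring checks stand in for the user_agents parser, an SVG string stands in for the Pillow-rendered image, and a plain function stands in for the Flask route. All names and wording are invented for illustration.

```python
import itertools
from html import escape

hits = itertools.count(1)  # in-memory hit counter; resets when the process restarts


def describe_client(user_agent):
    """Very rough stand-in for a real user-agent parser."""
    ua = user_agent.lower()
    # Order matters: Edge and Chrome user agents also contain "safari".
    if "firefox" in ua:
        browser = "Firefox"
    elif "edg/" in ua:
        browser = "Edge"
    elif "chrome" in ua:
        browser = "Chrome"
    elif "safari" in ua:
        browser = "Safari"
    else:
        browser = "an unknown client"
    device = "a phone" if "mobile" in ua else "a computer"
    return f"You are using {browser} on {device}."


def render_image(user_agent):
    """Return an SVG document for one request, counting it as a hit."""
    lines = [describe_client(user_agent), f"Hit count: {next(hits)}"]
    text = "".join(
        f'<text x="10" y="{30 + 25 * i}">{escape(line)}</text>'
        for i, line in enumerate(lines)
    )
    return (
        '<svg xmlns="http://www.w3.org/2000/svg" width="480" height="80">'
        f"{text}</svg>"
    )
```

A web framework would call something like `render_image` with the request's User-Agent header and return the result with an image content type. That is all "an image that runs code" amounts to: the code runs on the server, and the client only ever receives image data.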

this post was submitted on 11 Aug 2023

598 points (96.6% liked)
