$ cd lemmy-dir
$ du -sh *
456K    lemmy-ui
15G     pictrs
4.3G    postgres

Guys, this is no longer funny. I feel literally chased by the "no space left" message. Please help: I don't need those pics, I didn't upload them.

[-] clement@social.poisson.me 7 points 1 year ago

Have you posted this question on the lemmy_admin community over on lemmy.ml? Or possibly joined their matrix chat as linked on their github project? I suspect you will be able to get much more targeted support directly from the team or their community rather than the selfhosted community which is more general to all kinds of self hosting.

[-] maor@lemmy.org.il 4 points 1 year ago

Thanks a lot, I was looking for this exact kind of community. Posted there <3

[-] xusontha@ls.buckodr.ink 1 points 1 year ago
[-] maor@lemmy.org.il 2 points 1 year ago* (last edited 1 year ago)

Okay, you may not like it, but I rented a 1TB storage box from Hetzner for 3 euros a month, just to get that foot off my neck. It's omega cheap and mountable via CIFS, so life is good for now. I'm still interested in what I described in the OP, and I even started scribbling some Python, but I'm too scared of fucking anything up as of now.

The annoying part of writing that script was discovering that the filenames on disk don't match the filenames in the URLs. E.g., given this URL:
https://lemmy.org.il/pictrs/image/e6a0682b-d530-4ce8-9f9e-afa8e1b5f201.png
you'd expect that somewhere inside volumes/pictrs you'd find e6a0682b-d530-4ce8-9f9e-afa8e1b5f201.png, right...? That's not how it works: the filenames on disk have the exact same format, but they don't match.

So my plan was to find non-local posts in the post table, check whether the thumbnail_url column starts with lemmy.org.il (assuming that means my instance cached it), then find the file by downloading it via the URL and scanning the pictrs directory for files that match the downloaded file's exact size in bytes. Once found, compare their checksums to be sure it's the same one, then delete it and delete its post entry in the database.
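
For anyone who wants to try the matching step, here is a minimal Python sketch of the "same size, then same checksum" idea. It is not the script mentioned above: the function names and the use of SHA-256 are my own assumptions. It only reports candidates and deletes nothing. It also assumes the bytes served at the URL are identical to a file on disk, which nothing in this thread confirms.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large images are not loaded into memory at once."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_cached_copies(downloaded: bytes, pictrs_dir: Path) -> list[Path]:
    """Return files under pictrs_dir whose size and SHA-256 match the downloaded bytes.

    The cheap size comparison runs first, so only same-sized files get hashed.
    """
    wanted_size = len(downloaded)
    wanted_hash = hashlib.sha256(downloaded).hexdigest()
    matches = []
    for path in pictrs_dir.rglob("*"):
        if not path.is_file():
            continue
        if path.stat().st_size != wanted_size:
            continue
        if sha256_of(path) == wanted_hash:
            matches.append(path)
    return sorted(matches)
```

You would call it with the body of the downloaded thumbnail and the path of the pictrs volume (e.g. `Path("volumes/pictrs")`), then review the returned paths by hand before removing anything.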

When I get close to 1TB I'll get back here for this idea... :P

[-] maor@lemmy.org.il 2 points 1 year ago

Haha I'm literally on it right now. My instance crashed a couple of hours ago because of it, so I emptied ~/.rustup to get some time, but idk how to go about it from here. LPP didn't do anything. That seems really curious, does literally everyone use S3?

[-] sneezycat@sopuli.xyz 6 points 1 year ago

Sort by date created and delete oldest? Idk, I have no clue how Lemmy self-hosting works, but I guess that any picture you delete is a post that will be missing a picture.

Best solution? Just download more RAM 😉

[-] maor@lemmy.org.il 2 points 1 year ago

I should've mentioned it in the post, but I already tried deleting pics modified more than X days ago. The catch is that I don't wanna delete pics uploaded to my server, I just want to delete pics cached from other instances :(

[-] willya@lemmyf.uk 3 points 1 year ago* (last edited 1 year ago)
[-] iso@lemy.lol 3 points 1 year ago

They're thumbnails of posts from other instances. I suggest migrating pictrs to S3 for cheaper/easier storage.

[-] dan@upvote.au 6 points 1 year ago* (last edited 1 year ago)

S3 isn't always cheaper though... It's highly redundant storage (multiple copies in multiple data centers) so it's often going to cost more than a single copy on a single VPS or dedicated server or whatever. I guess in some cases it might end up cheaper compared to upgrading your storage to something larger though.

If you do want to migrate your images "to the cloud", Backblaze B2 should end up cheaper than S3.

[-] iso@lemy.lol 1 points 1 year ago* (last edited 1 year ago)
  • You don’t pay for storing on multiple servers. I never saw something like this on any provider I know.
  • Upgrading storage is not cheaper. Instance media storage reaches 500GB in a month and S3 is always cheaper than data volumes with given options for pictrs.
  • Backblaze is not the cheapest. It has egress fees, so it will cost much more than others, although it's cheaper than AWS.
[-] dan@upvote.au 1 points 1 year ago* (last edited 1 year ago)

You don’t pay for storing on multiple servers.

For services like S3, it's included in the price.

Instance media storage reaches 500GB in a month and S3 is always cheaper than data volumes

Not sure where you got the idea that S3 would always be cheaper. $5/TB/month is a standard benchmark price for storage "in the cloud", and S3 is way more than that.

As an example, a Hetzner storage box is around $3.50/month (+ VAT if you're in Europe) for 1TB of space with unlimited traffic. The same amount of space with S3 is $23/month, plus the traffic.

For caches of media files, you don't need redundant storage like what S3 provides. You can save money by using a cheaper option.

Backblaze is not cheapest.

I didn't say it was the cheapest, just that it's cheaper than S3. Cheapest would probably be a Kimsufi server or something similar.

[-] iso@lemy.lol 1 points 1 year ago
[-] dan@upvote.au 1 points 1 year ago

This is for an SSD-based volume, which you really don't need for media storage. If you're using Hetzner, just get a storage box.

[-] Decronym@lemmy.decronym.xyz 1 points 1 year ago* (last edited 1 year ago)

Acronyms, initialisms, abbreviations, contractions, and other phrases which expand to something larger, that I've seen in this thread:

Fewer Letters    More Letters
NUC              Next Unit of Computing brand of Intel small computers
SSD              Solid State Drive mass storage
VPS              Virtual Private Server (opposed to shared hosting)

3 acronyms in this thread; the most compressed thread commented on today has 8 acronyms.

[Thread #73 for this sub, first seen 21st Aug 2023, 23:45]

[-] nick@campfyre.nickwebster.dev 1 points 1 year ago* (last edited 1 year ago)

I ran this query:

select distinct thumbnail_url as url from post where not local and thumbnail_url like 'https://campfyre.nickwebster.dev/pictrs%'

(replace with your instance's url)

I then sent delete requests to /internal/purge on pictrs to delete all of those old thumbnails, which cleared out a lot of space. After deleting the thumbnails I ran an UPDATE query to set all of those old thumbnail URLs to null in the DB. I also patched the version of lemmy that I run to stop caching thumbnails in the future. Hope this helps!
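
A rough Python sketch of the purge step, in case it helps. It only builds the requests and sends nothing. Everything specific in it is an assumption to check against your own pict-rs version: the `http://pictrs:8080` address, the `alias` query parameter, the `X-Api-Token` header, and the HTTP method (kept as a parameter for that reason). The helper names are made up for illustration.

```python
from urllib.parse import urlencode, urlparse
from urllib.request import Request


def alias_from_thumbnail_url(url: str) -> str:
    """Take the last path segment of a thumbnail URL as the pict-rs alias (assumption)."""
    return urlparse(url).path.rsplit("/", 1)[-1]


def build_purge_request(pictrs_base: str, alias: str, api_token: str,
                        method: str = "POST") -> Request:
    """Build, but do not send, a request to the /internal/purge endpoint.

    The 'alias' parameter, the X-Api-Token header and the default method are
    assumptions; verify them against the pict-rs documentation for your version.
    """
    query = urlencode({"alias": alias})
    return Request(
        f"{pictrs_base.rstrip('/')}/internal/purge?{query}",
        method=method,
        headers={"X-Api-Token": api_token},
    )


# Hypothetical usage: 'urls' would be the rows returned by the SELECT above.
urls = ["https://lemmy.org.il/pictrs/image/e6a0682b-d530-4ce8-9f9e-afa8e1b5f201.png"]
requests_to_send = [
    build_purge_request("http://pictrs:8080", alias_from_thumbnail_url(u), "my-api-key")
    for u in urls
]
```

Each request could then be passed to `urllib.request.urlopen`, ideally after trying a single URL first and confirming the file is really gone. The UPDATE step is not covered here.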

this post was submitted on 15 Aug 2023
28 points (85.0% liked)