15
submitted 3 months ago by gregor@gregtech.eu to c/lemmy_support@lemmy.ml

Essentially, I'd like to have pictrs delete all of the images that aren't uploaded by my users, because my storage usage was going through the roof, so I just disabled the proxying of images. Here is my config:

x-logging: &default-logging
  driver: "json-file"
  options:
    max-size: "50m"
    max-file: "4"

services:
  proxy:
    image: docker.io/library/nginx
    volumes:
      - ./nginx_internal.conf:/etc/nginx/nginx.conf:ro,Z
      - ./proxy_params:/etc/nginx/proxy_params:ro,Z
    restart: always
    logging: *default-logging
    depends_on:
      - pictrs
      - lemmy-ui
    labels:
      - traefik.enable=true
      - traefik.http.routers.http-lemmy.entryPoints=http
      - traefik.http.routers.http-lemmy.rule=Host(`gregtech.eu`)
      - traefik.http.middlewares.https_redirect.redirectscheme.scheme=https
      - traefik.http.middlewares.https_redirect.redirectscheme.permanent=true
      - traefik.http.routers.http-lemmy.middlewares=https_redirect
      - traefik.http.routers.https-lemmy.entryPoints=https
      - traefik.http.routers.https-lemmy.rule=Host(`gregtech.eu`)
      - traefik.http.routers.https-lemmy.service=lemmy
      - traefik.http.routers.https-lemmy.tls=true
      - traefik.http.services.lemmy.loadbalancer.server.port=8536
      - traefik.http.routers.https-lemmy.tls.certResolver=le-ssl


  lemmy:
    image: dessalines/lemmy:0.19.8
    hostname: lemmy
    restart: always
    logging: *default-logging
    volumes:
      - ./lemmy.hjson:/config/config.hjson:Z
    depends_on:
      - postgres
      - pictrs
    networks:
      - default
      - database

  lemmy-ui:
    image: ghcr.io/xyphyn/photon:latest
    restart: always
    logging: *default-logging
    environment:
      - PUBLIC_INSTANCE_URL=gregtech.eu
      - PUBLIC_MIGRATE_COOKIE=true
#      - PUBLIC_SSR_ENABLED=true
      - PUBLIC_DEFAULT_FEED=All
      - PUBLIC_DEFAULT_FEED_SORT=Hot
      - PUBLIC_DEFAULT_COMMENT_SORT=Top
      - PUBLIC_LOCK_TO_INSTANCE=false



  pictrs:
    image: docker.io/asonix/pictrs:0.5
    # this needs to match the pictrs url in lemmy.hjson
    hostname: pictrs
    # we can set options to pictrs like this, here we set max. image size and forced format for conversion
    # entrypoint: /sbin/tini -- /usr/local/bin/pict-rs -p /mnt -m 4 --image-format webp
    #entrypoint: /sbin/tini -- /usr/local/bin/pict-rs run  --max-file-count 10  --media-max-file-size 500 --media-retention-proxy 10d --media-retention-variants 10d  filesystem sled -p /mnt
    user: 991:991
    environment:
      - PICTRS__STORE__TYPE=object_storage
      - PICTRS__STORE__ENDPOINT=https://s3.eu-central-003.backblazeb2.com/
      - PICTRS__STORE__BUCKET_NAME=gregtech-lemmy
      - PICTRS__STORE__REGION=eu-central
      - PICTRS__STORE__USE_PATH_STYLE=false
      - PICTRS__STORE__ACCESS_KEY=redacted
      - PICTRS__STORE__SECRET_KEY=redacted
      - PICTRS__MEDIA__RETENTION__VARIANTS=0d
      - PICTRS__MEDIA__RETENTION__PROXY=0d
      - PICTRS__SERVER__API_KEY=redacted_api_key
      #- PICTRS__MEDIA__IMAGE__FORMAT=webp
      #- PICTRS__MEDIA__IMAGE__QUALITY__WEBP=50
      #- PICTRS__MEDIA__ANIMATION__QUALITY=50
    volumes:
      - ./volumes/pictrs:/mnt:Z
    restart: always
    logging: *default-logging

  postgres:
    image: docker.io/postgres:16-alpine
    hostname: postgres
    volumes:
      - ./volumes/postgres:/var/lib/postgresql/data:Z
      #- ./customPostgresql.conf:/etc/postgresql.conf:Z
    restart: always
    #command: postgres -c config_file=/etc/postgresql.conf
    shm_size: 256M
    logging: *default-logging
    environment:
      - POSTGRES_PASSWORD=password
      - POSTGRES_USER=lemmy
      - POSTGRES_DB=lemmy
    networks:
      - database
  postfix:
    image: docker.io/mwader/postfix-relay
    restart: "always"
    logging: *default-logging

networks:
  default:
    name: traefik_access
    external: true
  database:
top 7 comments
sorted by: hot top controversial new old
[-] vk6flab@lemmy.radio 3 points 3 months ago

Not for nothing, there are legal reasons to remove these as well. Think about illegal images, gdpr requirements and other such liabilities.

[-] stinky@redlemmy.com 2 points 3 months ago

how did you figure out you were having storage issues?

[-] gregor@gregtech.eu 1 points 3 months ago

Well, my storage usage was rapidly increasing, you like 340GB. I tried setting up the env vars that delete them after some time (4 days in my case) but it just ignored the config.

[-] stinky@redlemmy.com 2 points 3 months ago

i'm trying to learn more about hosting my own instance

are you willing to share your setup? docker/self hosted? storage is something i understand so i'm starting with that

[-] gregor@gregtech.eu 5 points 3 months ago

Here is my setup (make sure to read the comments). If you have any additional questions please ask, I love helping people with sysadmin stuff. So, I use Traefik as a reverse proxy, here's its docker compose: services:

services:

  traefik:
    image: traefik:latest
    ports:
      - 80:80 
      - 443:443
      - 8080:8080 # for Matrix
    environment:
      - CF_DNS_API_TOKEN=redacted # for my Cloudflare setup, you probably wouldn't need Cloudflare
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
      - ./traefik.yml:/etc/traefik/traefik.yml
      - ./acme.json:/acme.json
      - ./routes/:/routes
networks:
  default:
    name: traefik_access # I have this so that I can simply put a networks block at the bottom of other docker compose files and traefik can then access it and proxy the traffic, if I put a labels block in the services I want to proxy, I'll also provide that.

Next, we have my configuration file for Traefik, traefik.yml:

certificatesResolvers:
  le-ssl: #for ssl certs
    acme:
      email: gregor@gregtech.eu
      storage: /acme.json
      dnsChallenge:
        provider: cloudflare #for cloudflare, you might want to change this somehow if you aren't going to use cloudflare, you can ask me if you need any help with this
        resolvers:
          - "1.1.1.1:53"
          - "1.0.0.1:53"

providers:
  docker:
    exposedByDefault: false
    network: traefik_access

log:
  level: DEBUG
accessLog: {}

api:
  dashboard: false
  insecure: false

entryPoints:
  http:
    address: ":80"
  https:
    address: ":443"

Not much to say about that one, following up we have my lemmy docker compose file, which I dare say is highly chaotic (I will still write the code comments tho):

x-logging: &default-logging
  driver: "json-file"
  options:
    max-size: "50m"
    max-file: "4"

services:
  proxy:
    image: docker.io/library/nginx
    volumes:
      - ./nginx_internal.conf:/etc/nginx/nginx.conf:ro,Z
      - ./proxy_params:/etc/nginx/proxy_params:ro,Z
    restart: always
    logging: *default-logging
    depends_on:
      - pictrs
      - lemmy-ui
    labels: #the ugly part: proxying. Here traefik proxies the traffic to nginx, of which the config I will provide.
      - traefik.enable=true
      - traefik.http.routers.http-lemmy.entryPoints=http
      - traefik.http.routers.http-lemmy.rule=Host(`gregtech.eu`)
      - traefik.http.middlewares.https_redirect.redirectscheme.scheme=https
      - traefik.http.middlewares.https_redirect.redirectscheme.permanent=true
      - traefik.http.routers.http-lemmy.middlewares=https_redirect
      - traefik.http.routers.https-lemmy.entryPoints=https
      - traefik.http.routers.https-lemmy.rule=Host(`gregtech.eu`)
      - traefik.http.routers.https-lemmy.service=lemmy
      - traefik.http.routers.https-lemmy.tls=true
      - traefik.http.services.lemmy.loadbalancer.server.port=8536
      - traefik.http.routers.https-lemmy.tls.certResolver=le-ssl


  lemmy:
    image: dessalines/lemmy:0.19.8
    hostname: lemmy
    restart: always
    logging: *default-logging
    volumes:
      - ./lemmy.hjson:/config/config.hjson:Z
    depends_on:
      - postgres
      - pictrs
    networks:
      - default
      - database

  lemmy-ui:
    image: ghcr.io/xyphyn/photon:latest # The lemmy UI accessible at gregtech.eu is actually not the official one, it's Photon
    restart: always
    logging: *default-logging
    environment:
      - PUBLIC_INSTANCE_URL=gregtech.eu
      - PUBLIC_MIGRATE_COOKIE=true
      - PUBLIC_DEFAULT_FEED=All
      - PUBLIC_DEFAULT_FEED_SORT=Hot
      - PUBLIC_DEFAULT_COMMENT_SORT=Top
      - PUBLIC_LOCK_TO_INSTANCE=false



  pictrs:
    image: docker.io/asonix/pictrs:0.5
    # this needs to match the pictrs url in lemmy.hjson
    hostname: pictrs
    # we can set options to pictrs like this, here we set max. image size and forced format for conversion
    # entrypoint: /sbin/tini -- /usr/local/bin/pict-rs -p /mnt -m 4 --image-format webp
    #entrypoint: /sbin/tini -- /usr/local/bin/pict-rs run  --max-file-count 10  --media-max-file-size 500 --media-retention-proxy 10d --media-retention-variants 10d  filesystem sled -p /mnt
    user: 991:991
    environment:
      - PICTRS__STORE__TYPE=object_storage #well, now comes the storage. I use backblaze , though you can also simply store them on-disk, or on another S3 storage provider.
      - PICTRS__STORE__ENDPOINT=https://s3.eu-central-003.backblazeb2.com/
      - PICTRS__STORE__BUCKET_NAME=gregtech-lemmy
      - PICTRS__STORE__REGION=eu-central
      - PICTRS__STORE__USE_PATH_STYLE=false
      - PICTRS__STORE__ACCESS_KEY=redacted
      - PICTRS__STORE__SECRET_KEY=redacted
      - PICTRS__MEDIA__RETENTION__VARIANTS=0d
      - PICTRS__MEDIA__RETENTION__PROXY=0d
      - PICTRS__SERVER__API_KEY=redacted #needed if you want to delete images

    volumes:
      - ./volumes/pictrs:/mnt:Z
    restart: always
    logging: *default-logging

  postgres:
    image: docker.io/postgres:16-alpine
    hostname: postgres
    volumes:
      - ./volumes/postgres:/var/lib/postgresql/data:Z
    restart: always
    shm_size: 256M
    logging: *default-logging
    environment:
      - POSTGRES_PASSWORD=password
      - POSTGRES_USER=lemmy
      - POSTGRES_DB=lemmy #this is just the db password
    networks:
      - database
  postfix:
    image: docker.io/mwader/postfix-relay
    restart: "always"
    logging: *default-logging


networks:
  default:
    name: traefik_access #allows traefik to access these services for proxying
    external: true
  database:

And here's the nginx config, (nginx_internal.conf) I kinda forgot how it works lmao I wrote it like half a year ago:

worker_processes auto;

events {
    worker_connections 1024;
}

http {
    # Docker internal DNS IP so we always get the newer containers without having to 
    # restart/reload the docker container / nginx configuration
    resolver 127.0.0.11 valid=5s;

    # set the real_ip when from docker internal ranges. Ensuring our internal nginx
    # container can always see the correct ips in the logs
    set_real_ip_from 10.0.0.0/8;
    set_real_ip_from 172.16.0.0/12;
    set_real_ip_from 192.168.0.0/16;

    # We construct a string consistent of the "request method" and "http accept header"
    # and then apply soem ~simply regexp matches to that combination to decide on the
    # HTTP upstream we should proxy the request to.
    #
    # Example strings:
    #
    #   "GET:application/activity+json"
    #   "GET:text/html"
    #   "POST:application/activity+json"
    #
    # You can see some basic match tests in this regex101 matching this configuration
    # https://regex101.com/r/vwMJNc/1
    #
    # Learn more about nginx maps here http://nginx.org/en/docs/http/ngx_http_map_module.html
    map "$request_method:$http_accept" $proxpass {
        # If no explicit matches exists below, send traffic to lemmy-ui
        default "http://lemmy-ui:3000/";

        # GET/HEAD requests that accepts ActivityPub or Linked Data JSON should go to lemmy.
        #
        # These requests are used by Mastodon and other fediverse instances to look up profile information,
        # discover site information and so on.
        "~^(?:GET|HEAD):.*?application\/(?:activity|ld)\+json" "http://lemmy:8536/";

        # All non-GET/HEAD requests should go to lemmy
        #
        # Rather than calling out POST, PUT, DELETE, PATCH, CONNECT and all the verbs manually
        # we simply negate the GET|HEAD pattern from above and accept all possibly $http_accept values
        "~^(?!(GET|HEAD)).*:" "http://lemmy:8536/";
    }

    server {
        set $lemmy_ui "lemmy-ui:3000"; #these are the internal ports for the services
        set $lemmy "lemmy:8536";
        # this is the port inside docker, not the public one yet
        listen 1236;
        listen 8536;

        # change if needed, this is facing the public web
        server_name localhost;
        server_tokens off;

        # Upload limit, relevant for pictrs
        client_max_body_size 20M;

        # Send actual client IP upstream
        include proxy_params;

        # frontend general requests
        location / {
            proxy_pass $proxpass;
            rewrite ^(.+)/+$ $1 permanent;
        }

        # security.txt
        location = /.well-known/security.txt {
            proxy_pass "http://$lemmy_ui";
        }

        # backend
        location ~ ^/(api|pictrs|feeds|nodeinfo|.well-known|version|sitemap.xml) {
            proxy_pass "http://$lemmy";

            # Send actual client IP upstream
            include proxy_params;
        }
    }
}

And finally, we have lemmy's config, lemmy.hjson:

{
  # for more info about the config, check out the documentation
  # https://join-lemmy.org/docs/en/administration/configuration.html
#make sure to check this out
  database: {
    host: postgres
    password: "password"
    # Alternative way:
    #uri: "postgresql://lemmy:{{ postgres_password }}@postgres/lemmy"
  }
  hostname: "gregtech.eu"
  pictrs: {
    url: "http://pictrs:8080/"
    api_key: "redacted"
    image_mode: "None" #to not proxy images, can be changed if you'd like to have the images proxied by your server

  }
  email: { #for sending email
    smtp_server: "smtp.sendgrid.net:587"
    smtp_from_address: "lemmy@lemmy.gregtech.eu"
    tls_type: "starttls"
    smtp_login: "redacted"
    smtp_password: "redacted"
  }
   # Parameters for automatic configuration of new instance (only used at first start)
  setup: {
    # Username for the admin user
    admin_username: "gregor"
    # Password for the admin user. It must be between 10 and 60 characters.
    admin_password: "redacted"
    # Name of the site, can be changed later. Maximum 20 characters.
    site_name: "Gremmy"
    # Email for the admin user (optional, can be omitted and set later through the website)
    admin_email: "redacted"
  }
}

[-] stinky@redlemmy.com 3 points 3 months ago

this is incredible, thanks <3

federation interests me but I'm a lazy noob. I'd like to "find instances that I'm not federated with" or "migrate all of my Subscriptions to another account" or "find communities that have been created since I last checked"

I host with Elestio and I think that development of the underlying tech may not be easy while my setup is tightly managed by them

but I will save your post and come back here with questions

[-] bjoern_tantau@swg-empire.de 1 points 1 month ago

Did you ever find a solution for this? I was able to delete some of the files by deleting every image in the local_image table without a local_user_id but that catches only the stuff that is in the table.

Everything that was uploaded before that table existed won't be deleted.

this post was submitted on 30 Dec 2024
15 points (100.0% liked)

Lemmy Support

4816 readers
12 users here now

Support / questions about Lemmy.

Matrix Space: #lemmy-space

founded 6 years ago
MODERATORS