I'm looking to maybe self-host my own instance, and I'm still learning about the fediverse. If a different instance that I federate with hosts something illegal, are there risks to me? Is anything from other instances hosted on my server, like a copy of it? Or would I only end up hosting things my users post? I'm paranoid, and sorry if this is a silly question.

[-] techie@techy.news 105 points 1 year ago

The Electronic Frontier Foundation wrote a pretty good blog post on the legality of the Fediverse, around the time Mastodon was getting popular. It probably applies to Lemmy too. It’s worth a read to familiarize yourself with the kind of legal things you’ll be getting yourself into. You’re on the right track; you can control your and your friends’ content, but you can’t control remote content that gets pushed to your server, and that’s the part to worry about most.

https://www.eff.org/deeplinks/2022/12/user-generated-content-and-fediverse-legal-primer

One thing that stood out is to register a designated DMCA agent with the U.S. Copyright Office. It costs $6 or something. Having an agent on record is one of the conditions for the DMCA safe harbor, which gives instance admins certain protection.

[-] erre@feddit.win 9 points 1 year ago

This is awesome info. There should be a place to document all the nuance around hosting an instance plus some tips and tricks.

[-] Spzi@lemm.ee 16 points 1 year ago

> There should be a place to document all the nuance around hosting an instance plus some tips and tricks.

The Wiki: https://joinfediverse.wiki/What_is_Lemmy%3F

Hopefully it gets new contributors and maintainers from all the new users.

[-] erre@feddit.win 5 points 1 year ago

Yikes, I didn't even know there was a wiki. Thank you!

[-] silver@lemmy.brendan.ie 6 points 1 year ago

Thank ye,

I wonder how much of an impact being in the EU will have on that.

[-] LunchEnjoyer@lemmy.world 5 points 1 year ago

Awesome, thanks 🤘

[-] VitaminH@lemmy.world 5 points 1 year ago

This is great info, thank you!

[-] frozen@lemmy.frozeninferno.xyz 61 points 1 year ago* (last edited 1 year ago)

Text is copied to your instance's database, but any images are hosted on the other instances and simply linked to. Worst case scenario, you get told to delete something that's illegal in the country in which you host the instance, you comply, and everything's peachy.

Edit: That being said, I'm currently hosting an instance for myself and a few friends, and it's been smooth-sailing. Just make sure to require email verification or admin approval for new sign-ups (or disable them entirely) if you don't want to be overrun with bots.

[-] VitaminH@lemmy.world 17 points 1 year ago

Yes I'd only be allowing people I know personally to create accounts. No other registration would be allowed. The last thing I'd want to be is another one of those bot filled instances that have been popping up.

[-] lodion@aussie.zone 14 points 1 year ago

That isn't entirely true. I'm not exactly sure why, but I've definitely seen image posts made to remote communities that are hosted on my instance.

[-] SJ_Zero@lemmy.fbxl.net 13 points 1 year ago

I've seen image posts I make on other instances, but the image is hosted on my own instance and just linked to the other instance.

[-] lodion@aussie.zone 7 points 1 year ago

That may explain it... point being, content for remote communities isn't entirely "remote". I'd like to understand what goes where a lot better. I've not found it explained anywhere, and I'm not a coder so can't just "read the code yourself".

[-] SJ_Zero@lemmy.fbxl.net 6 points 1 year ago

I'm not familiar with lemmy, but I did pick up a bit of the lotide code recently (a similar project).

As I understand it, the text or html of the post ends up in a sort of mailbox, then your server goes out to pick up the latest posts from there. It gets brought over to your instance, and then it lives there. Whatever happens, the posts your server collected are on your server; that's how they're displayed.

Then when you write a post, it's stored locally. If it's on a local community, a copy is also sent to the mailbox for others; if it's on a remote community, your server reaches out to the other server and drops the post there.

My lotide instance has some older posts from servers that stopped existing a long time ago because although it can't get in touch with the remote community, the posts it did receive are still there.

[-] lodion@aussie.zone 2 points 1 year ago

Yeah post content I understand. Linked or posted images though are not consistently handled, so I'm not sure what circumstances lead to my instance pulling the image from a remote community.

[-] SJ_Zero@lemmy.fbxl.net 2 points 1 year ago

I don't think lemmy typically does. I'm often on networks that block a lot of the Internet, and even thumbnails on posts from other instances or their community images get blocked when I can't communicate with them.

Right now, your profile pic for example is coming from aussie.zone, and the community pic is coming from lemmy.world, but I'm on fbxl lemmy a completely different instance from either of them.

[-] SJ_Zero@lemmy.fbxl.net 21 points 1 year ago

If you're in the US, Section 230 of the Communications Decency Act does a couple of things.

  1. It removes liability for service providers for user-generated content, even when active moderation is practiced, and

  2. It removes liability for service providers for any moderation actions taken to moderate to reasonable community standards.

Prior to CDA230, the jurisprudence centered around 2 different cases. In one, an actively moderated system had illegal content and didn't remove it in time, and in another case, a non-actively moderated system had illegal content and didn't remove it in time. At that time, the actively moderated system was held to be liable for the illegal content, whereas the non-actively moderated system was held not to be liable for not removing the illegal content.

[-] SJ_Zero@lemmy.fbxl.net 10 points 1 year ago

One caveat to that would be the DMCA, where I think liability protection as a service provider is contingent on having a DMCA process available so infringing content can be removed.

I don't know enough about how that all works with the fediverse, however.

[-] CriticalMiss@lemmy.world 7 points 1 year ago

The fediverse is still a relatively small thing, even with all the popularity it’s been getting.

So DMCA takedowns are yet to happen.

[-] jeena@jemmy.jeena.net 20 points 1 year ago

Federation is implemented by copying the content from other servers to your database and file system, so if your users subscribe to something from a different server it will be copied to your server.

But it will be only served to your users, not to the public. Only the communities hosted on your instance will be served to the public.

[-] pe1uca@lemmy.pe1uca.dev 7 points 1 year ago

AFAIK only text is copied; media stays on the instance where the community is hosted.

[-] petunia@lemmy.world 4 points 1 year ago

It depends on the software. Some proxy all content from remote servers, so you only connect to your home server (Mastodon). Others don't; instead they make clients load remote content themselves (Lemmy). If you use a browser client, you can see all the connections being made.

[-] pe1uca@lemmy.pe1uca.dev 4 points 1 year ago

Yes, depends on the software, the post is about lemmy so I was talking about lemmy

[-] Hellsadvocate@kbin.social 3 points 1 year ago

Interesting, so I can't visit a Lemmy community as a magazine within kbin if I don't have an account?

[-] Kaldo@kbin.social 3 points 1 year ago

You can if someone else subscribed to it in the past. If nobody ever did, then that community is unknown to kbin and you won't find any data on it whether you're logged in or not.

[-] Hellsadvocate@kbin.social 2 points 1 year ago* (last edited 1 year ago)

In that case, how often is the Lemmy community updated on the kbin instance? Does it download updates every time the user visits?

[-] GataZapata@kbin.social 1 points 1 year ago

But you can discover it for your instance, no?

[-] Kaldo@kbin.social 3 points 1 year ago

Yep, you can search by name specifically on https://kbin.social/search (or your kbin instance) and subscribe to it, then it starts getting synced.

[-] jeena@jemmy.jeena.net 3 points 1 year ago

Yes you can, even on Mastodon, and you can subscribe to PeerTube channels in /kbin and Lemmy, etc.

[-] vynlwombat@lemmy.world 8 points 1 year ago

How much disk space would someone need to plan for a small Lemmy instance?

[-] pe1uca@lemmy.pe1uca.dev 9 points 1 year ago

I'm running it on Vultr's smallest VPS, with 25GB of disk.
This instance only has 3 users, and I'm the only active one. It says it's been up for almost a month and I've only used 3GB.

Below are the docker volumes that hold the actual data of the instance. Inside the DB, the biggest table is the one called activity, which the devs said is only sometimes used to validate data and could be truncated if needed (there's a scheduled task which only keeps up to 6 months).
Another thing to keep in mind is to properly configure the logs for whichever installation guide you follow.
After that, I've seen other admins (from bigger instances) say the next biggest is uploaded media.

$ du -h --max-depth=1
640K    ./pictrs
3.2G    ./postgres
3.2G    .

lemmy=# select
  table_name,
  pg_size_pretty(pg_relation_size(quote_ident(table_name))),
  pg_relation_size(quote_ident(table_name))
from information_schema.tables
where table_schema = 'public'
order by 3 desc;
         table_name         | pg_size_pretty | pg_relation_size
----------------------------+----------------+------------------
 activity                   | 2187 MB        |       2292867072
 comment                    | 56 MB          |         58212352
 person                     | 48 MB          |         50307072
 comment_like               | 45 MB          |         47161344
 post_like                  | 22 MB          |         22781952
 comment_aggregates         | 14 MB          |         14811136
 post                       | 13 MB          |         13623296
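
One caveat on the query above: `pg_relation_size` only counts the main table data, so indexes and TOAST storage are left out of those numbers. Here is a minimal sketch of a variant using `pg_total_relation_size`, which includes both, wrapped in a shell function so it can be piped into psql. The function name `table_sizes_sql` is made up for illustration.

```shell
# Hypothetical helper: prints a query listing each public table with its
# total on-disk size (table data + indexes + TOAST), largest first.
table_sizes_sql() {
  cat <<'SQL'
select
  table_name,
  pg_size_pretty(pg_total_relation_size(quote_ident(table_name))) as total_size
from information_schema.tables
where table_schema = 'public'
order by pg_total_relation_size(quote_ident(table_name)) desc;
SQL
}
```

It could then be run with something like `table_sizes_sql | psql --dbname=lemmy --username=username`, where the username is a placeholder for whatever your install uses.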
[-] gabe565@lemmy.cook.gg 6 points 1 year ago

The activity table is also used to deduplicate incoming federation data, so instead of truncating it, I'd suggest deleting rows after a certain amount of time.

For my personal instance, I set up a cron to delete entries older than 3 days, and my db is only ~500MB with a few weeks of content! I also haven't seen any duplicated posts or comments. Even with Lemmy's retries, 3 days seems to be long enough before dropping rows from that table.

[-] ipkpjersi@lemmy.one 2 points 1 year ago* (last edited 1 year ago)

Could you share the cron/script you use to do this? I'm interested in hosting my own Lemmy at some point, and having a script for that cleanup would be hugely helpful for me.

[-] gabe565@lemmy.cook.gg 2 points 1 year ago

Definitely! I'm hosting in Kubernetes so I won't post the full thing, but here's the actual command that I run hourly. Make sure to replace the values for database, username, and password.

PGPASSWORD=password psql --dbname=database --username=username --command="DELETE FROM activity WHERE published < NOW() - INTERVAL '3 days';"
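
For anyone not on Kubernetes, here is a rough sketch of how that command could be wrapped in a script and run hourly from cron. The function names, the `DB_NAME`/`DB_USER`/`DB_PASSWORD` environment variables, `RETENTION_DAYS`, and the script path are all assumptions for illustration; only the `DELETE` statement itself comes from the command above.

```shell
#!/bin/sh
# Hypothetical wrapper around the DELETE above; all names are illustrative.
# Retention defaults to the 3 days mentioned in this thread.
RETENTION_DAYS="${RETENTION_DAYS:-3}"

# Print the DELETE statement for a given number of days.
build_prune_sql() {
  printf "DELETE FROM activity WHERE published < NOW() - INTERVAL '%s days';" "$1"
}

# Run the statement. DB_NAME, DB_USER and DB_PASSWORD must be set by the caller.
prune_activity() {
  PGPASSWORD="$DB_PASSWORD" psql --dbname="$DB_NAME" --username="$DB_USER" \
    --command="$(build_prune_sql "$RETENTION_DAYS")"
}

# Only touch the database when invoked as: prune-activity.sh run
if [ "${1:-}" = "run" ]; then
  prune_activity
fi
```

Saved as, say, `/path/to/prune-activity.sh`, a crontab line such as `0 * * * * DB_NAME=database DB_USER=username DB_PASSWORD=password /path/to/prune-activity.sh run` would run it at the top of every hour.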
[-] Thief@lemmy.world 1 points 1 year ago* (last edited 1 year ago)

Can you help me set this up also or share the script I would run to do this? Many thanks.

[-] gabe565@lemmy.cook.gg 1 points 1 year ago

Sure! My script will look a little different since I'm hosting Lemmy in Kubernetes, but basically you will want to run the following command hourly. Make sure to replace the values for database, username, and password.

PGPASSWORD=password psql --dbname=database --username=username --command="DELETE FROM activity WHERE published < NOW() - INTERVAL '3 days';"
[-] Thief@lemmy.myserv.one 1 points 1 year ago

Hi - can you help me set this up or share the script that you use to do this? Many thanks :)

[-] pe1uca@lemmy.pe1uca.dev 1 points 1 year ago

Ah! I didn't know exactly what it was being used for.
Yeah, then it can only be trimmed, not truncated.

This is a great idea, thank you!

[-] erre@feddit.win 3 points 1 year ago* (last edited 1 year ago)

How are you keeping your pictrs directory so small?

Mine is at about 5GB after two weeks with just a single user. 😬

[-] codus@leby.dev 1 points 1 year ago

I also have around 3GB used for pictrs and I’m not really sure the best way to see what all content is in there.

[-] erre@feddit.win 1 points 1 year ago

Yeah I haven't uploaded any images on my instance myself. So none of those images are mine. Might do some reading tomorrow and see if there's any mention of this in the past on other communities. It's not an emergency but I'm curious.

That's strange. Please let me know what you find out.

[-] erre@feddit.win 2 points 1 year ago

I had found an old post which indicates that post thumbnails are cached. So I guess there's that.

In case you didn't see it, the OP of this thread realized they didn't set up their pictrs API key, so I guess it's possible to omit that and Lemmy should still work. Not sure about the downsides.

[-] pe1uca@lemmy.pe1uca.dev 1 points 1 year ago

Haha, I don't know xP.
Just checked and it has only one image.

[-] erre@feddit.win 1 points 1 year ago

Did you configure the pictrs API keys for Lemmy and for pictrs?

If they're not configured then I could see Lemmy not even using pictrs.

[-] Steunarde@kbin.social 3 points 1 year ago

Well, here's my first post on the fediverse!

Background in IT and server administration here. I however do not know much about the intricacies of the fediverse, but am interested in learning. Here's my two cents based on a background of LAMP stacks for web hosting.

The required space would likely scale and vary greatly depending on how much content is hosted locally. Assuming minimum space similar to a basic LAMP server it'd likely have starting space requirements of less than 1GB. If local content is primarily text/links to content hosted elsewhere it would take a lot to drastically change that space requirement. Image hosting can vary greatly depending on size, quality, and number of images. Video hosting is an absolute space hog even at fairly low resolutions by today's standards.

Bandwidth requirements would scale similar to storage requirements.

Other specs would also start very low if fediverse requirements are similar to a LAMP stack. Core count is typically more important than core speed in web server hosting, as each request will try to use a separate core but doesn't need much processing power, since the server isn't actually rendering anything.

Likewise, you shouldn't need much memory on a web host. Memory use will scale with the number of scripts running on the host, but I suspect that shouldn't be many unless you're also running moderation bots, and those should ideally run on a different server.

That said, I'd also be curious to hear from other people who have experience with the fediverse about recommended specs for hosting an instance.

If anyone has other questions I'm happy to try to help :)

[-] ProfessorFlaw@kbin.social 4 points 1 year ago

Where I live there is the "hoster privilege": hosters don't have to remove user content unless somebody reports it to them.

this post was submitted on 02 Jul 2023
206 points (97.7% liked)

Selfhosted
