submitted 1 year ago by Machefi@lemm.ee to c/fediverse@lemmy.world

I've been on Lemmy for some time now and it's time for me to finally understand how federation works. I have a general idea and I have accounts on three federated instances, but I need some details.

Let Alpha, Beta, Gamma and Delta be four federated instances. I have an account on Alpha and create a post in a community on Beta. A person from Gamma comments on it and a person from Delta upvotes the post and the comment.

The question: On which instances are the post, the comment and the upvotes stored?

[-] Hank@kbin.social 94 points 1 year ago

Federated content is stored in the balls.

[-] QuazarOmega@lemy.lol 16 points 1 year ago

Thank god someone made the joke already!
I wouldn't have had the federated content holders

[-] Unsustainable@lemmy.today 8 points 1 year ago
[-] WhiteRice@lemmy.world 20 points 1 year ago
[-] DmMacniel@feddit.de 4 points 1 year ago
[-] rmuk@feddit.uk 2 points 1 year ago

Poor Ligma. Worst case of Dee's syndrome I've ever seen.

[-] Encode1307@lemm.ee 9 points 1 year ago

It's decentralized, so across everyone's balls

[-] Balls@lemmy.world 7 points 1 year ago

I can confirm this to be true

[-] Rottcodd@kbin.social 44 points 1 year ago

As already noted, on all of them.

The easy way to grasp how it works:

When you, on instance.alpha, view a community on instance.beta, you aren't actually on community@instance.beta. You're actually on an entirely separate copy - community@instance.beta@instance.alpha. That's the community you're reading and posting to and upvoting/downvoting in. Meanwhile, people on other instances are each on their own locally hosted copies of the same community.

The lemmy software (or kbin or mastodon or whatever) then periodically syncs up all the local copies of community@instance.beta, so you all end up looking at (more or less) the same content, even though it's actually a bunch of technically separate communities.

[-] MagneticFusion@lemm.ee 16 points 1 year ago

Doesn't that technically make the Fediverse inefficient?

[-] Rottcodd@kbin.social 53 points 1 year ago

It's less efficient than a centralized forum would be, but efficiency isn't the only or even the highest priority. Decentralization is the explicit point of the fediverse, and to the degree that that requires sacrificing some measure of efficiency, that's just the way it goes.

The goal was to build a system that would be robust and relatively seamless while remaining decentralized. That's more or less what they've done. There's a fair amount of fine tuning and tweaking left to be done, and actively being done, but the basic system is what it is because it best balances all of the goals.

[-] Valmond@lemmy.mindoki.com 9 points 1 year ago

Also, not all data is stored on every Lemmy instance, only (or mostly, I guess) where it is relevant.

[-] ShittyKopper@lemmy.blahaj.zone 13 points 1 year ago

As long as 1 person from an instance is subscribed to a community, that person's instance will fetch everything that happens inside that community (and keep storing it even if they later unsubscribe, unless manually purged by an instance admin)

[-] Valmond@lemmy.mindoki.com 8 points 1 year ago

So not everything everywhere, but still not that efficient...

Thanks for the explanation.

[-] Rottcodd@kbin.social 2 points 1 year ago

Right, which on a side note is most of why I have accounts on a number of different instances and regularly switch between them - because each instance is at least subtly different, since they each have different userbases, and thus somewhat different sets of subscribed and thus federated communities.

[-] nils@feddit.de 2 points 1 year ago

But if you on instance.alpha subscribe to a community on instance.beta that would federate the community to your local instance, right? Is there something I'm missing?

[-] Rottcodd@kbin.social 2 points 1 year ago

Right, but of course if you don't subscribe to it (and nobody else does) then it doesn't.

So, for instance, if you go in through an account on a narrowly specialized instance, you're potentially not going to see a lot of the communities from other instances at all, even on their All, just because nobody's bothered to subscribe to them. And you'll likely see highly specialized communities that fit well with that instance that you might not see anywhere else.

The smaller the instance is, the more likely that is.

I have accounts on a couple of small instances on which I haven't even bothered to subscribe to anything, since their All already matches what I want from the instance.

[-] MrSilkworm@lemmy.world 3 points 1 year ago

that's possibly the best explanation of the difference between a corporate social or other media and a decentralised open source one.

[-] Falmarri@lemmy.world 21 points 1 year ago

It depends what you mean by inefficient. It's very efficient if you're optimizing for robustness and control of data.

[-] Corgana@startrek.website 12 points 1 year ago

It's also very efficient if you're optimizing for having an actual fucking conversation without algorithms or ads.

[-] brcl@artemis.camp 3 points 1 year ago

We’ve been trying to reach you about your car's extended warranty…

[-] lemmyingly@lemm.ee 1 points 1 year ago

Control or redundancy of data?

If there is a federated delete, I get the impression there is no actual way to ensure that data is deleted?

[-] Falmarri@lemmy.world 1 points 1 year ago

I don't mean control from the perspective of someone posting data. I mean from the perspective of the server owner.

[-] lemmyingly@lemm.ee 1 points 1 year ago

Sorry, what do you mean?

[-] r00ty@kbin.life 10 points 1 year ago

Pretty good answer but there's no periodic sync. From the moment a community is subscribed to, the instance that is home to the community will send all activities in that community to the subscribed instances as they happen.

That's why you don't see all the old content being synced, just new content (and some old content if it is liked or replied to after you subscribe).
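
A minimal Python sketch of that push model, under stated assumptions: `CommunityHost`, `subscribe` and `publish` are made-up names for illustration, not Lemmy or kbin APIs, and plain lists stand in for each subscriber's inbox.

```python
class CommunityHost:
    """Toy model of the instance that is home to a community (hypothetical)."""

    def __init__(self):
        self.subscribers = []  # inboxes of the subscribed instances

    def subscribe(self, inbox):
        # No backfill happens here: a new subscriber starts with an empty copy.
        self.subscribers.append(inbox)

    def publish(self, activity):
        # Push the activity to every current subscriber as it happens.
        for inbox in self.subscribers:
            inbox.append(activity)


host = CommunityHost()
early, late = [], []

host.subscribe(early)
host.publish("post-1")   # only `early` is subscribed at this point
host.subscribe(late)
host.publish("post-2")   # both inboxes receive this one
```

Here `late` ends up holding only the activity published after it subscribed, which is the "new content only" behaviour described above.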

[-] skankhunt42@lemmy.ca 3 points 1 year ago

Is there a single source of truth? It really sounds like split brain is possible?

All instances may have their own copy, but I imagine the instance the community is hosted on is important and needs to be up?

[-] r00ty@kbin.life 4 points 1 year ago* (last edited 1 year ago)

Well, the answer is "it depends"

For the community as a whole, I would say that the instance that hosts the community must be up to federate any new posts to other instances. Because it works a bit like:

Instance A hosts Community 01.
Instance B user posts to Community 01.
Instance B federates the post to Instance A
Instance A federates the post to Instances C and D.

So, if instance A is down, the post will exist only on instance B.
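
The steps above can be sketched in Python. This is a toy model, not kbin or Lemmy code: `Instance` and `submit_post` are made-up names, delivery is a direct method call, and the sketch ignores any queueing or retrying that real software may do when a host is unreachable.

```python
class Instance:
    """Hypothetical stand-in for one federated server."""

    def __init__(self, name, up=True):
        self.name = name
        self.up = up
        self.posts = set()      # local copies of posts
        self.subscribers = []   # only used on the community's host

    def receive(self, post):
        # A server that is down cannot accept deliveries.
        if not self.up:
            return False
        self.posts.add(post)
        return True


def submit_post(author_instance, host, post):
    # The author's own instance always keeps a copy.
    author_instance.posts.add(post)
    # The post is sent to the community's host first ...
    if not host.receive(post):
        return  # host is down, so nobody else hears about it
    # ... and the host relays it to the other subscribed instances.
    for subscriber in host.subscribers:
        if subscriber is not author_instance:
            subscriber.receive(post)


a, b, c, d = (Instance(name) for name in "ABCD")
a.subscribers = [b, c, d]

submit_post(b, a, "first post")    # A is up: A, B, C and D all store it
a.up = False
submit_post(b, a, "second post")   # A is down: only B stores it
```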

But, federating the posts and comments themselves is not the only way an instance will get posts and comments. Consider the following situation. The post above exists on instances A-D. But after it is posted, Instance E subscribes to the community. Instance E will not have the above post. They will only start getting new federation events.

However, say for example someone on instance C likes the post? The like event will be sent to Instance E. Instance E will see the like, try to find the post (the post/comment URL is included in the like event) and fail. So, it will then look up the original post. Here's where it gets interesting. That URL will not be on Instance A where the community lives, but on Instance B where it was posted. So, in this case, if Instance B is down, Instance E will not be able to fetch the post.

However, if all the instances are up, Instance E will fetch the post, add the like, and store both in its database. This is why, when subscribing to a community, you will see some old content appear but not all of it: if the old content is interacted with, it will be fetched to render the interactions.
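
A rough Python sketch of that lookup, assuming made-up names throughout: `handle_like` is hypothetical, `local_posts` stands in for Instance E's database, and `remote_instances` maps a hostname to the posts it serves, with `None` meaning the server is down. No real HTTP request is made.

```python
from urllib.parse import urlparse


def handle_like(local_posts, like, remote_instances):
    """Apply a like, fetching the liked post from its origin if it is unknown."""
    url = like["object"]
    if url not in local_posts:
        # The URL points at the instance where the post was created,
        # not at the instance that hosts the community.
        origin = remote_instances.get(urlparse(url).hostname)
        if origin is None or url not in origin:
            return False  # origin is down or has no such post
        local_posts[url] = {"body": origin[url], "likes": 0}
    local_posts[url]["likes"] += 1
    return True


post_url = "https://b.example/post/1"
like = {"actor": "https://c.example/u/someone", "object": post_url}

instance_e = {}
fetched = handle_like(instance_e, like, {"b.example": {post_url: "hello"}})

instance_e_offline = {}
failed = handle_like(instance_e_offline, like, {"b.example": None})
```

In the first call Instance E backfills the post and counts the like; in the second, with the origin down, it has nothing to attach the like to.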

This understanding is based on my understanding of kbin federation. But, I would be very surprised if lemmy did not work the same.

EDIT:

To be clear: to see what is already federated, no instance other than the one you're visiting needs to be up. For federation of live events happening in a community, the instance hosting the community must be up. And to fetch content needed for a federation event (where the referenced object was not received via federation), the instance where the content was created must be up.

[-] skankhunt42@lemmy.ca 1 points 1 year ago

Very well written! Seems easy enough, thank you!

[-] strepto@kbin.social 26 points 1 year ago

It's stored on all 4.

Regardless of which one you create the content on, assuming they are all federated with each other correctly, every instance hosts its own copy of your posts.

[-] sse450@lemm.ee 4 points 1 year ago

So, data is not normalized. Isn't it a waste of storage? Same data on all instances.

[-] maniacal_gaff@lemmy.world 9 points 1 year ago* (last edited 1 year ago)

It isn't a waste if it provides redundancy and prevents one server from being in control of all data.

What do you mean by "normalized?"

[-] sse450@lemm.ee 4 points 1 year ago

I see your point.

I used the term "normalized" in the context of databases. One piece of data should exist only once. But this contradicts your points of redundancy and control.

[-] bjornp_@lemm.ee 3 points 1 year ago

That isn't what normalized means in the context of databases.

Also, databases often store the same data many times over, for redundancy and load-balancing purposes. Really, federation just takes care of replication somewhat.

[-] maniacal_gaff@lemmy.world 2 points 1 year ago

Ok, I come from the signal processing world where that means something very different.

[-] killeronthecorner@lemmy.world 2 points 1 year ago

That's not what normalisation means in the context of databases.

[-] AbsolutelyNotABot@lemm.ee 1 points 1 year ago* (last edited 1 year ago)

The only objection I have is that the redundancy is useless: if the main server that "hosts" the community goes down, all the other copies die too, since content can't be added anymore.

There's no mechanic for orphan communities

[-] qaz@lemmy.world 3 points 1 year ago* (last edited 1 year ago)

It's mostly intended for caching content to speed up load times afaik

[-] ShittyKopper@lemmy.blahaj.zone 21 points 1 year ago* (last edited 1 year ago)

All of them. If you can see it from an instance, it's stored in that instance.

The only exception is images, which may or may not be stored depending on the exact backend software and configuration.

Both "alpha" and "beta" have the authority to hide the post from the rest of the federation (one hosts your account and the other hosts the community). Similarly, both "beta" and "gamma" have the authority to hide the comment from the federation. That said, instances can also individually hide/purge stuff from their own views without affecting the wider federation if they so choose (which is how things like .world's blocking of piracy communities works).

"beta" handles distribution/"boosting" (in masto speak) of the post and comment to other instances (however "gamma" will send it to both "alpha" and "beta" as it's a reply to "alpha"). AFAIK "alpha" and "gamma" handle the boosting of the upvotes they receive from "delta" (though I could be wrong on that part).

Oh, and "boosting" doesn't mean "i got 1 new upvote on this comment :3" it means "delta has sent me this exact Like event owned by person@delta associated to comment@gamma (and a lot of other data)". There are also keys and signatures involved to make things a bit harder to spoof.
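
For illustration, here is roughly what such a relayed event can look like, written as Python dicts. The `type`, `actor` and `object` fields follow the ActivityStreams vocabulary; every URL and the `announce` helper are invented for this example, which instance does the relaying is left as a parameter since that part is uncertain above, and the keys and signatures are left out entirely.

```python
import json

AS_CONTEXT = "https://www.w3.org/ns/activitystreams"

# The Like as created on delta: it names who liked what, not a vote count.
like = {
    "@context": AS_CONTEXT,
    "id": "https://delta.example/activities/like/1",
    "type": "Like",
    "actor": "https://delta.example/u/person",
    "object": "https://gamma.example/comment/42",
}


def announce(relaying_actor, activity, announce_id):
    """Wrap an activity unchanged so a relaying actor can pass it on (hypothetical helper)."""
    return {
        "@context": AS_CONTEXT,
        "id": announce_id,
        "type": "Announce",
        "actor": relaying_actor,
        "object": activity,
    }


boosted = announce(
    "https://beta.example/c/community",
    like,
    "https://beta.example/activities/announce/1",
)
print(json.dumps(boosted, indent=2))
```

The point is that the receiving instance gets the original Like with its original owner intact, so it can record the vote itself rather than trusting a count.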

[-] TheHottub@lemmy.world 7 points 1 year ago

Nice try, Spez! Get outta here! Go on, get!

[-] Ducks@ducks.dev 6 points 1 year ago

Both. The text data is in the database of all instances that are federated. Your account credentials are only stored on the instance you're registered to.

[-] s4if@lemmy.my.id 5 points 1 year ago

On all instances. Each instance has a copy of what happened, and every action is relayed by the community's instance (in this case, Beta) to all subscribers of the community.

[-] drekly@lemmy.world 4 points 1 year ago

So if I make an instance, I can scrape all the content and gather data on every user and sell it to Cambridge analytica?

[-] s4if@lemmy.my.id 3 points 1 year ago

Technically, yes. But if you are caught red handed, be ready for the mass ban to your account/instance.

[-] drekly@lemmy.world 3 points 1 year ago

But if it's just sat there silently data gathering, nobody would know?

[-] s4if@lemmy.my.id 2 points 1 year ago

Yup, that is what professional/corporate scrapers have done from the very beginning. The Fediverse has poor privacy, and it is designed that way. For our own safety, it is better to share only safe content on the Fediverse, and to share more private things only with people we trust, on messaging/chat apps like Matrix or by email.

[-] r00ty@kbin.life 1 points 1 year ago

What info do you think they will get? The only info is what you put in the public info on the user profile on your instance. So they can get your username (well user@instance), avatar, about info. That's about it. Anything else like email address and password hashes are only stored on the instance you signed up to.

[-] drekly@lemmy.world 1 points 1 year ago* (last edited 1 year ago)

Every single comment and post anyone ever makes, and anything they are subscribed to. Run everyone's content through some sentiment analysis, and now you have a great set of users, emails and their commonly used IPs, grouped into interests, general mood, and political leanings, perfect for advertising.

[-] r00ty@kbin.life 1 points 1 year ago

Where are you getting email and ip from?

[-] killeronthecorner@lemmy.world 1 points 1 year ago

What's my name?

this post was submitted on 20 Aug 2023
89 points (96.8% liked)

Fediverse

A community to talk about the Fediverse and all its related services using ActivityPub (Mastodon, Lemmy, KBin, etc).

If you wanted to get help with moderating your own community then head over to !moderators@lemmy.world!
