How is there a 3 day delay in federation?
I am curious how this happens from a technical perspective.
Basically, the way Lemmy is designed, each instance has to tell each other instance what its users did (where relevant: no need to send aussie.zone a post made in a community that has zero aussie.zone subscribers, for example). That includes posts, comments, and upvotes. And the way it's designed, the originating server (in this case, LW) has to send each activity to the receiving server (AZ), then the receiving server sends a confirmation back, and only then can the originating server send the next one.
Because LW is hosted in Germany and AZ in Australia, there's a minimum amount of time thanks to the physical constraints of sending signals over that distance. Double that because it's a return trip, add a small amount more for processing time, and it ends up measuring in the hundreds of milliseconds. Which leaves you with a maximum of a few hundred thousand actions sent from LW to AZ per day. If LW users are doing more than that, the delay will slowly grow. If they send less, the delay will shrink, or stay near 0.
Now, the most recent version of Lemmy actually lets you set it so that instead of sending just one at a time, you can have multiple threads, so you're sending multiple at a time. But LW only upgraded to this version a few days ago, and they didn't turn on this feature when they did so.
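To put rough numbers on that (purely illustrative; the 300 ms round trip and 100 ms processing figures below are assumptions, not measurements of LW or AZ):

```python
# Back-of-the-envelope ceiling for federated delivery between two distant instances.
# All timings are assumptions for illustration, not measured values.

ROUND_TRIP_S = 0.3       # assumed Germany <-> Australia round trip per activity
PROCESSING_S = 0.1       # assumed per-activity processing on the receiving side
SECONDS_PER_DAY = 86_400

for in_flight in (1, 4, 8):   # 1 = the old strictly serial behaviour
    per_day = in_flight * SECONDS_PER_DAY / (ROUND_TRIP_S + PROCESSING_S)
    print(f"{in_flight} in flight -> ~{per_day:,.0f} activities/day")
```

Anything above that rate just piles up in the outgoing queue, which is where a multi-day delay comes from; allowing several activities in flight raises the ceiling roughly linearly.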
Wait, every single action is sent individually and the next action is sent upon confirmation of successful delivery?
This is wild.
The wildest thing is that it has been like this for a while: https://lemmy.world/post/20575394
I guess it's good that this issue is in the process of being resolved while the network is small and primarily consists of technically minded users.
I mean, it's not being resolved, since the issue is that LW is too big to effectively federate, and LW is refusing to take the steps to improve the situation. Weirdly enough, this is less of an issue if the network scales horizontally, with a large number of small nodes.
But also, this is a symptom of the current attitude of "it shouldn't matter where the community is hosted". The fediverse is a simulacrum of centralized social media, and a poor one at that. The more we try and beat it into that shape, the more it's going to get all weird on us.
Like, a significant issue here is the insistence people have had that up/down-votes be synchronized. People want to know what the global passive-aggressive opinion on a post or comment is, rather than the local one, which requires every single button press to be sent to each and every subscribing website. And people expect stuff to be sent out as a live stream, rather than being held back for batching, too.
There's a significant cultural issue to be sorted out here. Better mechanical features aren't going to solve it in the long run.
I figure the vote traffic could be reduced a lot if even just one minute's worth of votes were batched together, although I don't think the ActivityPub standard technically includes batched activities currently
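As a sketch of the idea only (this is not how Lemmy works today, batching is not standard ActivityPub behaviour, and every name below is made up): buffer votes per destination instance and flush once a minute, so a burst of button presses becomes one delivery per instance.

```python
import time
from collections import defaultdict

FLUSH_INTERVAL_S = 60   # hypothetical one-minute batching window

class VoteBatcher:
    """Illustrative sketch: buffer outgoing vote activities per destination
    instance and send each buffer as a single delivery per flush interval."""

    def __init__(self, deliver):
        self.deliver = deliver            # callable(destination, votes) that does the actual send
        self.buffers = defaultdict(list)  # destination domain -> pending votes
        self.last_flush = time.monotonic()

    def queue_vote(self, destination, vote):
        self.buffers[destination].append(vote)
        if time.monotonic() - self.last_flush >= FLUSH_INTERVAL_S:
            self.flush()

    def flush(self):
        for destination, votes in self.buffers.items():
            if votes:
                self.deliver(destination, votes)  # one request instead of len(votes) requests
        self.buffers.clear()
        self.last_flush = time.monotonic()
```

The trade-off is that remote vote counts lag by up to the flush interval, which is exactly the live-stream-versus-batching expectation mentioned above.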
LW is too big to effectively federate
LW is like ~35% of Lemmy's 55K MAU, right? So around ~19K MAU. For the Threadiverse to be viable on a mass scale, support for an instance with ~20K MAUs is a must. I would argue it's a must for an instance with 2 million MAUs.
This includes graceful support for up/down votes.
The issue is more about the spread of active communities and their centralization on LW.
LW comments make it just fine to Aussie.zone on this community, as it's on Lemm.ee: https://aussie.zone/post/18681158/15483480
When you look at the most active communities on the platform, the vast majority of them are on LW: https://lemmyverse.net/communities?order=active
I get that, I just don't think at this particular point in time (with 55K MAU) it is viable to focus on distribution of communities.
It's not like we have 10 million MAUs and we are seeing too much centralization on LW.
Let's get to at least, say, a stable 500K MAU and then focus on decentralization.
On the other hand, it's probably easier to decentralize while we have 55K MAU rather than 500K
That's also true, can't argue with that.
Wait why didn’t they turn it on? When they upgraded I assumed this issue would be fixed.
Afaik only one item (post/comment) is synced at a time. So if there is a lot of latency between the instances, new content is produced faster than can be synced. But I'm sure someone will link the GH issue if I'm wrong
if the requests are all serialized, if the ping time is like 300ms, and if each request takes like 100ms of CPU time
that means you only need 648,000 actions in queue to equal 3 days
when you consider that even upvotes/downvotes of posts/comments count as actions, I could see it happening
but the queue isn't completely serialized anymore, so maybe this number is still a bit unbelievable (EDIT: seems like LW has not yet enabled the feature for parallel sending)
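For what it's worth, the arithmetic checks out under those assumptions (the 300 ms and 100 ms figures are the guesses above, not measurements):

```python
# Convert a backlog of queued actions into a delay, using the assumed timings above.
PING_S = 0.3          # assumed round-trip latency per request
PROCESSING_S = 0.1    # assumed per-request processing time
backlog = 648_000     # queued actions

delay_days = backlog * (PING_S + PROCESSING_S) / 86_400
print(f"~{delay_days:.1f} days behind")   # -> ~3.0 days
```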
A single action cannot take 100 ms of CPU time. This does not sound realistic.
And why is this with aussie.zone only? This would be all instances, no?
Recording a vote from an already-known account takes way less than 100 ms, yes. Usually handling an activity takes less than 100 ms, but sometimes it can take several seconds.
When an activity is received, its cryptographic signature needs to be checked, and that means sending a network request to the creator's instance to retrieve the creator's public key (plus their profile pic and cover pic, each of which is another network request; those images then have to be resized and stored).
Images in posts need to be downloaded, resized, scanned for objectionable material, etc.
Every network request is quite unpredictable as many instances are overloaded or poorly configured.
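To make the extra round trips concrete, here's a minimal sketch of just the key-fetching step (the username is a placeholder, and the field names follow common ActivityPub conventions; check them against a real response from your instance):

```python
# One of the network requests described above: before an incoming activity's
# signature can be verified, the sender's actor document has to be fetched to
# get their public key. The username below is a placeholder.
import requests

actor_url = "https://lemmy.world/u/some_user"   # hypothetical actor
resp = requests.get(actor_url,
                    headers={"Accept": "application/activity+json"},
                    timeout=10)
resp.raise_for_status()
actor = resp.json()

print(actor["publicKey"]["publicKeyPem"][:52], "...")                  # used for signature checks
print("avatar to download/resize:", actor.get("icon", {}).get("url"))
```

If that remote instance is overloaded, every one of these lookups can stall the pipeline for that activity.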
Aussie.zone might be the most extreme case, since its physical location is the farthest away from the largest Lemmy instance
@[email protected] @[email protected] @[email protected] , we're trying our best to use non-LW communities
What does this mean? Does it mean that a post from LW needs 2.98 days to appear on aussie.zone?
I don't know if there's any prioritising of types of actions (e.g. posts given higher priority than comments, which are higher than up/down votes), but when I made posts on lemmy.world I'd get an initial wave of upvotes from other instances and then, like 12 hours later, a lot of upvotes from lemmy.world users
So like time capsules? I wrote this comment in my sunny office on Friday the 21st of April at 16:35 CEST. Spring is coming and the long awaited sun is lovely.
I hope you in the future have had a wonderful weekend.
No, your comment went to Aussie.zone directly as this is a lemm.ee community: https://aussie.zone/post/18681158/15483480
Aw, damn!
Yup, exactly so. I'll often notice a reply to one of my comments from a LW user will only show up in my inbox about 3 days later.
And you know what? I don't feel like I'm missing anything. Lots of posts and comments to keep me busy for hours.
Ironically, the whole Lemmy activity makes it to you, just three days later
How is that ironic?
Maybe not irony, but it is funny: OP said "I don't feel like I'm missing anything!" and it turns out they're not missing anything, it's just delayed
True! Just outdated
Is this how long it takes for posts to federate? Why would it be so long?
I thought after LW upgraded the delays would go down :|
It should have. The backend added parallel syncing a few versions back. By all accounts it should be going down.
The final bullet point of https://lemmy.world/post/23471887 suggests that that feature may not have been enabled yet.
That would be very unreasonable imho. It's the most important feature for them.
Perhaps they wanted to change as few things as possible at once, so they could assess the upgrade before turning on any optional features. I would imagine they will get to it fairly quickly.
Parallel sending of federated activities to other instances. This can be especially useful for instances on the other side of the world, where latency introduces serious bottlenecks when only sending one activity at a time. A few instances have already been using intermediate software to batch activities together, which is not standard ActivityPub behavior, but it allows them to eliminate most of the delays introduced by latency. This mostly affects instances in Australia and New Zealand, but we’ve also seen federation delays with instances in US from time to time. This will likely not be enabled immediately after the upgrade, but we’re planning to enable this shortly after.
These are great charts! Are these from the standard Prometheus endpoints in Lemmy? What queries do you use?
They're using my federation exporter as the data source.
It just scrapes the federation API every 5 minutes for each included instance and then visualizes that.
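If you'd rather roll your own, the raw data comes from Lemmy's public federated_instances endpoint; newer versions include per-instance federation queue state, though the exact fields vary by version, so the sketch below just prints whatever is there:

```python
# Poll a Lemmy instance's federated_instances API every 5 minutes and print
# the federation state it reports for each linked instance.
# Field names vary between Lemmy versions, so nothing specific is assumed here.
import time
import requests

INSTANCE = "https://aussie.zone"   # example instance to monitor
POLL_INTERVAL_S = 300              # 5 minutes, matching the exporter above

while True:
    data = requests.get(f"{INSTANCE}/api/v3/federated_instances", timeout=30).json()
    for inst in data["federated_instances"]["linked"]:
        state = inst.get("federation_state")
        if state:
            print(inst.get("domain"), state)
    time.sleep(POLL_INTERVAL_S)
```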
Thank you! I'm going to set this up for my instance!
Any idea how aussie.zone is hosted? Syncing can become bottlenecked as instances grow and discover more communities. If they're using a remote host, they may need to upgrade or adjust their setup to keep in sync with the larger instances.