Update on federation issues with Lemmy.world

A while back I made a post about Lemmy.world federation and a lot has changed since then so I thought I'd do an update post.

TL;DR Lemmy.world posts, comments, votes coming to lemmy.nz have been delayed by a gradually increasing amount over recent months, peaking at about 4 days behind, but we should be back on track soon.

Background

Check the above post for some background, but TL;DR: our server is in Auckland, NZ and Lemmy.world's server is in Helsinki, Finland. We are about as far apart from each other as we can get. Because Lemmy can currently only send one action at a time (a post, comment, or vote), and it takes around 1/5 of a second to make the round trip, we can only accept about 4 or 5 actions per second. Lemmy.world is now creating more than this on average, which means we have been falling further and further behind.
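To put numbers on that limit, here is a rough sketch of the arithmetic. The 0.2 second round trip is the figure from above; the incoming rate of 5.5 actions per second is a made-up number purely for illustration, and the function names are my own.

```python
# Rough sketch: throughput ceiling when actions are sent strictly one at a time.
ROUND_TRIP_SECONDS = 0.2  # "around 1/5 of a second", from the post
SECONDS_PER_DAY = 86_400


def max_actions_per_second(round_trip_seconds: float) -> float:
    """Each action must finish its round trip before the next one is sent."""
    return 1.0 / round_trip_seconds


def backlog_growth_per_day(incoming_per_second: float, round_trip_seconds: float) -> float:
    """Actions per day that pile up when the sender creates more than we can accept."""
    ceiling = max_actions_per_second(round_trip_seconds)
    surplus = max(0.0, incoming_per_second - ceiling)
    return surplus * SECONDS_PER_DAY


if __name__ == "__main__":
    print(max_actions_per_second(ROUND_TRIP_SECONDS))
    # 5.5 actions/second is a hypothetical rate, not a measured one.
    print(backlog_growth_per_day(5.5, ROUND_TRIP_SECONDS))
```

Even a small surplus over the ceiling adds up to tens of thousands of queued actions per day, which is why the backlog kept growing.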

Pre-fetcher

After that post, things continued to get worse. While hanging out on Lemmy Matrix rooms discussing the problem, someone (I'm not actually sure of their Lemmy account) offered to set up a pre-fetcher. Roughly how it works is it monitors lemmy.world for new posts and comments, then sends a request to lemmy.nz to see that content. Lemmy.nz then requests it from lemmy.world because it doesn't exist.

This helps because when lemmy.world sends lemmy.nz a post or comment, lemmy.nz then needs to make other requests. For example, it might not know about the user, so it needs to request the user from their home instance, or it may need to generate a thumbnail. Because the posts are pre-fetched, when lemmy.world sends its normal outbound federation, lemmy.nz already has the content and can move straight on to the next activity. We can't pre-fetch everything, notably votes, so for a long time lemmy.world posts have shown zero votes until the federation activities start coming through. Unfortunately, comments from lemmy.world users on lemmy.nz posts can't be pre-fetched either, so they were still taking a long time to come through.
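A toy model of why this helps, not the actual pre-fetcher (I haven't reproduced its code here). The `Receiver` class and its methods are hypothetical names; the only idea it demonstrates is that content already known to the receiving instance doesn't trigger slow follow-up requests when the real federation activity arrives.

```python
from dataclasses import dataclass, field


@dataclass
class Receiver:
    """Toy stand-in for the receiving instance (lemmy.nz in this post)."""

    known: set = field(default_factory=set)
    slow_lookups: int = 0  # follow-up requests made while the sender is waiting

    def prefetch(self, object_id: str) -> None:
        # The pre-fetcher asked to see this object, so it is fetched ahead of time,
        # outside of the one-at-a-time federation queue.
        self.known.add(object_id)

    def receive(self, object_id: str) -> None:
        # Normal inbound federation: unknown content means extra requests
        # (author lookup, thumbnail, ...) before the next activity can be sent.
        if object_id not in self.known:
            self.slow_lookups += 1
            self.known.add(object_id)


posts = ["post-1", "post-2", "post-3"]  # made-up identifiers

cold = Receiver()
for p in posts:
    cold.receive(p)

warm = Receiver()
for p in posts:
    warm.prefetch(p)
for p in posts:
    warm.receive(p)

print(cold.slow_lookups, warm.slow_lookups)
```

In this model the cold receiver does one slow lookup per post while the warm one does none, which is the effect the pre-fetcher is after.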

Prior to this pre-fetcher being turned on, Lemmy.nz was doing a lot worse than aussie.zone (as seen in the above post). Not long after turning it on, we were in better shape than aussie.zone, but unfortunately we were both still getting worse.

This is the state we have been in until yesterday. Gradually things were getting worse and worse until we were at about 4 days behind lemmy.world, so if a lemmy.world user posted on one of our posts then it took 4 days to show up (you might have noticed this if you got a notification of a reply to your post or comment that then said it was from days ago).

Batcher

The same user who created the pre-fetcher was also working on a batching process. The basic idea is that instead of lemmy.world sending each item halfway across the world, you add an extra server that is hosted close to lemmy.world. Lemmy.world sends its federation items to that server, which collects them into one batch and sends it to some software running on the lemmy.nz server. That software then unbundles the batch into separate items again and feeds them into Lemmy.

The idea here is that you greatly reduce the lag. Lemmy.world gets a very quick response from the extra server and so can send the next activity almost straight away. The software that passes it to lemmy.nz is on the same server as lemmy.nz, so communication is very quick. And collecting up the items into a batch for the trip across the world saves a lot of time in back and forths, so we can keep our goal of receiving things in the correct order while also not having to send one at a time. Receiving in the correct order is important. For example, if you accidentally downvoted then quickly changed it to an upvote, you wouldn't want another instance to receive the upvote first and then the downvote, as it would show you downvoted instead of upvoted.
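Here is a minimal sketch of the bundle/unbundle idea, assuming a plain JSON list as the batch format. That format, the function names, and the activity fields are my own illustration; the real batcher may well work differently. It also shows why order matters, using the downvote-then-upvote example.

```python
import json
import math


def bundle(activities: list[dict]) -> bytes:
    """Near the sender: pack queued activities into one payload, keeping their order."""
    return json.dumps(activities).encode("utf-8")


def unbundle(payload: bytes) -> list[dict]:
    """Next to the receiver: split the payload back into activities in the same order."""
    return json.loads(payload.decode("utf-8"))


def apply_votes(activities: list[dict]) -> dict:
    """Replay votes in order; the last vote per (user, post) wins."""
    votes = {}
    for activity in activities:
        if activity["type"] == "vote":
            votes[(activity["user"], activity["post"])] = activity["score"]
    return votes


def long_distance_round_trips(activity_count: int, batch_size: int) -> int:
    """How many trips across the world are needed for a given batch size."""
    return math.ceil(activity_count / batch_size)


# Hypothetical activities: an accidental downvote, quickly changed to an upvote.
queue = [
    {"type": "vote", "user": "alice", "post": 1, "score": -1},
    {"type": "vote", "user": "alice", "post": 1, "score": 1},
]

in_order = apply_votes(unbundle(bundle(queue)))
out_of_order = apply_votes(list(reversed(queue)))

print(in_order[("alice", 1)], out_of_order[("alice", 1)])
print(long_distance_round_trips(1000, 1), long_distance_round_trips(1000, 100))
```

Replayed in order, the vote ends up as an upvote; replayed out of order, it ends up as a downvote. The batch sizes in the last line are arbitrary, but they show how batching cuts the number of long-distance trips without giving up ordering.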

I set this batcher up over the weekend (with a lot of help!) and turned it on yesterday morning, once testing had been completed and I could get lemmy.world to redirect their lemmy.nz traffic to this new server. This was done through a change in something called the hosts file; long story short, it tells your server "ignore what anyone says, lemmy.nz is actually over here".

In the last 24 hours or so we have got from 1.5 million activities behind to about 970k activities behind. This puts us at about 2.3 days behind now, a huge improvement!
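For anyone who wants to check the maths, here is a back-of-the-envelope sketch using only the figures above. It assumes both lemmy.world's activity rate and our catch-up rate stay constant, which they won't exactly, so treat the results as rough estimates rather than a forecast.

```python
# Figures from the post.
BACKLOG_BEFORE = 1_500_000  # activities behind when the batcher was turned on
BACKLOG_NOW = 970_000       # activities behind roughly 24 hours later
DAYS_BEHIND_NOW = 2.3
HOURS_ELAPSED = 24          # "the last 24 hours or so"

SECONDS_PER_DAY = 86_400

# How fast the backlog is shrinking, net of new activities still arriving.
net_catch_up_per_day = (BACKLOG_BEFORE - BACKLOG_NOW) * 24 / HOURS_ELAPSED

# If 970k activities correspond to 2.3 days of lemmy.world activity,
# this is the implied rate at which lemmy.world generates them.
incoming_per_day = BACKLOG_NOW / DAYS_BEHIND_NOW
incoming_per_second = incoming_per_day / SECONDS_PER_DAY

# Time left until the backlog reaches zero at the current pace (assumed constant).
days_until_caught_up = BACKLOG_NOW / net_catch_up_per_day

print(f"net catch-up: {net_catch_up_per_day:,.0f} activities/day")
print(f"implied incoming rate: {incoming_per_second:.2f} activities/second")
print(f"estimated time to catch up: {days_until_caught_up:.1f} days")
```

The implied incoming rate comes out at a little under 5 activities per second, which lines up with the 4 or 5 per second limit described in the background section, and at the current pace the remaining backlog would clear in a bit under two days.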

Pictures

Graph of activities behind lemmy.world

Here we have lemmy.nz in yellow and aussie.zone in green. The hump is from a large number of activities generated on the lemmy.world side. They didn't need federating, but the way this is measured means they show up until it's worked out that they aren't needed.

You can see a sharp fall after the batcher was turned on yesterday.

Graph showing aussie.zone gradually increasing from 700k 30 days ago reaching 2.6 million activities behind lemmy.world, with lemmy.nz starting at 600k behind, reaching 1.5 million, then dropping sharply over the last day to about 970k behind

Graph of time behind lemmy.world

Same colours, lemmy.nz in yellow underneath and aussie.zone in green on top. This one shows how long the delay is, or more accurately it looks at the last activity that was received from lemmy.world and checks what time that activity actually happened. So if the last activity was a comment from 4 days ago, it shows 4 days here.

Graph showing a similar shape to the last one, starting at around 1.7 days behind for aussie.zone and growing to over 6 days behind. Lemmy.nz starts at around 1.3 days behind, grows to about 4 days behind at the peak about 24 hours ago, then starts dropping sharply down to about 2.4 days behind currently

Conclusion

So that's a breakdown of everything that has happened over the last few months. Hopefully this new batching process will bring us back in line with lemmy.world. If you see anything weird happening, please let me know!

Also this is a shout out to all the people who made this happen, and who are building all sorts of tools that we use. We have a selection of different front-end websites you can access lemmy through, an automod, prefetcher, batcher, and all sorts of help from others! I definitely couldn't do this stuff without help 😆

And as always, if you have any questions or want more detail on any of this, feel free to ask!

Edit:

We are now up to date! Yay!

Graph as above but now showing sharp drop in activities behind in recent days right down to 0

Graph as above but now showing sharp drop in time behind in recent days right down to 0
