Lemmy.ca's Main Community @lemmy.ca m-p{3} @lemmy.ca 4 mo. ago

About the outages (July 1st and 2nd 2024)

Happy Canada Day everyone.

Regarding last night's outage: we restarted the Lemmy services, but we're still trying to find the root cause. The logs point to an out-of-memory issue, but that doesn't match what we see in our monitoring console.

In the meantime, we will monitor the service more closely until we are confident the issue is resolved, and we will improve our tools to detect such a problem faster.

EDIT: It also happened on the night of July 2nd. We're still trying to find the root cause.

Apologies for the extended downtime.

15 comments

Hello,

Good luck with the troubleshooting!

As I suggested elsewhere, could you maybe set up a "status" community on another instance (e.g. sh.itjust.works, which is Canadian as well), so that people can go there to see updates about potential outages?
- We already have https://status.lemmy.ca in place, and we'll try to keep it as up to date as possible.
  
  Indeed, but I was thinking more of a Lemmy community where people could discuss the outage and share information with each other.
  
  I actually stumbled upon someone asking a question on Reddit, and it would have been useful to have a place to point this person to: https://old.reddit.com/r/Lemmy/comments/1dtj4hc/can_anyone_help/
  
  The alternative is a Matrix room, but that might take more time to set up than a community on another instance.
  
  I'm looking at making a custom Cloudflare error page that embeds the status page. That way we'll at least be able to post updates there when something happens, without people having to guess where to go.
  
  Sounds great, thanks!
