This has happened several times to my Pi-Hole. Even with backups, trying to get my network back online still takes too long. I haven't found a good solution for resilience yet.
Honestly, something that critical probably shouldn't run on an RPi. There are plenty of cheap used thin clients you can buy on eBay that have better performance and reliability. I personally like the ThinkCentre micros, but Dell and HP have good options too.
Pis can be supremely reliable when used correctly for the purpose. E.g. use high-quality SD cards and don't write to them much, or a good-quality SSD if you have to do significant writes, use an official or better PSU, etc. My oldest 4 is from 2019 and it's been in continuous use since then. It used to be a NAS running a 2-disk mirror exported over NFS. These days it's a gigabit OpenWrt router with SQM. It's still on the original SD card.
Try enabling overlayfs via raspi-config. I've been running some Raspberry Pis for years with that (mostly at offsite locations where fixing dead SD cards is not possible).
Updating the pis is a little more work but in some use cases it's worth it
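If you want to confirm the overlay is actually active after a reboot, here's a rough sketch. It assumes that with the overlay enabled the root entry in /proc/mounts shows the filesystem type "overlay"; the function name root_is_overlay is just something I made up for illustration.

```python
def root_is_overlay(mounts_text: str) -> bool:
    """Return True if the mount entry for "/" has filesystem type "overlay".

    Assumption: mounts_text has the /proc/mounts layout, i.e. whitespace
    separated fields: device, mount point, fs type, options, dump, pass.
    """
    fstype = None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 3 and fields[1] == "/":
            # Later entries for the same mount point shadow earlier ones.
            fstype = fields[2]
    return fstype == "overlay"
```

Feed it the contents of /proc/mounts (e.g. open("/proc/mounts").read()). If it returns False, writes are going straight to the SD card.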
I think something like Btrfs might be a better solution, as overlayfs seems to freeze the system image state. Something that is copy-on-write (COW) seems like it would be more resilient and still provide an RW file system. Doing it right would probably mean combining the two: Btrfs for the data partition and overlayfs for the system image partition.
Yeah, that sounds like a good solution. I think Arch-based PiKVM does something similar (no reboot necessary to enable RW).
For those Pis that need to write stuff, I usually mount a network drive and use that while having the overlayfs enabled. So far I haven't had any issues; only one Pi died, after 3 years, due to a faulty power supply.
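One thing worth checking with this setup is that the directory you write to really sits on the network mount, since if the share fails to mount, writes go to the overlay and vanish on reboot. A minimal sketch, assuming the /proc/mounts layout and that the share shows up as nfs, nfs4 or cifs; the names fstype_for_path, on_network_share and the /mnt/data path are hypothetical:

```python
import posixpath

# Assumption: these are the filesystem types your network share mounts as.
NETWORK_FS = {"nfs", "nfs4", "cifs"}


def fstype_for_path(path: str, mounts_text: str):
    """Return the fs type of the longest mount point containing path."""
    path = posixpath.normpath(path)
    best_mp, best_type = "", None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue
        mp, fstype = fields[1], fields[2]
        # Match the mount point itself or anything below it, on a
        # path-component boundary ("/mnt/data" must not match "/mnt/database").
        if path == mp or path.startswith(mp.rstrip("/") + "/"):
            if len(mp) >= len(best_mp):
                best_mp, best_type = mp, fstype
    return best_type


def on_network_share(path: str, mounts_text: str) -> bool:
    """True if path is backed by one of the NETWORK_FS filesystem types."""
    return fstype_for_path(path, mounts_text) in NETWORK_FS
```

Call on_network_share("/mnt/data/logs", open("/proc/mounts").read()) before the service starts writing, and bail out if it returns False.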