I'm actually swapping from Longhorn to Rook and Ceph right now. Longhorn has given me a lot of trouble, and it does not like a replica count of 1 at all. Sounds like you do want a NAS, though high availability is probably overkill for home use, and having a NAS gives you a single point of failure. I ended up upgrading all my nodes and keeping replication at 3, since I still wanted the high availability.
I would guess it doesn't like a replica count of 1, indeed.
And using a NAS would be a single point of failure indeed, but the way I'm using Longhorn right now already is one (if my storage node goes down, my cluster becomes unstable).
Longhorn is basically just acting like a fancy NFS mount in this configuration. It's a really fancy NFS mount that works well with Kubernetes, for things like PVC resizing and snapshots, but Longhorn isn't really stretching its legs in this scenario.
I'd say leave it, because it's already set up. And someday you might add more (non-RAID) disks to those other nodes, in which case you can set Longhorn to replicas=2 and get some better availability.
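If you do get there, the replica count is just a StorageClass parameter. A minimal sketch, assuming the stock Longhorn provisioner; the class name and values here are made up, not from this thread:

```yaml
# Hypothetical StorageClass for 2-replica Longhorn volumes.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: longhorn-2replica        # example name
provisioner: driver.longhorn.io
allowVolumeExpansion: true       # enables the PVC resizing mentioned above
parameters:
  numberOfReplicas: "2"          # one replica on each of two nodes with local disk
  staleReplicaTimeout: "30"      # minutes before a failed replica is cleaned up
```

New PVCs referencing this class would then get two replicas, while existing volumes keep whatever they were created with.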
You’re on the right track here. Longhorn kind of makes RAID irrelevant, but only for data stored in Longhorn. So anything on the host disk and not a PV is at risk. I tend to use MicroOS and k3s, so I’m okay with the risk, but it’s worth considering.
For replicas, I wouldn’t jump straight to 3 and ignore 2. A lot of distributed storage systems use 3 so that they can resolve the “split brain” problem. Basically, if half the nodes can’t talk to each other, the side with quorum (2 of 3) knows that it can keep going while the side with 1 of 3 knows to stop accepting writes it can’t replicate. But Longhorn already does this in a Kubernetes native way. So it can get away with replica 2 because only one of the replicas will get the lease from the kube-api.
Longhorn isn't just about replication (which is not backup, and RAID is not backup either).
Also if you only have one replica, is it even different from local storage at this point?
You'd use Longhorn to make sure applications don't choke and die when the storage they are using goes down. Also, I'm not sure if you can supply Longhorn storage to nodes that don't run it. I haven't tried it.
I suspect all pods that you'd define to use Longhorn would only come up on the Longhorn replica node.
All this is how I understand Longhorn works. I haven't tried it this way; my only experience is running it on every node, so if one node goes down, pods can just restart on any other node and carry on.
Yes, RAID is used for the availability of my data here; with or without Longhorn there wouldn't be much difference (especially since I only use one specific node).
And you would be right: since the other nodes have scheduling disabled, the storage will be available only on my "storage node," so if that one goes down my storage goes down.
That's why Longhorn might be overkill for me, but there are features to back up and restore to S3, for example, that I would need to set up, I guess.
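For what it's worth, the S3 backup piece in Longhorn is two parts: pointing the backup target setting at a bucket (something like `s3://<bucket>@<region>/`, with credentials in a Secret) and optionally a RecurringJob to schedule backups. A sketch, with the secret name, schedule, and retention all made up:

```yaml
# Hypothetical values throughout: secret name, schedule, and retention are examples.
# 1) Secret holding the S3 credentials that Longhorn's
#    backup-target-credential-secret setting will reference.
apiVersion: v1
kind: Secret
metadata:
  name: s3-backup-secret
  namespace: longhorn-system
stringData:
  AWS_ACCESS_KEY_ID: "<key id>"
  AWS_SECRET_ACCESS_KEY: "<secret key>"
---
# 2) Recurring backup job: daily at 03:00, keep the last 7 backups.
apiVersion: longhorn.io/v1beta2
kind: RecurringJob
metadata:
  name: daily-backup
  namespace: longhorn-system
spec:
  cron: "0 3 * * *"
  task: "backup"
  groups:
    - default        # applies to volumes in the default group
  retain: 7
  concurrency: 1
```

The backup target itself is set in the Longhorn UI or settings, so this on its own isn't enough, but it's roughly the shape of what you'd be setting up.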