Can a System Handle Brown/Blackouts on only the GPU?
I am planning to build a multipurpose home server. It will be a NAS, virtualization host, and have the typical self-hosted services. I want all of these services to have high uptime and be protected from power surges/blackouts, so I will put my server on a UPS.
I also want to run an LLM server on this machine, so I plan to add one or more GPUs and pass them through to a VM. I do not care about high uptime on the LLM server. However, this of course means that I will need a more powerful UPS, which I do not have the space for.
My plan is to get a second power supply to power only the GPUs. I do not want to put this PSU on the UPS. I will turn on the second PSU via an Add2PSU.
In the event of a blackout, this means that the base system will get full power and the GPUs will get power via the PCIe slot, but they will lose the power from the dedicated power plug.
Obviously this will slow down or kill the LLM server, but will this have an effect on the rest of the system?
I think the safe option would be to use a smart UPS and Network UPS Tools (NUT) to shut down the LLM virtual machine when the UPS is running on battery. I do something similar with my NAS, which runs on an older Dell R510: when the UPS goes onto battery, it safely shuts down that whole machine to extend how long my networking gear stays powered.
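To make that suggestion concrete, here is a minimal sketch of the idea in Python, not a tested setup. It polls NUT's `upsc` client for the `ups.status` variable and asks the hypervisor to shut the VM down once the `OB` (on battery) flag appears. The UPS name `myups@localhost` and the VM name `llm-vm` are made up for illustration, and it assumes the VM is managed by libvirt so that `virsh shutdown` applies; the original post does not say which hypervisor is in use.

```python
import subprocess
import time


def parse_status(raw: str) -> set[str]:
    """Split a NUT ups.status string such as 'OB DISCHRG' into its flags."""
    return set(raw.split())


def on_battery(flags: set[str]) -> bool:
    """NUT reports 'OB' while on battery and 'OL' while on line power."""
    return "OB" in flags


def read_ups_status(ups: str) -> set[str]:
    """Query one variable from the NUT server using the upsc client."""
    result = subprocess.run(
        ["upsc", ups, "ups.status"],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_status(result.stdout)


def shutdown_vm(domain: str) -> None:
    """Request a clean guest shutdown (assumes a libvirt-managed VM)."""
    subprocess.run(["virsh", "shutdown", domain], check=True)


def watch(ups: str, domain: str, interval_s: float = 10.0) -> None:
    """Poll the UPS and shut the VM down the first time it is on battery."""
    while True:
        if on_battery(read_ups_status(ups)):
            shutdown_vm(domain)
            return
        time.sleep(interval_s)
```

Calling `watch("myups@localhost", "llm-vm")` would start the polling loop with those hypothetical names. In practice NUT's own `upsmon` can run a command when the UPS goes on battery, which is probably the more robust route than a hand-rolled poller.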
I've wanted to implement something like that with my 1920R UPS for my rack but haven't found the time to commit to antiquated hardware.
It was enough of a hassle dealing with the expired SSL certs on the management card, let alone getting software running on one of my machines to communicate with the UPS.
All things considered, my two servers draw around 60 W on average when idling, not taking into account my PoE cameras or other devices. The UPS should run for over a day without getting close to draining its batteries (I have a half-populated EBM too).
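For anyone wanting to sanity-check a runtime claim like this, a rough back-of-the-envelope sketch: runtime is approximately usable battery energy divided by load. The battery capacity and inverter efficiency in the usage note are placeholders, not figures for the 1920R or its EBM, and real runtime also depends on battery age and discharge rate.

```python
def estimated_runtime_hours(
    battery_wh: float,
    load_w: float,
    inverter_efficiency: float = 0.85,
) -> float:
    """Rough UPS runtime: usable battery energy divided by the load.

    battery_wh and inverter_efficiency are assumptions supplied by the
    caller; this ignores battery aging and discharge-rate effects.
    """
    if load_w <= 0:
        raise ValueError("load_w must be positive")
    if not 0 < inverter_efficiency <= 1:
        raise ValueError("inverter_efficiency must be in (0, 1]")
    return battery_wh * inverter_efficiency / load_w
```

For example, `estimated_runtime_hours(1800, 60, 0.8)` works out to about a day for a hypothetical 1800 Wh of battery at a 60 W load.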
Honestly, you should just bypass Dell's management software and use NUT. It supports your UPS's management card if you enable SNMP, or you can bypass the card altogether and run it over USB/serial.
I'm pretty surprised I can run my whole network for an hour off my 1500 VA UPS with three switches and a handful of PoE devices. I'm still thinking about replacing it with a rack-mount unit so I can lock it inside my rack, as I've been having issues with unauthorized people messing with it.
Yeah, NUT is the package I've been looking at, and it looks decently integrated into NixOS. It's just that getting around to configuring it is another time sink.
My 1920R and an unused 15A Dell rackmount (h967n maybe) came with my rack. I've got no reason to have two UPSes running, nor do I want to replace the batteries in another UPS or have a 15A socket installed in the house just yet. But man, it's tempting to piggy-back it off the 1920R for shits and giggles.
Waiting on some parts to arrive from AliExpress - once they arrive I'll be able to decommission one of my servers and have all my services running off one board.