Google reveals struggle to balance HDD and SSD use at scale
Storage tiering between HDD and SSD has always been a nightmare and is generally best avoided.
I'm looking forward to when SSDs aren't that much more expensive than HDDs for the same capacity. Seems HDDs have been holding their own better than I thought as of late.
Agreed. I've been planning on building a home NAS and really wish SSDs were already there.
There are fundamental problems with how SSDs work. Large-capacity flash might soon become a thing in servers, but there won't be any cost-effective SSDs in the consumer market for at least 10 years.
The problem is how operating systems access the data: they assume a page-organized, sequential disk and access it that way. SSD controllers essentially have to translate that into how the flash is organized internally (which is completely different). That translation adds latency and causes extreme fragmentation on large SSDs over time.
Instead of buying a 20 TB SSD you're much better off buying four 5 TB HDDs. You'll probably get better read and write speeds in the long run if they're configured in RAID 0. Plus, it's a lot cheaper. Large SSDs in the consumer market are possible, they just don't make any sense for performance and cost reasons.
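To make the translation point above concrete, here's a deliberately toy Python sketch (the TinyFTL class and its numbers are made up, nothing like real controller firmware) of why overwrites on flash leave stale pages behind that have to be garbage-collected later:

```python
# Toy flash translation layer: the OS addresses logical blocks as if the drive
# were a flat sequential device; the controller maps each write to whichever
# physical page happens to be free. Flash pages can't be rewritten in place,
# so every overwrite strands a stale page that garbage collection must clean
# up later, which is one source of latency and fragmentation over time.
class TinyFTL:
    def __init__(self, num_pages):
        self.mapping = {}                     # logical block address -> physical page
        self.free_pages = list(range(num_pages))
        self.stale_pages = set()              # pages holding superseded data

    def write(self, lba):
        if lba in self.mapping:
            self.stale_pages.add(self.mapping[lba])   # old copy becomes garbage
        self.mapping[lba] = self.free_pages.pop(0)    # out-of-place write

    def fragmentation(self):
        used = len(self.mapping) + len(self.stale_pages)
        return len(self.stale_pages) / used if used else 0.0

ftl = TinyFTL(num_pages=64)
for _ in range(8):
    ftl.write(lba=0)          # the OS "updates" the same block eight times
print(ftl.fragmentation())    # 0.875: most consumed pages are now stale
```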
It's about a 5:1 cost ratio these days; honestly it's pretty worthwhile to just go all NVMe when you consider the reliability, performance, and noise benefits. A RAID 5 of NVMe drives can be cheaper and faster than a RAID 1 of HDDs.
I don't think I'm adding any more hard drives to my home Ceph array at this point.
Tiering is a fundamental concept in HPC. We tier everything, starting from registers, through L1/L2 cache, NUMA-shared L3, memory, and SSD cache. It only makes sense to add HDD to the list as long as it's cost-effective.
There is nothing wrong with tiering in HPC. In fact, it's the best way to make your service cost-effective without compromising on end-user performance.
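As a rough illustration of that pattern, here's a minimal Python sketch (the TwoTierStore class is hypothetical, not any real HPC storage stack): hot items live in a small, fast tier, and whatever gets evicted falls back to a larger, cheaper tier below. The same idea repeats at every level, whether the slow tier is DRAM, SSD, or HDD.

```python
from collections import OrderedDict

class TwoTierStore:
    """Toy two-tier store: a small fast tier in front of a big cheap one."""

    def __init__(self, fast_capacity):
        self.fast = OrderedDict()   # small, expensive, low-latency tier
        self.slow = {}              # large, cheap, high-latency tier
        self.fast_capacity = fast_capacity

    def put(self, key, value):
        self.fast[key] = value
        self.fast.move_to_end(key)
        if len(self.fast) > self.fast_capacity:
            cold_key, cold_value = self.fast.popitem(last=False)
            self.slow[cold_key] = cold_value   # demote the least-recently-used item

    def get(self, key):
        if key in self.fast:
            self.fast.move_to_end(key)         # hot hit: keep it in the fast tier
            return self.fast[key]
        value = self.slow.pop(key)             # slow-tier hit: promote it
        self.put(key, value)
        return value

store = TwoTierStore(fast_capacity=2)
for k in ("a", "b", "c"):
    store.put(k, k.upper())
print(store.get("a"))   # "a" was demoted by "c", so this read promotes it back
```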