
Sure! First up, we don't do repairs every time one host goes down. Standard practice on the network is to wait to do a repair until a full 25% of the redundancy is missing (in 64-of-96, that would be 8 hosts offline). Then you repair all 8 at once, significantly reducing the total amount of repair traffic.
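
To make the threshold concrete, here's a minimal sketch of that lazy-repair policy in Python (the names and structure are illustrative, not Sia's actual code):

    DATA_PIECES = 64
    TOTAL_PIECES = 96
    REDUNDANCY = TOTAL_PIECES - DATA_PIECES   # 32 extra pieces per chunk
    REPAIR_THRESHOLD = REDUNDANCY // 4        # repair once 8 pieces are offline

    def needs_repair(online_pieces: int) -> bool:
        missing = TOTAL_PIECES - online_pieces
        return missing >= REPAIR_THRESHOLD

    def pieces_to_rebuild(online_pieces: int) -> int:
        # Rebuild everything that's missing in one batch, amortizing the
        # repair traffic instead of repairing after every single host outage.
        return TOTAL_PIECES - online_pieces if needs_repair(online_pieces) else 0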

But secondly, offline doesn't usually mean dead and gone: with unstable datacenters like this, hosts are usually back online before the user has lost a full 25% of their redundancy.

Row-level and rack-level outages are handled by data randomization. The entire Sia system depends heavily on probabilistic techniques, on both the renting and hosting sides. A row-level failure will take out some of your pieces, but because placement is randomized, nobody should be disproportionately impacted by a single cluster failure.
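
A toy simulation of that randomization argument (the host and rack counts here are invented for illustration): place each chunk's pieces on uniformly random hosts and see what a whole-rack failure costs any one chunk.

    import random

    NUM_HOSTS = 1000
    RACK_SIZE = 40                        # hosts per rack (assumed)
    PIECES_PER_CHUNK = 96

    failed_rack = set(range(RACK_SIZE))   # rack 0 goes down
    losses = []
    for _ in range(10_000):
        placement = random.sample(range(NUM_HOSTS), PIECES_PER_CHUNK)
        losses.append(sum(1 for host in placement if host in failed_rack))

    # With 4% of hosts down, a chunk loses ~3.8 of its 96 pieces on average,
    # well under the 32-piece redundancy; losses spread evenly across chunks.
    print(sum(losses) / len(losses), max(losses))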

On Sia, each piece is at a different site. So 64-of-96 implies that each chunk of data (96 pieces to a chunk) is located in 96 different places. This doesn't help with the geo-bandwidth problem, but as discussed above there are other techniques to handle that.
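
Back-of-envelope math for why 96 independent sites works (my own numbers, assuming independent host failures): a chunk stays readable as long as at least 64 of its 96 pieces are online.

    from math import comb

    def chunk_unavailability(k: int, n: int, p: float) -> float:
        """P(fewer than k of n independent pieces are online)."""
        return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k))

    # Even at only 95% availability per host, losing access to a 64-of-96
    # chunk requires 33 simultaneous outages: on the order of 1e-18.
    print(chunk_unavailability(64, 96, 0.95))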

Surprisingly, bandwidth pricing on the Sia network is even cheaper than storage pricing relative to centralized competition. That's a lot harder to model at scale though, so we aren't as confident the Sia bandwidth pricing will hold up at $1 / TB in the long term.

And technically, most of this stuff is customizable per customer. If your particular use case has a different optimal parameterization, it's fairly easy to tune your client to suit your needs.
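
For instance, a client might expose the code parameters as a simple knob (the field names below are hypothetical, not Sia's actual API):

    from dataclasses import dataclass

    @dataclass
    class RedundancyParams:
        data_pieces: int    # pieces needed to reconstruct a chunk
        total_pieces: int   # pieces actually stored

        @property
        def expansion_factor(self) -> float:
            # Bytes stored on the network per byte of user data.
            return self.total_pieces / self.data_pieces

    archival = RedundancyParams(64, 96)  # wide code: survives flaky hosts
    hot_data = RedundancyParams(8, 12)   # narrow code: fewer hosts per read
    print(archival.expansion_factor, hot_data.expansion_factor)  # 1.5 and 1.5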




So, is this it? The Dropbox designer challenges the project broadly and gets the top comment, the author refutes, and we leave it at that?

I mean, this is basically the moment where I would expect every systems designer on HN to come out of the woodwork and crush Sia into the ground, if there were, in fact, any ground to crush Sia into.

Is this actually legit? If so, where is the rejoicing? What am I missing?


No, the author's idea is OK and he's mostly right on the cost part. The configuration and the numbers are a bit off and unrealistic: you won't get availability as low as 95% per site, due to other economic and technological constraints. You'll get at least 99%, but probably closer to three nines per site, so 64-of-96 won't be necessary at all (something like 8-of-12 could be enough). The Dropbox designer is just ignorant, biased, and conditioned to the US market and environment, but he appeals to authority, so people upvote his bad comment. I do storage too, on a smaller scale than Dropbox of course, and not in the US, but it is distributed and the cost is already lower than what you see in the title.
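
A quick sanity check of that claim (same binomial math as in the thread above, assuming independent sites): at three nines per site, a narrow 8-of-12 code is already extremely durable, and it costs the same 1.5x expansion as 64-of-96.

    from math import comb

    def unavailability(k: int, n: int, p: float) -> float:
        return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k))

    print(unavailability(8, 12, 0.999))  # ~8e-13: needs 5 simultaneous outages
    print(unavailability(64, 96, 0.95))  # ~4e-18: the wide-code configuration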


The fact that nobody is able to prove it's a bad idea doesn't necessarily mean it's a good one. There might still be other downsides that haven't been considered, some of which could be solvable with more development work and some not.

At this point the cautious skeptic will be thinking "hmm, maybe there's something to this", not necessarily full-on rejoicing.

That said, I agree it does seem promising. If you ever find yourself in need of cheap cloud storage it wouldn't hurt to look into Sia as a possible option.


The Dropbox designer name-dropped "64-of-96 RS encoding" as if they're the only person who has heard of, or dealt with, Reed-Solomon encoding before, and expected the author to get scared off. There is, in the case of Dropbox, plenty of ground to crush Sia into. That is the ground between 95% and multiple nines of availability.

Engineering is about tradeoffs. I could build a network as good as Google's with infinite money, infinite time, and infinite help. I could design a product as beautiful as Apple's with the same lack of limitations. Unfortunately for me, I have limited money, limited time, and limited help. Every systems designer understands that innately, so nobody is rushing out of the woodwork just because Sia and Dropbox have chosen different tradeoffs. That one of them has IPO'd is uninteresting in the abstract. It's just money, after all.


No, the author mentions "64-of-96" in the post as well. I don't think kmod meant to do what you said.


That sounds incredibly energy-inefficient. On average you have 12.5% of servers running but not contributing and possibly incurring load on other nodes.


12.5% overhead isn't that much. It's roughly what the networking gear alone can eat in a data center (around 12% of all non-cooling-related power draw).

Reed-Solomon encoding adds 50% if you want 3 blocks per 2 data blocks. Replication (not relevant here, since this is a low-throughput use case, but necessary if you want to sustain high read throughput) adds at least 200% (for 3x replication, which I think should be the minimum).
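
Spelling out that arithmetic: a scheme storing n total blocks per k data blocks adds (n - k) / k overhead.

    def overhead(k: int, n: int) -> float:
        return (n - k) / k

    print(overhead(2, 3))    # RS with 3 blocks per 2 data blocks: 0.5 -> +50%
    print(overhead(1, 3))    # 3x replication (1 data block, 2 copies): +200%
    print(overhead(64, 96))  # the 64-of-96 scheme upthread: also +50%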


12.5% is far from the total overhead, just an additional compounding factor.


Turn 'em off if they're not contributing.


Spinning iron doesn't like start/stop cycles. Server drives go through very few in their entire life for that very reason.


These are SSDs.


They are not, see the link in the article: https://www.amazon.com/HGST-7-2K-SATA-Drive-Model/dp/B07XQRB...


Oops. Thanks.


87.5% efficient is not incredibly inefficient.


Compared to what though? How efficient is a typical data center? Probably way less than this.



