For certain workloads these storage pods are much, much cheaper than S3: anything where you are storing files that rapidly become stale but still need to be instantly accessible for the rare random request.
I've only looked at it from the perspective of video files, though. Where I work we add a gigabyte of data per user per month. Eventually our S3 storage bill is going to be our largest cost due to compounding growth.
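To put rough numbers on that growth, here is a minimal sketch. The user counts and the price per GB-month are hypothetical placeholders, not actual S3 pricing; the only figure taken from the comment above is the one gigabyte per user per month.

```python
def monthly_storage_bills(users_per_month, price_per_gb_month,
                          gb_per_user_per_month=1.0):
    """Return the storage bill for each month, given the user count per month.

    Data is never deleted, so each month's bill covers everything
    uploaded so far, not just that month's uploads.
    """
    stored_gb = 0.0
    bills = []
    for users in users_per_month:
        stored_gb += users * gb_per_user_per_month   # new uploads this month
        bills.append(stored_gb * price_per_gb_month)  # pay for the whole pile
    return bills


# Hypothetical: a flat 100 users and a made-up price of 0.5 per GB-month.
bills = monthly_storage_bills([100, 100, 100], price_per_gb_month=0.5)
```

Even with a flat user count the monthly bill keeps climbing, because you pay again every month for everything already stored. A growing user base makes the curve steeper still.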
Couldn't you save a ton of money by explicitly moving old data to 'cold storage,' and keeping it on Glacier? Users would probably understand if data older than a year takes a few minutes to be retrieved.
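For what it's worth, S3 can do that move for you with a lifecycle rule, so you don't have to script it yourself. A minimal sketch of such a rule, assuming a one-year cutoff; the rule ID and bucket name are made up for illustration, and retrieval time and cost depend on the Glacier tier you pick.

```python
def cold_storage_lifecycle(days_before_archive=365, prefix=""):
    """Build an S3 lifecycle configuration that moves old objects to Glacier."""
    return {
        "Rules": [
            {
                "ID": "archive-old-objects",   # hypothetical rule name
                "Filter": {"Prefix": prefix},  # empty prefix matches every object
                "Status": "Enabled",
                "Transitions": [
                    # Days are counted from each object's creation date.
                    {"Days": days_before_archive, "StorageClass": "GLACIER"}
                ],
            }
        ]
    }


config = cold_storage_lifecycle()

# With boto3 installed and credentials configured, this could be applied with:
#   boto3.client("s3").put_bucket_lifecycle_configuration(
#       Bucket="my-video-bucket",  # hypothetical bucket name
#       LifecycleConfiguration=config,
#   )
```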
S3's number is just marketing. It's going to be that durable until a black swan event wipes out 0.1% of all files. I imagine they keep multiple copies of every file, spread across multiple data centers. It's durable against hardware failure of all varieties, but it's still vulnerable to a software bug. It's also possibly vulnerable to certain types of natural disasters.
IIRC Backblaze employs a similar strategy. They have the advantage that your home computer is still storing a copy, so even if they have a minor catastrophe they can recover by simply having their client re-upload those files.
Human disasters too. I'd wager the odds of a catastrophic massive nuclear war are higher than one in ten million per year. Not that I expect Amazon (or much of anyone) to protect against that, nor would the integrity of my S3 data be a priority afterwards, but it makes the claim kind of absurd.
Edit: reading more closely, their durability number is per object. It's one object lost per 100 billion objects per year. The 10 million number comes from a hypothetical situation where you store 10,000 objects.
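The arithmetic behind those two figures, as a quick sketch. It only uses the one-in-100-billion per-object rate quoted above and treats it as an expected loss rate, which is an assumption about how the number is meant to be read.

```python
from fractions import Fraction

# One object lost per 100 billion objects per year (the figure quoted above).
ANNUAL_LOSS_PER_OBJECT = Fraction(1, 10**11)


def expected_losses_per_year(object_count):
    """Expected number of objects lost in one year."""
    return object_count * ANNUAL_LOSS_PER_OBJECT


def years_per_expected_loss(object_count):
    """Average number of years between single-object losses."""
    return 1 / expected_losses_per_year(object_count)


print(years_per_expected_loss(10_000))  # 10000000
```

So storing 10,000 objects works out to one expected loss every 10 million years, which is where the smaller number comes from.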
I don't know how many objects S3 stores altogether, but if we say it's a billion (presumably a vast underestimate) then that would imply that the probability of losing all objects in the system over the coming year is one in 100 quadrillion. I don't think this planet is that safe.
As great as S3 is, it's still confined to Earth. Seems to me that the odds of a planet-wide disaster taking out all Amazon infrastructure (as well as certain less important things like human civilization) are higher than the odds they're giving of a catastrophic S3 failure.
Sure. The AFR quoted for your Western Digital desktop HD isn't made slightly worse to account for the slim chance that a 747 is going to crash into your house.
Barring unforeseen acts of God, the 9's listed above apply, and you just have to personally weigh whether the risk of S3 losing multiple datacenters is high enough to keep you from storing your data there.
Amazon pretty explicitly includes unforeseen catastrophic events in their durability estimate. "In addition, Amazon S3 is designed to sustain the concurrent loss of data in two facilities." I sure hope the loss of two facilities doesn't fall into the "foreseen" category!
Sure, it says they account for that right there in their FAQ, so I guess I don't understand your point.
If you think events like the world being destroyed by a meteorite, the Sun dying, or a zombie apocalypse should factor into their 9's reliability percentage, they shouldn't.
Serious question, here. Things like gigantic hurricanes flooding their data centers should factor into it, right? Risk of war destroying the data center should factor into it, right? (I mean, would you trust S3 to the same degree if all of their data centers were located in Gaza?) So why shouldn't a scenario like "all of our data centers are simultaneously destroyed as part of a worldwide nuclear conflict" factor into it?
I'm sure extreme weather events are factored in based on location (i.e., no hurricanes are going to happen in Indiana), but how are you going to predict a worldwide nuclear conflict?
Should your house insurance be higher because the world might be destroyed tomorrow by aliens? Something like this isn't quantifiable, and if it happens you have way bigger things to worry about than your MP3s in S3, so events this improbable aren't relevant in the grand scheme of things.
My house insurance calls out certain extreme circumstances as being ineligible for coverage. Yes, including nuclear war.
I agree, it's not really quantifiable. However, Amazon lists their durability to a number of significant figures that implies they are able to quantify the risk down to that level. Yet these unquantifiable risks give every appearance of being considerably larger than Amazon's figure.
Does Amazon's figure come with an "excluding loss due to ..." clause? If so, what do they exclude?
> The Service Commitment does not apply to any unavailability, suspension or termination of Amazon S3, or any other Amazon S3 performance issues: (i) that result from a suspension described in Section 6.1 of the AWS Agreement; (ii) caused by factors outside of our reasonable control, including any force majeure event or Internet access or related problems beyond the demarcation point of Amazon S3; (iii) that result from any actions or inactions of you or any third party; (iv) that result from your equipment, software or other technology and/or third party equipment, software or other technology (other than third party equipment within our direct control); or (v) arising from our suspension and termination of your right to use Amazon S3 in accordance with the AWS Agreement (collectively, the “Amazon S3 SLA Exclusions”). If availability is impacted by factors other than those used in our calculation of the Error Rate, then we may issue a Service Credit considering such factors at our discretion.
I think the S3 number is a floor, not a ceiling. That is, you should expect to lose at least that much data, and given an unplanned event, obviously a great deal more can be lost.
Amazon is nowhere near that. Software glitches have already destroyed S3 data. More importantly, anything over 99.999% has to take into account very low-probability events like nuclear war.