Yes, I had backups everywhere. Across providers, in different countries. But I built a system tied to my AWS account number, my instances, my IDs, my workflows.
When that account went down, all those “other” backups were just dead noise encrypted forever.
Bringing them into the story only invites the 'just use your other backups' fallback, and ignores the real fragility of centralized dependencies.
It's like this: the UK still maintains BBC Radio 4's Analogue Emergency Broadcast, a signal so vital that if it's cut, UK nuclear submarines and missile silos automatically trigger retaliation. No questions asked. That's how much stake they place on a reliable signal.
If your primary analogue link fails, the world ends. That's precisely how I felt when AWS pulled my account, because I'd tied my critical system to a single point of failure. If the account had just been made read-only, I would have waited, because I would still have had access to my data and could have rotated my keys.
AWS is the apex cloud provider on the planet. This isn't about redundancy or best practices; it's about how much trust and infrastructure we willingly lend to one system.
Remember: if the BBC Radio 4 signal ever fails for some reason, the world gets nuked and only cockroaches will survive… along with your RDS and EC2 billing fees.
> When that account went down, all those “other” backups were just dead noise encrypted forever.
How and why? Are you saying you had encrypted backups on other providers, but... Didn't keep the encryption keys to those backups somewhere handy and unencrypted? Why not? Can you even call that “backups”?
BTW, this is the first time in all this saga I've seen you mention other providers. Smells fishy.
> AWS is the apex cloud provider on the planet. This isn't about redundancy or best practices.
Sorry, but it is absolutely and undeniably not!
I don't think it's accurate to portray AWS as the underlying infrastructure of the internet on the planet at all.
Speaking of the "planet", here are some sample statistics on web-hosting usage for the .ru ccTLD, covering the 1 million domain names in the zone, one of the top-10 ccTLD zones on the planet:
Amazon is dead last at #30 with 0.46% of the market. The data is for December 2021, so it predates any payment issues, which only originated in March 2022 and would take weeks, months, or years to propagate.
Hetzner is far more popular at 4.11%, OVH is bigger than Amazon too at 1.38%, and even Wix had 2.16%. Even Google managed 1.15%, probably because of their Google Sites service. Most of the rest of the providers are local, which is how it should be if digital sovereignty is of the essence. The real reason for the local providers, however, is most likely affordability, reliability, price, better service and far better support, not digital sovereignty considerations at all. That is exactly why Hetzner is at the top: the market is obviously price-sensitive when unlimited venture capital was never available for the local startups, and Hetzner and OVH provide the best value for money, whereas AWS does not.
The earliest data this service currently has is for March 2020, and there Amazon didn't even make it into the top 30 at all!
They have a separate tab for VPS and dedicated servers, which covers about 0.1 million domain names compared to 1 million in the prior view. There, Hetzner, DigitalOcean and OVH are likewise ahead of Amazon, with Hetzner at #2 with over 10%, compared to Amazon's 2%, leaving Amazon far behind.
Numbers for 2025 don't look as good for any foreign provider, likely due to a combination of factors, including much stronger data sovereignty laws that may preclude most startups from using foreign services like Hetzner. Still, Hetzner is the most popular foreign option in both categories, with more market share in 2025 than AWS had in 2021, and even DigitalOcean remains more popular than AWS.
BTW, I've tried looking up the BBC Radio 4 apocalypse claims, and I'm glad to find out that this information is false.
I'm glad you got your account restored, and I thank you for bringing the much needed attention to these verification issues, but I don't think you're making the correct conclusions here. Not your keys, not your coins. Evidently, this applies to backups, too.
Russia requires a physical presence for services offered in-country, and also Amazon won't provision certificates for .ru domains, so it's not all that surprising they don't host a very high proportion of .ru.
Those are just excuses that ignore the fact that AWS was never popular even before any of these restrictions came about.
Many of the providers on those lists are older than AWS, and/or have been in business for many years since before 2010, many for 20 years or more, long before the data sovereignty concerns went mainstream all around the world in the last 10 years.
"If you want your paperwork processed in Morocco, make sure you know someone at the commune, and ideally have tea with their cousin."
Yes, it works, but it shouldn’t be the system.
What happened with AWS isn’t a clever survival tip, it’s proof that without an account manager, you are just noise in a ticket queue, unless you bring social proof or online visibility.
This should have never come down to 'who you know' or 'how loud you can go online'.
It's sheer luck that I speak English and have an online presence. What if I had been ranting in French, Arabic, or even Darija on Facebook? Tarus would never have noticed.
We seem to be saying exactly the same thing. I agree, strongly, with everything you just said here. AWS has a support problem if this was necessary, and I'm not personally prepared with an online presence if it happened to me. I'll simply be screwed and will have to recreate a new account from backups. It's something for me to think about. I can't fix AWS--I can only change what I do in response.
I recently opened a DigitalOcean account and it was locked for a few days after I had moved workloads in. They took four days to unlock the account, and for my trouble they continued to charge me for my resources during the time the account was locked when I couldn't log in to delete them. I didn't have any recourse at all. They did issue a credit because I asked nicely, but if they said no, that would have been it.
The AWS employee actually contacted me before my blog post even reached three digits in views. So no, it wasn’t PR-driven in the reactive sense.
But here’s what I learned from this experience:
If you are stuck in a room full of deaf people, stop screaming, just open the door and go find someone who can hear you.
The 20 days of pain I went through weren't because AWS couldn't fix it.
It's because I believed that one of the 9 support agents would eventually break script and act like a human, or that they were being monitored by another team.
Turns out, that never happened.
It took someone from outside the ticketing system to actually listen and say: Wait. This makes no sense.
>So no, it wasn’t PR-driven in the reactive sense.
At my small business, we proactively monitor blogs and forums for mentions of our company name so that we can head off problems before they become big. I'm extremely confident that is what happened here.
It was PR-driven in the proactive sense. Which is still PR-driven. (which, by the way, I have no problem with! the problem is the shitty support when it isn't PR-driven)
Regardless, I 100% feel your pain with dealing with support agents that won't break script, and I am legitimately happy that you both got to reach someone that was high enough up the ladder to act human and that they were able to restore your data.
Thank you for your concern, and I appreciate the nuance in your take.
Yes, it is totally possible that AWS monitors blogs and forums for early damage control, like your company does.
But we shouldn’t paint it like I was bailed out by some algorithmic PR radar and nothing else.
Let’s not fall into the “Fuk the police” style of thinking where every action is assumed to be manipulation. Tarus didn’t reach out like a Scientology agent demanding I take the post down or warning me of consequences.
He came with empathy, internal leverage, and actually made things move.
Right before I read Tarus's email, I had written in Slack to Nate Berkopec (the Puma maintainer): `Hi. AWS destroyed me, i'm going to take a big break.`
Then his email reset my cortisol to an acceptable level.
Most importantly, this incident triggered a CoE (Correction of Error) process inside AWS.
That means internal systems and defaults are being reviewed, and that's more than I expected. We're getting a real update that will affect cases like mine in the future.
So yeah, it may have started in the visibility layer, but what matters is that someone human got involved, and actual change is now happening.
>But we shouldn’t paint it like I was bailed out by some algorithmic PR radar and nothing else.
>[...] assumed to be manipulation
I think you're reading way more negativity into "PR" than I'm intending (which is no negativity).
It's very clear Tarus is a caring person who really did empathize with your situation and did their best to rectify the situation. It's not a bad thing that your issue may (most likely) have been brought to his attention because of "PR radar" or whatever.
The bad part, on Amazon and other similar companies, is how they typically respond when a potential PR hit isn't on the line. Which, as I'm sure you know because you experienced it prior to posting your blog, is often a brick wall.
The overwhelming issue is that you often require some sort of threat of damage to their PR to be assisted. That doesn't make the PR itself a bad thing. And that fact implies nothing about the individuals like Tarus who care. Often the lowly tier 1 support empathizes, they just aren't allowed to do anything or say anything.
> It took someone from outside the ticketing system to actually listen and say: Wait. This makes no sense.
Which only happened because of your blog post. In other words, the effort to prevent bad PR led to them fixing your problem immediately, while 20 days of doing things the "right" way yielded absolutely no results.
This actually makes the problem you've described even worse: it indicates that AWS has absolutely no qualms about failing to properly support the majority of its customers.
The proper thing for them to do was not to have a human "outside the system" fix your problem. It was for them to fix the system so that the system could have fixed your problem.
That being said: Azure is so much worse than AWS. Even bad PR won't push them to fix things.
AWS Support absolutely fumbled the incident, but what you, and the majority of the others commenting here, should have learned from the experience is this: running a business-critical workload in a single AWS account is a self-inflicted single point of failure. Using separate accounts for prod/dev/test (and a break-glass account) is one of the top security best practices:
The person paying for the account is not the author (me).
What happened is that the person paying for the account had to settle an invoice of many thousands of dollars.
They offered me AWS gift cards, offered to send me electronics, and proposed to pay it off in parts.
They had lost a lot of money in the crypto collapse, so I accepted their solution for covering my OSS usage for a few months.
It's as if I were paying your rent for one year: you don't pay anything, and I don't have to pay 3-4 years of your rent at once either.
What happened is that AWS dropped a nuclear bomb on your house in the middle of the month... and only told you later that it was about payment.
If they had told me in the first email that it was about the payer, I would have unlinked the account and backed everything up.
Yeah, it's fishy. I never claimed the Java theory is confirmed, just that it’s what an insider told me after the fact.
They said the dry-run flag was passed in the GNU-style form (--dry), but the internal tool expected -dry, and since Java has no native CLI parser, it just ignored the unknown flag and ran for real.
Supposedly the dev team was used to Python-style CLIs, and this got through without proper testing.
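To be clear about what that would mean technically: here's a tiny, purely hypothetical sketch (not AWS code, just an illustration of the failure mode the insider described) of hand-rolled Java argument handling that only recognizes -dry and silently ignores --dry:

```java
public class CleanupTool {
    public static void main(String[] args) {
        boolean dryRun = false;

        // Hand-rolled parsing: only the exact token "-dry" is recognized.
        // Unknown flags like "--dry" are silently skipped because there is
        // no CLI parser to reject them.
        for (String arg : args) {
            if (arg.equals("-dry")) {
                dryRun = true;
            }
        }

        if (dryRun) {
            System.out.println("[dry-run] would delete the selected accounts");
        } else {
            System.out.println("DELETING the selected accounts for real");
            // destructive work would run here
        }
    }
}
```

Invoke it with `--dry` and the loop never matches, so the destructive branch runs. A proper parser (picocli, Apache Commons CLI) would instead fail fast on the unknown option.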
The idea of a rogue AWS team running a deletion script without oversight should sound ridiculous. At a company the size of AWS, you would expect guardrails, audits, approvals.
But here is the thing: no one from AWS has given me an official explanation. Not during the 20-day support hell, not after termination, not even when I asked directly: “Does my data still exist?” Just a slow drip of templated replies, evasions, and contradictions.
An AWS insider did reach out claiming it was an internal test gone wrong, triggered by a misused --dry flag and targeting low-activity accounts.
According to them, the team ran it without proper approval. Maybe it is true. Maybe they were trying to warn me.
Maybe it's a trap to get me to throw around baseless accusations and discredit myself.
I'm not presenting that theory as fact. I don’t know what happened behind the wall.
What I do know is:
- My account was terminated without following AWS’s own 90-day retention policy
- I had a valid payment method on file
- Support stonewalled every direct question for 20 days
My post is not about backup strategy, it's about what happens when the infrastructure itself becomes hostile and support throws you from one team to another. AWS didn't just delete files.
They gaslit me for 20 days while violating their own stated policies.
I don't disagree with your broader point: centralizing everything in one provider is a systemic risk.
The architecture was built assuming infrastructure within AWS might fail. What I didn’t plan for was the provider itself turning hostile, skipping their own retention policy, and treating verification as a deletion trigger.
> The architecture was built assuming infrastructure within AWS might fail.
From what I gather, it was not. Or did you have a strategy for a 0-warning complete AWS service closure? Just imagine AWS closing their doors from one day to the next due to economic losses, or due to judicial inquiries into their illegal practices: were you really prepared for their failure?
The cloud was never data living in tiny rain droplets and swimming across the earth to our clients. The cloud was always somebody else's computer(s) that they control, and we don't. I'm sorry you learnt that lesson the hard way.
It probably wasn't even hostility; it's just that accounts is also an infrastructure component. And when that fails, everything fails. Tying everything to a single account creates a single point of failure.
It's one of the reasons I don't use my Google account for everything (another is that I don't want them to know everything about me), and I strongly dislike Google's and Microsoft's attempts to force their accounts on me for everything.
If you have data that is very important to you and you don't pay very high bills to AWS, you should really keep at least a cold backup somewhere else (even on your own hardware).
If you have a big cloud account and pay big money every month, then at least with AWS you are in a pretty safe spot, even if people here will say otherwise.
And if you have a similar horror story with a monthly AWS invoice in the tens or hundreds of thousands of dollars (or more), please speak up; I'm very curious to learn what happened.
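To make the "cold backup on your own hardware" suggestion concrete, here is a minimal sketch using the AWS SDK for Java v2; the bucket name and target path are hypothetical, and a real setup would also cover RDS snapshots, versioning and integrity checks:

```java
// Minimal cold-backup sketch: copy every object from one S3 bucket to local disk.
// Requires the software.amazon.awssdk:s3 dependency; bucket and path are hypothetical.
import software.amazon.awssdk.services.s3.S3Client;
import software.amazon.awssdk.services.s3.model.GetObjectRequest;
import software.amazon.awssdk.services.s3.model.ListObjectsV2Request;
import software.amazon.awssdk.services.s3.model.S3Object;

import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class ColdBackup {
    public static void main(String[] args) throws Exception {
        String bucket = "my-important-bucket";                 // hypothetical bucket name
        Path target = Paths.get("/mnt/offline-disk", bucket);  // local, provider-independent storage

        try (S3Client s3 = S3Client.create()) {
            ListObjectsV2Request listRequest = ListObjectsV2Request.builder()
                    .bucket(bucket)
                    .build();

            // The paginator transparently walks all pages of the listing.
            for (S3Object object : s3.listObjectsV2Paginator(listRequest).contents()) {
                if (object.key().endsWith("/")) {
                    continue; // skip "folder" placeholder keys
                }
                Path destination = target.resolve(object.key());
                Files.createDirectories(destination.getParent());
                Files.deleteIfExists(destination); // getObject refuses to overwrite existing files

                s3.getObject(GetObjectRequest.builder()
                                .bucket(bucket)
                                .key(object.key())
                                .build(),
                        destination);
            }
        }
    }
}
```

The point is simply that the copy ends up on hardware the provider cannot touch; anything equivalent (rsync from an EC2 volume, exported RDS dumps) serves the same purpose.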
I did have backups. Multi-region. Redundant. I followed AWS’s own best practices to the letter.
The only failure I didn’t plan for? AWS becoming the failure.
The provider nuking everything in violation of their own retention policies. That’s not a backup problem, that is a provider trust problem.
The reason I didn't keep a local copy is that I reformatted my computer after a hardware failure, after a nurse dropped the laptop in the hospital I was in. Since I had an AWS backup, I just started with a fresh OS while waiting to be discharged, planning to return home and redownload everything.
When I returned 6 days later, the backup was gone.
As someone who has lost data myself, I'm really sorry this happened to you. I refrained from commenting on your article originally, but you seem stuck in a mental state of blaming AWS for deleting the "backups" that you established with "best practices".
But you need to be aware that you never had backups in the way most sysadmins mean. If I need a friend to take care of a loved one while I'm away, and my backup plan is having the same person take care of them but in a different house or with a different haircut, that's no backup plan: that's bus factor = 1.
Backups mean having a second (or third, etc) copy of your data stored with a 3rd party. Backup assumes you have an original copy of the entirety of the data to begin with.
From that standpoint, and I'm sorry it bit you like this, you never followed good sysadmin practice for backups and disaster recovery. I have no idea what AWS best practices say, but trusting a single actor (whether a hardware manufacturer or a services provider) with all your data has always been against the 3-2-1 golden rule of backups, and what happened to you was inevitable.
Blame AWS all you want, but Google does exactly the same thing all the time, deleting 15-year-old accounts with all associated data and no recourse. Some of us thought the cloud was safe and had all their "cross-region" backups burn in flames in OVH Strasbourg.
We could never trust cloud companies, and some of us never did. I never trusted AWS with my data, and I'm sorry you made that mistake, but you may also take the opportunity to learn how to handle backups properly in the future and never trust a single egg basket, or whatever metaphor is more appropriate.