> Selectively restoring data only for certain rows is super hard. What's the rig...

inopinatus · on April 13, 2022

in theory, shard your customer databases 1:1, job done. alas, in practice, many SaaS vendors compromise this in two ways:

a) overwhelmed by creeping featuritis, each customer's data has relationships to global tables, and

b) they back up their entire database cluster in one snapshot

and there may be other gotchas for restoration, like relying on denormalized views and caches that have to be rebuilt. they may also have erroneously assumed that data protection's main value driver is whole-of-system disaster recovery, which can lead to pathologies such as "we don't have a single-customer restoration tool".

this is not a niche scenario

bpicolo · on April 13, 2022

Heck, it's worse now - if your data deletion tooling did a good job, there are dozens or hundreds of microservice databases to restore.

seanwilson · on April 13, 2022

> shard your customer databases 1:1

What are the downsides to this?

inopinatus · on April 13, 2022

* makes it much harder to distribute your tables by any other factor, for whatever reason (usually performance, sometimes archival)

* disaggregates data that the SaaS might be interested in querying/updating as an aggregate

* not all ORM frameworks handle this case well, if at all

* dumps are more than a single trivial command

basically all your data operations gain an additional dimension of complexity, and you may not perceive the benefits until much later
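To make the "additional dimension" concrete, here is a minimal sketch of what a simple aggregate looks like once every customer has their own database. It assumes SQLite files, one per customer; the function name and the `tickets` table in the usage note are hypothetical, not anything from a real product.

```python
import sqlite3


def count_rows_across_shards(shard_paths, table):
    """Sum the row count of `table` over every per-customer database.

    Assumption: each path is a SQLite file holding one customer's data,
    and `table` is a trusted identifier (it is interpolated into SQL).
    """
    total = 0
    for path in shard_paths:
        conn = sqlite3.connect(path)
        try:
            # One round trip per shard: what used to be a single
            # SELECT COUNT(*) becomes N queries plus client-side summing.
            (n,) = conn.execute(f'SELECT COUNT(*) FROM "{table}"').fetchone()
            total += n
        finally:
            conn.close()
    return total
```

With a single shared database this would be one query. Here, something like `count_rows_across_shards(paths, "tickets")` has to open every customer's database in turn, and anything fancier (joins, top-N, updates) needs similar fan-out logic.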

deckard1 · on April 13, 2022

> not all ORM frameworks handle this case well, if at all

typically this is probably for internal reporting/metrics. But yeah, a custom script with direct SQL is in order. Personally, my opinion is: avoid ORMs at all costs. I've never seen a benefit that wasn't trivially done in SQL, and the downsides are incredibly painful.

The big downside of sharding out, per customer, is that's a lot of databases to migrate on upgrades. Or rollback if shit hits the fan.

The upside? You can have customers on different versions of your app if you really wanted to do such a thing.

In any case, proper tooling goes a long way: it's the difference between wonderfully manageable and torturous nightmare. Think idempotent backup scripts that are capable of failing at any time and resuming where they died, etc.
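A rough sketch of the "fail at any time and resume" idea, assuming a checkpoint file that records which shards are already done. `run_backup`, `backup_one` and `state_path` are hypothetical names for illustration; `backup_one` stands in for whatever actually dumps one customer's database and is assumed to be safe to re-run on a shard that was only partially backed up.

```python
import json
import os


def run_backup(shards, backup_one, state_path):
    """Back up each shard at most once per campaign; safe to re-run after a crash."""
    done = set()
    if os.path.exists(state_path):
        with open(state_path) as f:
            done = set(json.load(f))

    for shard in shards:
        if shard in done:
            continue  # finished in a previous run, skip it
        backup_one(shard)  # may raise; the checkpoint is untouched if it does
        done.add(shard)
        # Write to a temp file and rename, so a crash mid-write never
        # leaves a truncated checkpoint behind.
        tmp = state_path + ".tmp"
        with open(tmp, "w") as f:
            json.dump(sorted(done), f)
        os.replace(tmp, state_path)
    return done
```

Re-running the same command after a failure picks up at the first shard that is not in the checkpoint. The same shape works for per-customer migrations, with the checkpoint recording which databases have already been upgraded.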

darkwater · on April 13, 2022

All of your points (minus maybe the first one) should be "easily" solved/implemented in a company the size of Atlassian, and maybe there are newer customers sharded like this already. IMO what happened in this case is basically tech debt that is now being paid with a loooot of interest.

seanwilson · on April 13, 2022

Would it be fair to estimate that the majority of SaaS companies aren't sharding like this then? Seems like a lot of downsides that impact everything you do often, with the benefit limited to backups, which you'd restore rarely.

mypalmike · on April 13, 2022

Per-customer is a common sharding strategy for noSQL databases, so it may not be entirely uncommon.

treis · on April 14, 2022

Migrations suck too.

oauea · on April 13, 2022

Work out a relationship graph and automate the export/import
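One way this could look, as a sketch: treat foreign keys as a dependency graph and topologically sort it, so parent rows are imported before the rows that reference them. The table names in the usage note are made up, and this assumes the foreign-key graph has no cycles (`graphlib` raises `CycleError` if it does).

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def import_order(foreign_keys):
    """Return tables in an order where referenced tables come first.

    `foreign_keys` maps each table to the set of tables it references,
    e.g. {"comments": {"tickets", "users"}}.
    """
    # TopologicalSorter takes node -> predecessors, which matches
    # "table -> tables it depends on" directly.
    return list(TopologicalSorter(foreign_keys).static_order())


def delete_order(foreign_keys):
    """Reverse of import order: children go before their parents."""
    return list(reversed(import_order(foreign_keys)))
```

For example, with `{"customers": set(), "users": {"customers"}, "tickets": {"customers"}, "comments": {"tickets", "users"}}`, `import_order` puts `customers` ahead of `users` and `tickets`, and `comments` last. A real single-customer export would then walk that order and select only rows reachable from the customer's ID.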