Hacker News

We use "database-per-tenant" with ~1 million users.

Database-per-tenant is great for read-heavy apps - most tenants are small and their tables don't have a lot of records, so even very complex joins are very fast.

The main problem is that release times can increase substantially, because you have to migrate lots of individual databases one by one. Sometimes schema/data drift occurs between DBs, and then your release stops midway and you have to figure out why the feature works in some tenants and breaks in others... So it's not a free launch, the complexity just shifts to a different place.



This one-by-one migration nicely contains problems, though, if something goes wrong: only one customer is affected at a time (or whatever your batch size was). I’ve done deployments this way pretty much everywhere.
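The batched, fail-fast rollout described above can be sketched roughly like this; `migrate` stands in for whatever actually runs one tenant's migration, and the names are invented for illustration:

```python
# Sketch of batched per-tenant migrations with contained blast radius.
# `migrate(tenant)` is a hypothetical callable that migrates one tenant DB.

def migrate_all(tenants, migrate, batch_size=10):
    """Apply `migrate` tenant by tenant; stop after the first failing batch."""
    done, failed = [], []
    for i in range(0, len(tenants), batch_size):
        batch = tenants[i:i + batch_size]
        for tenant in batch:
            try:
                migrate(tenant)
                done.append(tenant)
            except Exception as exc:
                failed.append((tenant, str(exc)))
        if failed:
            break  # contain the blast radius: later batches stay untouched
    return done, failed
```

At worst, one batch of customers is affected; everything after the failing batch is never touched and can be rolled out again once the drift is understood.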


I'm curious, how do you handle being in a partial state? It seems like it could be operationally tricky to have some users migrated, some pending, and some in an error state.

On the other hand, I could imagine doing lazy migrations as users or tenants sign in as a nice way to smooth out the increased migration effort, if you can engineer for it.


We've got a desktop B2B application which we also run with a per-tenant DB. This keeps the setup identical between our hosted service and on-prem.

When updating, schema changes go first, and once they succeed the new version is allowed to be executed by users. To handle this, users run a small launcher app that starts the actual application.

Our DB changes are always backwards compatible in the sense that the old version should operate correctly on an updated DB.

Incompatible changes like renaming a column we handle by rolling out the new column first, then waiting until the version still using the old column is retired before removing it from the schema, or similar strategies.

This way partial updates are safe, as the old app will just continue to use the partially updated DB as normal.
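The rename strategy above can be written down as an explicit expand/contract plan. This is only a sketch; the SQL strings and names are illustrative, and the key point is that the two phases ship in separate releases:

```python
# Sketch of the expand/contract column rename described above.
# Phase 1 ships with the release that starts writing the new column;
# phase 2 runs only after every app version reading `old` is retired.

def rename_column_phases(table, old, new, coltype):
    return [
        # Phase 1 (expand): safe for old app versions, which keep using `old`.
        f"ALTER TABLE {table} ADD COLUMN {new} {coltype};",
        f"UPDATE {table} SET {new} = {old};",
        # Phase 2 (contract): deferred until `old` has no remaining readers.
        f"ALTER TABLE {table} DROP COLUMN {old};",
    ]
```

Because phase 1 is backwards compatible, a tenant DB stuck between phases is still fully usable by the old app version.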

We're migrating to a web app, and hope to have something similar. The plan is that we'll update the tenant DB, and once that's done we'll change the app routing so that new user sessions for that tenant go to app instances running the updated version.

That said, we've not got 1 million users... yet.


When not per tenant you still have this problem, and typically resolve it by releasing database migrations separately from the code that uses the changes (ensuring backwards compatibility).

I guess per-tenant you would handle similarly but build some tooling to monitor the state of migrations. It might even be easier in some ways, as you might be more able to take table locks when migrating per-tenant.


Good point. Also: nothing makes it impossible to apply migrations to multiple tenants in parallel; this is pretty much only IO. I didn't have to since my tenant numbers were very low, but it's very doable.
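Since per-tenant migrations are mostly IO-bound, a thread pool is enough to parallelize them. A minimal sketch, again with a hypothetical `migrate(tenant)` callable:

```python
# Sketch of running per-tenant migrations in parallel. Migrations are
# mostly IO, so threads (not processes) are sufficient here.
from concurrent.futures import ThreadPoolExecutor

def migrate_parallel(tenants, migrate, workers=8):
    results = {}
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = {pool.submit(migrate, t): t for t in tenants}
        for fut, tenant in futures.items():
            try:
                fut.result()
                results[tenant] = "ok"
            except Exception as exc:
                results[tenant] = f"failed: {exc}"
    return results
```

Each tenant's outcome is recorded independently, so a failure in one DB doesn't block or corrupt the others.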


Do you then run app instances per user? Or how does your upstream layer control for some DBs being on different schemas?

At least with Single Point of Success database design you either move or don’t.


App per tenant would be nice but in practice mostly I’ve seen shared app instances with a careful dance of “schema changes can’t break app” and “app can’t use new schema until it is 100% rolled out” which is super annoying.


How is that super annoying? That’s exactly how migrations are supposed to be done, even without “app per tenant”.


I think it's ok if you have a traditional server approach, but in the cloud this is a really great way to make yourself pay a lot for a little.

Most systems have a minimum amount of traffic they'll let you pay for, most object stores are not set up to stream small updates in a transactionally consistent way, and there's a ton of complexity "in the small", as it were.


Depends on the cloud and its model. Azure/Google have MySQL/Postgres options where you get a server whose OS/database software they manage, but you can run as many databases as the hardware will allow.

Some other cloud databases don't charge you per database but simply for usage, so in that case usage = customer revenue and cost should scale as you do.


Good call out, it definitely reduces the blast radius to do it this way. Poor man’s isolation


How do you do metrics across users? Do you have long-running jobs, or many jobs across tenants, to get derived data into one downstream target?


How large are the DBs?

Do you store the DB too, or rebuild it from your records if the client loses their DB?


You still host the database yourself, clients don’t lose their database.


Why do you use database per tenant?

Why not use Postgres with row level security instead?


Postgres gives the option of schema per tenant which is a nice middle ground.


Postgres does hard tenant isolation via row level security.

Far better than database per tenant.


> Far better than database per tenant

The "better" isn't clear here; why is it better?

Database per tenant - barring configuration errors in a connection string or something along those lines - means you won't ever accidentally leak other customers' data.

With row level security, as good as it is, there is always a chance of a leak happening, due to bugs, user error, or the like.

Database per tenant also makes migrations safer, and allows easier versioning of an application. For example, if you run a SaaS you may put your V2 application into permanent maintenance mode and allow existing customers to use it in perpetuity, while upgrading customers and new customers are put on the V3 platform. This is infinitely easier in practice when the database is per tenant.
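The V2/V3 split above boils down to pinning each tenant to an app version and its own connection string. A small sketch of that routing table; tenant names, versions, and DSNs here are invented:

```python
# Sketch of per-tenant version routing: each tenant record pins an app
# version and its own DSN, so legacy tenants can stay on V2 forever
# while new signups land on the V3 platform. All names are hypothetical.

TENANTS = {
    "acme":   {"app_version": "v2", "dsn": "postgres://db-acme/acme"},
    "globex": {"app_version": "v3", "dsn": "postgres://db-globex/globex"},
}

def route(tenant, default_version="v3"):
    cfg = TENANTS.get(tenant)
    if cfg is None:  # new signups go straight to the latest platform
        return default_version, None
    return cfg["app_version"], cfg["dsn"]
```

With a shared database this split would mean two app versions agreeing on one schema; with a DB per tenant, each version only ever sees a schema it understands.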


>Database per tenant also makes migrations safer

Many of our clients eventually want to host our app on-premises, so moving it to an on-premises environment is quite easy with the database-per-tenant approach. Just copy the database as is.


>> With Row level security, as good as it is, there is always a chance of leaking happening, due to bugs, user error or the like

That’s not correct. It’s hard isolation. In effect, RLS automatically applies a WHERE clause, preventing developers from inadvertently accessing the wrong data.

If you don’t set the Postgres session variable the policy checks, then no data is returned.

All the hard isolation without the schema sync pain.
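For concreteness, a minimal sketch of what such a tenant policy looks like in Postgres; the table, column, and setting names here are invented:

```sql
-- Illustrative RLS setup; `orders`, `tenant_id`, and `app.current_tenant`
-- are made-up names.
ALTER TABLE orders ENABLE ROW LEVEL SECURITY;
ALTER TABLE orders FORCE ROW LEVEL SECURITY;  -- apply even to the table owner

CREATE POLICY tenant_isolation ON orders
  USING (tenant_id = current_setting('app.current_tenant', true)::int);

-- Per connection/session, before any query:
SET app.current_tenant = '42';

-- If the setting is absent, current_setting(..., true) returns NULL,
-- the policy expression is never true, and no rows come back.
```

The caveats raised elsewhere in this thread still apply: the policy has to be enabled and written correctly on every tenant-scoped table.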


> In effect, RLS automatically applies a where clause preventing developers inadvertently accessing the wrong data.

Right, but this assumes that you have a) remembered to enable RLS for every table where it matters and b) configured appropriate rules perfectly for every table. And there are tons of things you can mess up in the rules.

With per-tenant instances none of those things can happen.


True it doesn’t work if you do it wrong.


Database per tenant makes write scaling trivial.


But read up on scaling limits before going down this path. Last time I checked it capped out at thousands of schemas, nowhere near millions.



Have you tried applying migrations lazily?


I worked as an Ops person for a company that was database-per-user, and that's what we did. Every database had a table with its current schema version. On login, that version would be checked; if it was not up to date, a RabbitMQ message was sent and a database schema updater would update their database. The user would get a screen saying "Updating to latest version" which just checked every 15 seconds to see if the schema field had updated. Most of the time, it was done in less than 15 seconds.

For more important customers, you could always preemptively run their database schema updates for them so they didn't see the update screen.
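The login-time flow described above can be sketched as follows; `enqueue_upgrade` stands in for publishing the queue message, and everything else is a made-up name:

```python
# Sketch of lazy, login-triggered schema migration: compare the tenant's
# stored schema version against the latest, enqueue an upgrade if behind,
# and tell the client to show the "updating" screen and poll.

LATEST_SCHEMA = 7

def on_login(tenant_db, enqueue_upgrade):
    current = tenant_db["schema_version"]
    if current == LATEST_SCHEMA:
        return "ready"
    enqueue_upgrade(tenant_db["name"])  # e.g. publish a RabbitMQ message
    return "updating"  # client polls (say, every 15s) until the version catches up
```

The nice property is that migration load spreads itself over time: only tenants who actually log in get migrated, and idle tenants cost nothing.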


Interesting! I use a budgeting app that has curious loading screens when you first log in. Sometimes it flies by, sometimes it takes a bit. It doesn’t appear to be related to data import, as data isn’t in sync when you log in. I wonder if what I’m seeing is a tenant migration technique like you describe.


(not a worthy comment, but)

I really like the free lunch / free launch pun here, intentional or not.



