ZFS is rather special here: making snapshots is cheap because it uses copy-on-write.
Intuitively I would think that the amount of extra writes is pretty low, even if you snapshot very frequently.
But scientific measurements would be nice.
I used to do snapshots every minute, every hour and every day with ZFS on some servers I administered. I’d purge the minute snapshots after 60 minutes, and I had cron jobs on other machines to back up the hourly and daily snapshots. Hourly snapshots were kept for something like 72 hours, and the daily snapshots were kept forever.
The idea with the per-minute snapshots was that they were for undoing manual mistakes during SQL migrations and the like.
It worked well for me.
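To make the rotation concrete, here is a minimal sketch of the kind of script such a per-minute cron job could run. The dataset name, snapshot prefix and 60-minute retention window are placeholder assumptions, not what I actually had in production; the hourly and daily jobs would follow the same pattern with longer windows.

    #!/usr/bin/env python3
    # Sketch: take a per-minute ZFS snapshot and prune ones older than 60
    # minutes. Dataset name, prefix and retention are placeholder assumptions.
    import subprocess
    from datetime import datetime, timedelta

    DATASET = "tank/data"      # hypothetical dataset
    PREFIX = "minute-"         # only prune snapshots this job created
    KEEP = timedelta(minutes=60)

    now = datetime.now()
    subprocess.run(
        ["zfs", "snapshot", f"{DATASET}@{PREFIX}{now:%Y%m%d-%H%M}"], check=True
    )

    # List this dataset's snapshots and destroy the expired per-minute ones.
    names = subprocess.run(
        ["zfs", "list", "-H", "-t", "snapshot", "-o", "name", "-r", DATASET],
        capture_output=True, text=True, check=True,
    ).stdout.splitlines()
    for snap in names:
        tag = snap.split("@", 1)[1]
        if tag.startswith(PREFIX):
            stamp = datetime.strptime(tag[len(PREFIX):], "%Y%m%d-%H%M")
            if now - stamp > KEEP:
                subprocess.run(["zfs", "destroy", snap], check=True)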
I still use ZFS on my FreeBSD servers. But at the moment my projects are low traffic and the data only changes in important ways on rare occasions. So with my current personal servers I manually snapshot about once a week and manually trigger a backup of that from another server.
Another thing I’ve changed is that now I only snapshot the parts of the file system where I store PostgreSQL databases and other application data. I no longer care so much about snapshotting the operating system data and such. If I have a serious hardware malfunction I will do a fresh install of the OS, and my backup scripts copy a log of the important config values and so on when I run them, without copying everything else.
Copy-on-write and cheap snapshots were quite special when ZFS was created. They’re hardly special in 2023, when every single non-vintage iPhone, iPad and Mac has them.
Not if the filesystem itself can do atomic snapshots. The problem can happen if you try to copy the files or even the device when it's being written to. But if you create a snapshot and copy that, it would be consistent.
You can of course end up in a dirty state where some new transaction was started but not committed, but any non-toy database should be able to recover from that. (You’ll lose that transaction, of course.)
It’s equivalent to backing up a server after a hard power off.
As long as the filesystem supports some kind of journaling (i.e. it’s not ancient) and the database is ACID-compliant, there shouldn’t be any major issues beyond a slow startup.
But keep in mind that you may lose ACID compliance by fiddling with the disk-flushing configuration, which is a common trick to raise write speed on write-choked databases. If that’s the case, you may lose some transactions.
Your database documentation should have this information.
Mainly if you restore a previous version of the on-disk data while the DBMS is still running, I would think, because then the picture the running DBMS has of the data no longer matches what’s on disk.
But if you stop the running DBMS before you restore the previous version on disk, and then start the DBMS again, it should be able to continue from there.
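For concreteness, a minimal sketch of that stop / roll back / start sequence for PostgreSQL on FreeBSD; the dataset and snapshot names are placeholder assumptions, and note that zfs rollback discards everything written after the chosen snapshot (the -r flag also destroys any newer snapshots in between).

    #!/usr/bin/env python3
    # Sketch: roll the dataset holding the PostgreSQL data directory back to
    # a snapshot, with the server stopped. Names are placeholder assumptions.
    import subprocess

    DATASET = "tank/pgdata"                       # hypothetical dataset
    SNAPSHOT = f"{DATASET}@hourly-2023-01-01-12"  # hypothetical snapshot

    # 1. Stop the DBMS so it cannot write while the files change under it.
    subprocess.run(["service", "postgresql", "stop"], check=True)

    # 2. Roll back; -r also removes any snapshots newer than this one.
    subprocess.run(["zfs", "rollback", "-r", SNAPSHOT], check=True)

    # 3. Start the DBMS again; it runs normal crash recovery from its WAL.
    subprocess.run(["service", "postgresql", "start"], check=True)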
After all, that’s one of the key selling points of a bona fide DBMS like PostgreSQL: it’s supposedly very good at ensuring that the data on disk is always consistent, so that when your host computer suddenly stops running at any point in time (power outage, kernel crash, etc.) the data on disk is never corrupted.
If data is corrupted by restoring from a random point in time in the past, that should be considered a serious bug in PostgreSQL.
It depends on the DBMS, its configuration, the host OS, its configuration, and many other details. Snapshotting running databases is possible and often "just works," but you should always verify that before relying on this functionality in production.
> you should always verify that before relying on this functionality in production
Yep :)
At said company where I was using ZFS snapshots for the servers I administered, I additionally had a nightly cron job to dump the db using the tools that shipped with the DBMS. Just in case something with the ZFS snapshots fricked itself :)
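Something along these lines, as a sketch; the database name and output path are placeholders. pg_dump’s custom format (-Fc) is compressed and can be restored with pg_restore.

    #!/usr/bin/env python3
    # Sketch of a nightly logical dump alongside the ZFS snapshots.
    # Database name and output path are placeholder assumptions.
    import subprocess
    from datetime import date

    DB = "appdb"                                             # hypothetical database
    OUT = f"/var/backups/{DB}-{date.today():%Y-%m-%d}.dump"  # hypothetical path

    subprocess.run(["pg_dump", "-Fc", "-f", OUT, DB], check=True)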