Yeah, this seems problematic. What if you hard power off your server, how do you...

kouteiheika · on April 5, 2021

> What if you hard power off your server, how do you know your data didn't get corrupted?

Atomically replacing the whole file instead of updating the data in-place, and using a checksumming filesystem like ZFS should give you such guarantee, no?

> Backups and restore will also have to be DIY.

Well, yeah. A daily cron job that `scp`s the data to an offsite location. Personally I don't really see much problem with that? It's easy to back up, and easy to restore.
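For illustration only, here is a rough sketch of what such a cron-driven backup script might look like. The host name, paths, and function names are made up for the example; the only real tool assumed is the standard `scp` client on the `PATH`. Prefixing the remote file name with the date keeps one copy per day instead of overwriting the previous backup.

```python
import datetime
import os
import subprocess


def backup_command(src, remote_host, remote_dir, day):
    """Build the scp argv for copying `src` to a dated file on the remote host."""
    # Hypothetical naming scheme: 2021-04-05-data.json
    name = f"{day.isoformat()}-{os.path.basename(src)}"
    # -p asks scp to preserve modification times and modes.
    return ["scp", "-p", src, f"{remote_host}:{remote_dir}/{name}"]


def run_backup(src, remote_host, remote_dir):
    """Run today's backup; raises CalledProcessError if scp fails."""
    cmd = backup_command(src, remote_host, remote_dir, datetime.date.today())
    subprocess.run(cmd, check=True)
```

A crontab entry would then call a small script that invokes `run_backup(...)` once a day. Restoring is the same `scp` in the other direction.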

slashdev · on April 5, 2021

> Atomically replacing the whole file instead of updating the data in-place, and using a checksumming filesystem like ZFS should give you such guarantee, no?

No. You need to use fsync to ensure the file is flushed to disk. Also, be careful not to rename the file across mount points (/tmp is often a different mount and the most common location for people trying this.)
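As a minimal sketch of the pattern being described (write a temporary file, fsync it, then rename it over the original), assuming a POSIX system: the function name `atomic_write` is made up for this example, and the temporary file is deliberately created in the destination's own directory so the rename never crosses a mount point.

```python
import os
import tempfile


def atomic_write(path, data):
    """Replace the file at `path` with `data` (bytes) via write + fsync + rename."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temp file next to the target, not in /tmp, so both
    # names live on the same filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            # Push the file contents to stable storage before renaming.
            os.fsync(tmp.fileno())
        # Atomically swap the new file into place.
        os.replace(tmp_path, path)
    except BaseException:
        # Clean up the temp file if anything failed before the rename.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise
    # fsync the directory so the rename itself survives a power loss.
    dir_fd = os.open(directory, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

Note that `mkstemp` creates the file with mode 0600, so a real version may want to copy the original file's permissions before the rename.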

> Well, yeah. A daily cron job that `scp`s the data to an offsite location. Personally I don't really see much problem with that? It's easy to back up, and easy to restore.

That's fine, but your backup and restore granularity is daily. If you introduce a bug that writes the wrong / partial data, you can't recover anything that isn't in the previous day's backup.

With a typical database you have a transaction log and you could restore back to any point during the day.

This has been necessary about a dozen times during my career (once every two years or so on average.)
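To make the point-in-time idea concrete, here is a toy sketch, not how any particular database implements it. It assumes a hypothetical append-only JSON-lines log where each entry carries a timestamp and entries are appended in timestamp order; `append_entry` and `restore` are names invented for the example.

```python
import json
import os


def append_entry(log_path, ts, key, value):
    """Append one change to the log and flush it to disk."""
    with open(log_path, "a", encoding="utf-8") as f:
        f.write(json.dumps({"ts": ts, "key": key, "value": value}) + "\n")
        f.flush()
        os.fsync(f.fileno())


def restore(log_path, until_ts):
    """Rebuild the state as of `until_ts` by replaying the log from the start."""
    state = {}
    with open(log_path, encoding="utf-8") as f:
        for line in f:
            entry = json.loads(line)
            # Entries are assumed to be in timestamp order, so we can stop
            # at the first one past the requested point in time.
            if entry["ts"] > until_ts:
                break
            state[entry["key"]] = entry["value"]
    return state
```

If a bad deploy starts writing garbage at timestamp 1000, `restore(log_path, 999)` gives you the state just before it, rather than whatever the last nightly copy happened to contain.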

kouteiheika · on April 5, 2021

> No. You need to use fsync to ensure the file is flushed to disk.

Are you sure? At least with ZFS, AFAIK, the file wouldn't get corrupted, since it uses copy-on-write internally. Sure, you might get the old data back, but you wouldn't get corruption. It also checksums your files automatically, so it will detect any data that did get corrupted, and if you're using RAID-1 (as you should for any important data) it will repair it automatically.

> Also, be careful not to rename the file across mount points (/tmp is often a different mount and the most common location for people trying this.)

You can't rename across mount points. (Linux's `rename` syscall and the corresponding C API will fail with an `EXDEV` error if you try.)
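For what it's worth, this is roughly how that failure surfaces from Python, where `os.rename` raises an `OSError` whose `errno` is `EXDEV`. The helper name `try_rename` is made up for this sketch; it just turns the cross-mount case into a `False` return so the caller can decide what to do.

```python
import errno
import os


def try_rename(src, dst):
    """Rename src to dst; return False if they are on different mounts."""
    try:
        os.rename(src, dst)
        return True
    except OSError as exc:
        if exc.errno == errno.EXDEV:
            # Cross-device rename: the kernel refuses, nothing was moved.
            return False
        # Any other error (permissions, missing file, ...) is re-raised.
        raise
```

So the rename can't silently become non-atomic at the syscall level. The catch is that higher-level tools (`mv`, or `shutil.move` in Python) fall back to copy-and-delete on `EXDEV`, and that fallback is not atomic.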

> If you introduce a bug that writes the wrong / partial data, you can't recover anything that isn't in the previous day's backup.

Sure. If I were working at a bank, not being able to do that would be totally unacceptable. But in my case that's a tradeoff I'm willing to live with.