An almost perfect rsync over SSH backup script (zazu.berlin)
199 points by bpasero on Feb 25, 2022 | 126 comments



Beware that this script only uses rsync with the "--archive" flag.

This may be enough for some users, but "--archive" does not copy all file metadata, so it may cause surprises.

rsync must be invoked with "--archive --xattrs --acls" to guarantee complete file copies.
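
For example (the source and destination paths here are just placeholders):

  rsync --archive --xattrs --acls /source/dir/ user@backuphost:/backup/dir/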

Unfortunately all command-line utilities for file copying that are available for UNIX-like operating systems use default options that do not copy most file metadata and they usually need many non-default flags to make complete file copies.

Nevertheless, rsync is the best of all alternatives, because when invoked correctly it accurately copies all file metadata even between different file systems and different operating systems, in cases when other copying utilities may lose some file metadata without giving any errors or warnings for the users.


Better than rsync is using support for snapshots and replication in the filesystem, such as described for ZFS in

https://serverfault.com/questions/842531/how-to-perform-incr...

Btrfs and apfs also support auto-snapshotting and replication, but I don't have experience with them.
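
The gist of the ZFS approach in that link, with placeholder pool/dataset names, is roughly:

  # initial full replication
  zfs snapshot tank/data@monday
  zfs send tank/data@monday | ssh backuphost zfs receive backup/data

  # later: send only the blocks that changed since the last snapshot
  zfs snapshot tank/data@tuesday
  zfs send -i tank/data@monday tank/data@tuesday | ssh backuphost zfs receive backup/data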


ZFS snapshots also have the added benefit of not taking up duplicate space. And, should Linux ransomware ever hit you, snapshots help against it.


Btrfs and apfs also support copy-on-write snapshots.

Also, snapshots are NOT backups. And one definitely should not rely on snapshots alone to keep their files safe.


They aren’t; a backup should preferably be offsite, but nothing is perfect


Rsync also doesn't track hardlinks by default! I use "-avPzHAXS" for my backups.


Also have a look at --numeric-ids, otherwise it translates UIDs back and forth according to real user names - which will end up in a terrible mess, especially when you are restoring the backup from a live "CD" (which has different UID ←→ username mapping than the target system).


I use getfacl to save permissions correctly, in case the backup server uses different user/group mappings. And setfacl after restoring.

(I wish this was handled by rsync.)
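
Something like this, with placeholder paths:

  # before the backup: dump owners/permissions/ACLs (numeric IDs) to a file
  cd /data && getfacl -R --numeric . > /data/.permissions.facl

  # after restoring the files: re-apply them
  cd /restored/data && setfacl --restore=.permissions.facl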


This also presumes that the target's rsync implementation and filesystem support the same metadata.


Most modern file systems have been updated to support equivalent metadata.

However many old versions of file systems lacked support for things like high-resolution timestamps, extended attributes or access-control lists. Because such support has been introduced much later, there are a lot of programs for archiving, backup and copying which lose such metadata, at least with their default options.

While equivalent metadata exists on Linux XFS and EXT4, Windows NTFS, FreeBSD UFS and ZFS etc. each platform has different APIs for accessing file metadata (and on Linux the API may vary even between the many available file systems), so only few programs, like rsync, are able to copy everything while taking care to use the correct API to replicate the metadata without losing information.


Hey Adrian, thanks for the improvements and hints!


For years I had a custom script sync my ~/.ssh directory on my primary workstation to my laptop, to pick up new keys and config changes. It failed after I switched from Ubuntu to Fedora, and I was surprised to discover --xattrs fixed it.

tl;dr try this if rsync fails in unexpected ways:

  echo "Syncing ~/.ssh directory"
  rsync --archive --delete --xattrs ~/.ssh/ laptop:.ssh/


Probably because 99% of users do not have or need xattrs/acls.


I've been doing rsync-based backups of close to a thousand systems for ~20 years, most notably for a long time I backed up the python.org infrastructure, and I have quite a few thoughts on this. I also have a battle-tested rsync wrapper that I'll point to below.

- Backups should be automatic, only requiring attention when it is needed. This script's philosophy seems to be "Just do your best, mail a log file, and rely on the user to figure out if something didn't work". Even for home backups, this is just wrong.

- As an example of the above: This script notes that it fails if a backup takes more than 24 hours.

- The "look for other rsyncs running" part of the code is an odd way of approaching locking, but for a single personal "push" backup I guess it is ok.

I've got an rsync wrapper that has been battle tested over a couple decades and hundreds of servers here: https://github.com/linsomniac/tummy-backup/blob/master/sbin/...

Features of it are:

- As the filename implies, the goal is to rsync to a zfs destination, and it will take a zfs snapshot as part of this. It is easy to customize to another backup destination; I've had people report they have customized it for their own laptop backups, for example to an rsync.net destination.

- It goes out of its way to detect when rsync has failed and log that.

- It does do "inplace" rsyncs, which dramatically save space if you have large files that get appended to (logfiles, ZODB databases).

- This is part of a larger system that manages the rsyncs of multiple systems, both local and remote. Things like alerting are done if a backup of a system has failed consistently for many days.

- In the case that there are no failures, there is no e-mail sent, meaning the user only gets actionable e-mails.

The hardlink trick only works for fairly small data sets. Issues include: managing hard links takes a lot of overhead, especially on spinning discs, and large files being appended to use a ton of space (a 4GB file with 1K appended every day uses 128GB to store 14 dailies, 6 weeklies, and 12 monthlies). ZFS is a pretty good destination for rsync, as snapshots of that same file would use only a little over 4GB.


This SO post goes over the semantics of `--inplace` and `--no-whole-file`

  --inplace                update destination files in-place
  --whole-file, -W         copy files whole (w/o delta-xfer algorithm)
It is very important to debug rsync scripts with a test corpus and not on live data, preferably on a test VM or in a container.

https://superuser.com/questions/576035/does-rsync-inplace-wr...

Also, when transferring from different systems, make sure that both rsyncs are of a high enough version, again preferably the same.
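
A quick way to sanity-check both ends (the hostname is a placeholder):

  rsync --version | head -n1
  ssh remotehost rsync --version | head -n1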

macOS ships a really old version of rsync at `/usr/bin/rsync` that doesn't support extended attributes:

   rsync  version 2.6.9  protocol version 29
This is kinda a large footgun.


HN's new pet word: footgun



> - In the case that there are no failures, there is no e-mail sent, meaning the user only gets actionable e-mails.

I've always thought this isn't the right approach.

How do you know if the email server is borked or you commented out the script in cron to debug it and forgot to put it back in?

Either have a weekly status report confirming things have been green, or use a cron-monitoring service like healthchecks.io (you can self-host it).
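
The usual healthchecks.io pattern is just a ping at the end of a successful run, something like (the UUID in the URL is a placeholder for your own check):

  # at the end of the backup script, once you know the run succeeded
  curl -fsS -m 10 --retry 5 https://hc-ping.com/your-uuid-here > /dev/null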

Also, with large volumes it's much better to use 'zfs send' for backup if both ends have ZFS: ZFS already knows what has changed, so it doesn't have to scan for changes on every run the way other tools like rsync do.

https://arstechnica.com/information-technology/2015/12/rsync...


I get your point, and different people have different things they are comfortable with. Some of that also depends on your environment, in my case I've had this script running nightly backups across hundreds of machines over a dozen years. Sending hundreds of e-mails a day into my inbox isn't going to be workable. But if you have one or two machines, maybe it is. I still don't think so, but again different people.

Things I have done to ensure reliability (again, this core script has been running for a dozen or more years):

- Nagios monitoring of backups: An active check from a monitoring server that alerts if no recent successful backups.

- "paper path" monitoring of e-mail: Send an e-mail to an external mailbox and have an active check in Nagios that reports if it has not seen an e-mail recently.

- With hundreds of machines, we were in the management interface enough (not daily, but at least monthly) that we would tend to notice before TOO long if something was out of whack.

- Regular backup audits: We would perform quarterly backup audits of the important machines, we had a whole workflow for those, which would also give us confidence that the backups were running as expected and that if something got out of whack it didn't go too long. Many of these depend on your definition of "too long".

As far as "zfs send", I totally agree. However, even today I have very few machines other than my backup machines that are running ZFS, so that's not really an option for these backups.


The first two lines of the script are already wrong:

    #!/bin/bash
Should be:

    #!/usr/bin/env bash
    set -euo pipefail
That’s table stakes for any bash script, with the first piece, exit on error, being critically important.


I wish this misguided notion regarding the "unofficial strict mode" would go away, as `set -e` is suboptimal (and can be very tedious to deal with on top of that): https://mywiki.wooledge.org/BashFAQ/105

It makes sense to opt into errexit for select blocks/sections in scripts and under certain circumstances, but having it default-on is a recipe for quite a bit of head-scratching in the future.
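
A classic example of the kind of surprise that FAQ lists:

  #!/bin/bash
  set -e
  i=0
  ((i++))               # evaluates to 0, which bash treats as "failure"...
  echo "never reached"  # ...so errexit silently killed the script one line up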


Using -e isn't an excuse to not understand how it works and when it doesn't. It's so that the default in most situations is to exit on failure, as that's likely what you want to do. That leads to more terse scripts which hopefully are easier to write correctly and understand. I'd argue that opting in to exit-on-failure for standalone commands is the right default.


No, absolutely wrong. The "right default" is to catch return codes and provide actionable, clear and consistent error messages through an error-exit function.


If I'm writing a bash script I run manually and it's more than a handful of functions I agree. I'm there to babysit it and it's probably complicated.

If the bash script is run on thousands of containers where it's not possible to babysit, my number one job is to stop immediately when an error happens and surface that error to any monitoring system.


I agree about that distinction, but still, won’t that error need to be formatted? Is it safe to rely on logging picking up on the error or does the "simple script" solution imply that there is monitoring for the script exit code?


If I may add a few critiques...

  # Make sure no one else is using rsync
  pro_on=$(ps aux | grep -c rsync)
A better way to do that is with the flock utility.

  (
    flock -n 9 || exit 1 # Critical section-allow only one process.

    ...single thread shell script

  ) 9> ~/.empty_lock_file
Note that the flock utility is specific to Linux, but POSIX mkdir() is atomic and could be more portable.
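
A rough sketch of the mkdir variant (the lock path is a placeholder):

  if mkdir /tmp/backup.lock 2>/dev/null; then
    trap 'rmdir /tmp/backup.lock' EXIT
    # ...single-instance work here...
  else
    echo "another backup is already running" >&2
    exit 1
  fi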

  "${SOURCES[@]}"
POSIX shells do not support arrays. Iterating with read over a here document is more portable.
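
For example, instead of the SOURCES array, something along these lines (the paths are made up):

  while IFS= read -r src; do
    printf 'backing up: %s\n' "$src"   # the rsync call for "$src" would go here
  done <<EOF
  /etc
  /home
  EOF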

  minutes=$(($minutes - 1))
POSIX is specific that the $ prefix on a variable name can be omitted in an arithmetic expression.

  ECHO="/bin/echo"
Many shell scripts never use echo, and this is a good idea. 'NEVER use echo like this. According to POSIX, echo has unspecified behavior if any of its arguments contain "\" or if its first argument is "-n".' http://www.etalabs.net/sh_tricks.html

Perhaps use this instead, in a subshell to avoid stomping on variables:

  myecho () ( z=''; for x; do printf "$z%s" "$x"; z=' '; done; )


#!/usr/bin/env bash and set -u are always good ideas.

There are cases where you don't want -e enabled, such as when you want to make sure your script makes the best attempt to continue operating even through unknown failures.

Using pipefail makes it more likely your script will fail unexpectedly and without a known cause. You have to check PIPESTATUS to see which command in a string of pipes failed and then report on it. This is often pointless, because usually just checking the output of the last pipe will tell you whether you got what you wanted.

When your script does fail unexpectedly, you'll want to re-run it with at least tracing enabled, so the third line should be something like

  [ "${DEBUG:-0}" = "1" ] && set -x


> Using pipefail makes it more likely your script will fail unexpectedly and without a known cause. You have to check PIPESTATUS to see which command in a string of pipes failed and then report on it. This is often pointless, because usually just checking the output of the last pipe will tell you whether you got what you wanted.

Really? In this script's "ps aux | grep -c rsync" for example, if "ps" fails, you'll just get 0 without the grep failing.

(Speaking of that line: chasil's completely right that it's much better to use "flock" than "ps" for locking...)


> There are cases where you don't want -e enabled, such as when you want to make sure your script makes the best attempt to continue operating even through unknown failures.

You don't let go of -e for that.

  dont_mind_failure || true
  important_process
append '|| true' specifically if you must.


There are scripts where you don't want any failure to kill the script. You would have to wrap nearly every line in that kind of exception handling, and if you miss one, the script fails. It is much simpler to omit -e in that case.


I've been writing my scripts for decades on numerous Linux and UNIX systems, real and virtual machines, and I've never used the env shebang nor "set -euo pipefail", and have had zero issues.

Saying "The first two lines of the script are already wrong;" is wrong. Is that better? IDK, maybe. But "#!/bin/bash" works fine.


> "#!/bin/bash" works fine

It doesn't work at all on any BSD OS, which does not store bash in /bin - instead it is in /usr/local/bin

Specifying "env bash" makes it work on any UNIX, since the location of env is a constant, unlike bash.


does it matter? MacOS is using some ancient ass 3.x bash from 2007. The location of bash is the least of your worries when it comes to portability. There's no guarantee any of the commands in your script (a) exist on the system or (b) work with the same options and flags that the script uses. I don't even know how many "netcat" commands I've seen in the wild. You could be running in some BusyBox or pared down Docker container. Blindly running a script that hasn't been deliberately crafted to work on your system is just asking for trouble.


You call 2007 ancient when bash has been around for over 30 years?

What exactly has changed in the last 15 years?


Agreed on fail on error, but the first one is needlessly pedantic.

Find me a single Linux distro where bash, if installed, is not available in /bin


As a system eng for almost 15 years I've spent a good number of years handling deployments, dependencies, environment evaluations and all the little minutiae that's required to run large complex distributed systems at scale. This is fine to ignore if it's your personal box and no one else is working on it.

However I've seen it happen quite frequently in systems that were designed with container-like 'chroot-lite' prod setups, where the system bash and the deployed environment may contain different bash versions.

The different bash is the one that was tested in the test env with the automation. There may even be multiple different versions of an interpreter on the system with multiple app environments running.

This was a pretty common way to package apps before easy access to containers and container managers.

This is why we have industry best practices, so that people who don't understand why something exists can just follow the best practices and we don't all have to be experts in things outside of our direct field.


> This is why we have industry best practices, so that people who don't understand why something exists can just follow the best practices and we don't all have to be experts in things outside of our direct field.

We don't have industry best practices for shell scripts, no matter what consultants / HN commenters with strong opinions say. You can see in this thread that folks disagree about what the best practices are. It's worth paying attention to folks' rationale for their opinions—I learned some caveats about "-e" from following the links here. But the talk about "table stakes" and "industry best practices" is (extended bleep). Those don't exist.

Different projects/companies may have their own best practice guides that make sense in their environment. senko mentioned Google data centers as a place where it might make sense to be more rigorous. Google's guide says to use "#!/bin/bash". [1] If that doesn't work in your environment, fine, but that doesn't make them wrong.

Shell is a surprisingly and unnecessarily difficult language to write correctly. To the extent there is a best practice on it, I think it's "use a better language for anything that might become large or important". The Google style guide I linked says more or less the same thing near the beginning. The subtleties discussed in the rest give you a taste of why...

[1] https://google.github.io/styleguide/shellguide.html


I was a sysadmin on SunOS, HP/UX and BSD systems back in the day and have been maintaining various Linux-based systems in the past few decades as well.

We're not talking about Google data center here, it's just someone's shell script.


On my laptop (admittedly, it's running Mac OS, not a linux distro), /bin/bash is version 3.2.57 released in 2007, while '/usr/bin/env bash' invokes bash 5.1.8, released in 2020.


I'd argue it's easier to find a distro without /usr/bin/env.


NixOS


1. bash runs on more than just linux

2. Many non-LSB distros


The article talks about Linux.


There are more than Linux in the real world.


Not using “env” isn’t “wrong”. It depends how many platforms they want to supply

But yes, pipefail and nounset should definitely be set.


Support*


rsnapshot[1] is what I used on FreeBSD with a snapshot. It is like a well-tested version of the author's rsync and ssh blog post. I have a blog post here[2] describing my setup. It saved my bacon multiple times. Also, rsnapshot works in pull mode, so no client is needed on my Linux/macOS desktop or server except ssh and rsync.

[1] https://github.com/rsnapshot/rsnapshot [2] https://www.cyberciti.biz/faq/howto-install-rsnapshot-filesy...


Tangential question: I've been using rsnapshot to backup a remote machine to my local NAS for a while now. My impression is that even for an incremental backup, everything that I want to have backed up is transferred over the wire, and it is determined locally what actually needs to be written to disk, and what can be hardlinked from a previous backup. Is there a way to configure rsnapshot so that it only transfers the data that's actually changed?


That is the default. rsnapshot uses rsync between local and remote, and rsync uses delta encoding algo so it is speedy. You may need to tune your ssh, networking stack and rsync.


Yep. We use rsnapshot with btrfs snapshots on Debian / Ubuntu. You can set up multiple generations easily. Efficient storage using hard links.

Very reliable.


AFAIK the crucial part of this, is rsync's `--link-dest=DIR` parameter, which hard-links an unmodified file to the respective place in "DIR", rather than making a copy.
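
Roughly (dates and paths here are made up), today's run hard-links anything unchanged against yesterday's copy instead of storing it again:

  rsync -a --delete --link-dest=/backups/2022-02-24/ \
      user@host:/data/ /backups/2022-02-25/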


The docs for my version of rsnapshot say not to use '--link-dest' if GNU cp is available, because the script will use 'cp -l' instead when copying "daily-1" to "daily-0" (for example). This is before it will overwrite "daily-0" with new files via rsync.


Could you elaborate on where the btrfs snapshots come into play? I use rsnapshot on a plain ext4 fs and it’s pretty slow (lots of small files).


The snapshots don't help with the speed.

Instead, they make sure the files aren't changing / being deleted between when the rsync starts and finishes.


Too complicated to even try to read. Rsync is great but I've switched to Borg for backups. Borg isn't perfect but it is a higher level approach to backups, as it were. Hetzner recently dropped the price of their Storage Box backup product to about 2 euro per TB per month, and Borg works nicely with it. Borg encrypts all the backup contents and conceals the metadata on the backup server, and yet you can (with the encryption passphrase) mount the backup archive as a read-only file system through FUSE and access it through normal file navigation. It is impressive.


Make sure your borg repos are copying properly and in full!

I had a horrible realisation that my borg backups were timing out on the offsite copy, meaning the resulting offsite backup I had was non-existent. The heartbreaking error message: Inconsistency detected. Please run "borg check [repository]" - although likely this is "beyond repair".

Of course, following the 3-2-1 rule you're probably good, but aye, I'm treating borg repos as delicately as I would a striped RAID now. I am also monitoring that `borg check` actually comes back successfully before considering the backup complete too :)

Also, if you're in a small company, startup, etc., set yourself up with a 3-2-1+ backup system. The + is the miracle backup that makes you look good.


> I had a horrible realisation that my borg backups were timing out ... meaning the ... backup I had was ...

A backup that isn't tested is not a true backup, it is a disappointment waiting to be found! This can happen with any backup tool, my hand-crafted¹ rsync based scripts included.

Testing isn't hard to set up if you don't mind the final step being manual. Snapshots have a checksum file, and daily one is picked and rechecked; any difference is an indication of bit-rot on the storage medium or something accidentally getting deleted/modified otherwise. After the newest copy is created by the main backup script, a list of files not touched since the backup started is made and pushed up; the backup site checks those files and sends the result back so it can be compared. Any difference in these checksums results in an email to an account that makes my phone shout in a distressed manner. The manual part here is occasionally checking the results manually, because not getting an email could either mean all is well or that something has broken to the point that the checks aren't running at all².

For specific systems like my mail server I have a replica VM, not visible to the Internet at large, that wipes itself and restores from the latest off-site backup. I look at that occasionally to see that it is running and has the messages I've recently sent and received. As a bonus this VM could quickly be made to be publicly available and take over with a few firewall DNS changes, should the main mail server or its host physically die, and even if it doesn't take over its existence proves that the restore method is reliable should I need to restore the one in the primary location. Some extra automated checks could be added to this too, but again there comes a point where writing the checks takes more time than just doing that manually³ and I'd still do it manually out of paranoia anyway.

[1] If I'm honest, “string together” would be much more accurate than “crafted”

[2] I could automate that a bit too, but then that automation still needs to be verified occasionally, it quickly gets to the point where it is double-checks all the way down and making sure you check manually occasionally is far far more maintainable a system.

[3] If these were a business thing rather than personal services, then the automated procedure vs manual checks desirability balance might change somewhat.


> A backup that isn't tested is not a true backup, it is a disappointment waiting to be found

Yup. 500 gigs of disk seatfiller (a full backup in this instance would have been somewhere between 1-2T)

Live and learn.. and learn again later when you let yourself slip :)

My backup process is a LOT noisier if any of the commands fail/timeout/dirs-arent-same-size-after now!


> Live and learn.. and learn again later when you let yourself slip :)

Definitely. I'm as careful as I am due to past issues, either my own or those of others that I've witnessed. Seeing the look on someone's face when they ask “You know about these things, you can do something to get it back right? Right?!”, and having to let them down…


This is why you gotta test your backups too


Indeed. You only get burned once

.. and then apparently once again a decade later when you let your guard down


Can Borg do this? Like, something you can set up to run once a month or so?


Borg has a command ("borg check") to test backup metadata. I'm not sure exactly what it does, but it is pretty slow, about 80 minutes to check a 1.6TB backup depending on how busy the backup server is. Another approach might be to mount the backup as a FUSE filesystem and monitor it with something like tripwire. I just thought of that a minute ago and haven't looked into it, so idk if it is workable. At the end of the day you have to occasionally do a full restore to another system, and test everything.


Fully agree, Borg is awesome. I wrote up how I'm using it here[1]. In short, borg backup to a local machine, and that machine uses rclone to copy the backups to an S3 bucket off-site. I've had multiple occasions to restore stuff successfully, and I never have to think about whether my data is retrievable.

[1]https://opensource.com/article/17/10/backing-your-machines-b...


You need to run “Borg check” once in a while to check and fix backup integrity errors. S3 egress fees will kill you!


There are several targets for borg that don't charge for traffic. Get off of AWS...


I used to do something similarly complex to the OP with ssh and --link-dest to make it share inodes so I could easily keep N days of backups with file level deduplication [1].

Then I moved to Borg and haven't looked back. It does the same end goal in a better way, is way faster, and easier to work with overall [2].

[1]: https://nuxx.net/blog/2009/12/06/time-machine-for-freebsd/ [2]: https://nuxx.net/blog/2019/11/10/using-borg-for-backing-up-n...


The cheapest I can find on https://www.hetzner.com/storage/storage-box is 3.49 Euro. I assume you're talking about the bigger plans with price / TB?


Yes, the bigger ones are around 2 euro per TB. They also have Storage Share at around 3 euro/TB in the bigger sizes. Those have more features (they run nextcloud) and are themselves backed up several times a day.


I've been using Time4vps. It's a little bit more expensive, but it's also a full linux box you can do anything with...

I dump my backups and also run a calibre server.


I used rsync 20 years ago, switched to Borg, then switched to Restic. It’s nice to send backups to destinations like B2 or S3 without any special requirements. But the Hetzner Storage Box is a good deal.

Now all I need is better client side software for these things, something I could use on my mom's computer. Vorta is not that good, although I commend the effort.

Until the UI side is sorted, my mom's laptop stays on Backblaze.


As a small film production we have approx. 120 TB to back up. That's a lot of money per month; if you check AWS it is a fortune (half the value of a car) per month.


>Storage Box backup product

The script here really only makes sense for servers you physically own. You wouldn't accept SSH access from a key located on a plain VPS, right? Also this script doesn't seem to encrypt the data at all!! Very dangerous on a VPS.


> You wouldn't accept SSH access from a key located on a plain VPS, right?

With Borg, I use ssh -A from my laptop to start backups going. I can think of some other schemes like using multiple user accounts on a single Storage Box (Hetzner supports that and I believe offers an API) so that different VPS can't clobber each other's backups, that old backups become read-only, etc. It might be interesting to add some finer access controls on the server side. Borg supports an append-only mode but right now, that's only a config option rather than a security setting, I believe.

I've only recently started using Borg so I'm not really familiar with its intricacies yet. There are some things I would change but it is mostly well thought out, imho.
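
If I remember the docs right, the server-side way to enforce append-only is a forced command in authorized_keys, something along these lines (key and paths are placeholders):

  command="borg serve --append-only --restrict-to-path /backups/client1",restrict ssh-ed25519 AAAA... client1-backup-key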


"Hetzner recently dropped the price of their Storage Box backup product to about 2 euro per TB per month, and Borg works nicely with it."

I'm still not clear - does the Hetzner storage box have the borg binary installed on their end ? As in, one could run:

  ssh user@hetzner borg --version
... or are you accessing borg over an sshfs mount, etc. ?

Asking for a friend ...


You can't run arbitrary shell commands on the storage box server. They have the Borg binary (1.17 last time I looked) installed on the server, and they officially support it. They seem to specially recognize the borg commands.


The first. Hetzner officially supports Borg on their Storage Box.


Is it possible to use it as a read/write mounted drive on macOS?


I think restic[1] does provide this.

[1] https://restic.net/


Thank you!


I use Restic (https://restic.net/) for backups. Encrypted backup, incremental, deduplicating, and supports many storage backends.

Most recent discussion on Hacker News: https://news.ycombinator.com/item?id=29209455


Sorry to say, none of these modern backup tools works reliably when backup volumes get large, except borg.

https://news.ycombinator.com/item?id=29210222

Using ZFS snapshots is the best way to go, but a cloud destination isn't easy with these tools, unfortunately; there are, however, services that accept borg and zfs send.


I like restic, but rsync scripts like this always win out for me because I find myself wanting to stage the backups on the destination systems, and not the source/prod ones.


I really, really wanted to love restic, but it makes far, far too many tiny files for my main remote backup use-case: backing up dozens of TB offsite to Glacier Deep Archive.

Eagerly awaiting a configurable chunk size, if they decide to do that.


Perhaps Kopia will work for your use case:

https://kopia.io/

https://kopia.io/docs/advanced/amazon-s3/


Kopia looks great! I am currently using Duplicati for my personal machines, but Mono and its dependencies have been less than reliable. borg/restic/... look good but seem like only part of a solution; having to write my own scripts and crontabs with monitoring etc. on top seems counter-productive.

edit: No web interface?


Perfect? Ha.

Wrong ps grep usage. MacOS specific. No shell quoting.

At least he doesn't advocate rm -rf / by an incorrect usage of --delete.


> Perfect? Ha.

The cynic in me believes this was intentional to drive engagement and discussion.

It's that old adage of "if you want to learn something just say something wrong about it on the Internet" at work.


There's a million blog posts out there with "the perfect <whatever> setup", which is this circle-jerking way of saying that it's the author's preferred configuration.

I don't entirely know what the motivation is, but as a grumpy old GenXer, I wish it would stop.


A shell on a Mac is the "we have x at home" meme to me. Very frustrating.


You missed the word "almost" just before the word "perfect": meaning there is room for improvements, and any code has, if you dig deeper, a never-ending context. So, if you see something "not so or not at all perfect", just name it so it can be improved. THX


It's nice that the script works both ways, but keep in mind that it's always best to pull backups from a remote system, so that an attacker in the production system can't log into the backup system and delete the backups.

If your prod system(s) is/are logging into the backup system, then you have a big problem because if any of them are compromised, they can wipe out / corrupt / etc all of their own backups, and possibly the other server's backups as well (if the same user account is used for all of them.) This problem goes away if your (isolated, hardened) backup server logs into the prod systems. Of course, the inverse problem is that if an attacker manages to get into your backup system, then they also have (at least read) access to all of your prod systems!
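
A minimal pull-mode sketch, run from cron on the (isolated) backup host -- hostnames and paths are placeholders:

  rsync -aHAX --numeric-ids --delete \
      backup-reader@prod.example.com:/srv/ /backups/prod/srv/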


What seems like a long time ago, I was working at a place that used Linux, but not really any Linux experts. They had backup scripts that assumed (and did not check) that you would connect an external drive. If you didn't, the script would gladly copy files to /mnt/usbdrive, and if the server was full enough, backing up itself to itself would fill the drive and bad things would then happen.

There was no email notification if the backup completed or failed, or even started to run. There was no notification of how long it took, how much data was backed up, etc.

There were plenty of incidents of data loss.


For rsync tasks that require scheduling and need to ensure only a single instance executes, I personally prefer creating a systemd service (Type=oneshot) and having that run under a systemd timer (set OnBootSec, OnUnitActiveSec).

https://wiki.archlinux.org/title/Systemd/Timers
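
Roughly (unit names, paths and intervals are placeholders):

  # /etc/systemd/system/rsync-backup.service
  [Unit]
  Description=rsync backup

  [Service]
  Type=oneshot
  ExecStart=/usr/local/bin/backup.sh

  # /etc/systemd/system/rsync-backup.timer
  [Unit]
  Description=run the rsync backup periodically

  [Timer]
  OnBootSec=15min
  OnUnitActiveSec=1d

  [Install]
  WantedBy=timers.target

  # then: systemctl enable --now rsync-backup.timer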

EDIT: I'd love to hear if anyone knows of any downsides, edge cases for running rsync under systemd timers.


> # avoidng collisions with other rsync processes

Use https://github.com/instacart/ohmycron

> MONTHROTATE=monthrotate # use DD instead of YYMMDD

Use https://rotate-backups.readthedocs.io/en/latest/readme.html

> $RSYNC -avR "$SOURCE" "${RSYNCCONF[@]}"

Create a one-command script with the hardcoded rsync command you want to use and take the directory to sync as a command-line argument, e.g.

  #!/bin/sh
  rsync -avR \
    --delete \
    --exclude=/Volumes/Raid/.DocumentRevisions-V100 \
    --exclude=/Volumes/Raid/.TemporaryItems --exclude=/Volumes/Raid/.Trashes \
    --exclude=/Volumes/Raid/.apdisc \
    "$@"
> $DATE >> $LOG

Use logrotate

> mail a report

Use Cron's built-in mail feature
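
i.e. in the crontab (address and schedule are placeholders), cron mails any output of the job to that address:

  MAILTO=admin@example.com
  30 3 * * * /usr/local/bin/backup.sh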


I didn't read the whole article but the reason I like rsync is that I get files on a normal filesystem. I don't need any special backup software to read or access the files.


I totally agree. Do not underestimate the obstacle a family member will face when he needs to open a borg or restic repository to access a backup of your family photos.

I use restic for most of my backups but hand-written rsync on an external drive with ntfs for the stuff that would be important to my family.


Restic is currently the best solution in my view.

The backup script is much simpler, the repository is properly encrypted, and it takes snapshots, deduplicates, can mount the remote repository, integrates with backends, has clean output, and has various useful features for working with repositories.

Rsync over SSH is not even encrypted at rest.


My go-to backup solution, which can also manage rsync over ssh (among plenty of other things), is synbak[0]. Short wrapper script to automatically mount the backup medium, and it's really simple to (automatically) run a backup job. I use encrypted (LUKS) USB drives for that at home. Highly recommend it.

For encrypted backups to e.g. a NAS, Duplicity[1] is my go-to choice (full backup every month or so, with incremental backup every day inbetween).

[0]: https://github.com/ugoviti/synbak

[1]: https://duplicity.gitlab.io/duplicity-web/


Can't help but throw a couple cents/pennies around whenever an "almost perfect" or "advanced" script makes my knee jerk so wildly. So here goes. For this part:

    pro_on=$(ps aux | grep -c rsync)

    # if someone is using rsync
    # grep is also producing one entry so -gt 1
`grep` can be excluded from the output by putting square brackets around one character of the search pattern, like so:

    pro_on=$(ps aux | grep -c 'rsyn[c]')
Also, this:

    minutes=$(($minutes - 1))
Can be more easily written as:

    ((minutes--))


> `grep` can be excluded from the output by putting square brackets around one character of the search pattern

That's a great trick, thanks! I'm so used to adding '| grep -v grep' at the end of a 'ps | grep' command, your way is much nicer.


On Linux, pgrep is probably what you need most of the time, or pkill.
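
e.g. for the check in this script, something like:

  # counts matching processes itself, so no need to filter out the grep
  pro_on=$(pgrep -cx rsync)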


I like the `psgrep` tool, which behaves much more like grepping ps than pgrep does. And `psgrep` is actually just a bash script itself.


THX, the "almost perfect" is getting closer to "perfect"!


Guess they never heard of rsnapshot?? [0]

[0]: https://github.com/rsnapshot/rsnapshot


I love the irony of a document about a "perfect" script with a grammatical error in the very first word of the title.


Does anyone have experience with Duplicati [1] or recommend it? I am tentatively looking to make the switch to something a little more sophisticated than manually tar-balling + rsyncing backups.

[1] https://www.duplicati.com/


I've had that in use on my personal PC (Ubuntu) for my personal files, bringing in the pictures via Dropbox. It has worked flawlessly for about 1.5 years and 150GB to Backblaze. What put me off initially is that the most important part of backups, testing the restore process, was/is less documented. I got it tested by just using common sense and a fresh temporary install. (Any pointers to how to do automatic integrity tests greatly appreciated.)


Do yourself a favor and use borg.

https://news.ycombinator.com/item?id=29210222

If you have extra space, you could rclone the result of locally run borg to a cloud storage or find a service that accepts borg directly.


I haven’t used it, but the comments on Reddit indicate there are issues. You may easily find them.


About 10 years ago, I was earning my CS degree, and my Intro to Unix teacher was showing us how to use rsync. He somehow made an error that synced an empty directory to his home directory, deleting everything from his home directory.

The lesson I learned in that class was to not use rsync, sadly.


> The lesson I learned in that class was to not use rsync, sadly.

This is a poor lesson! If your TA tripped on the stairs and broke their nose, would you swear off stairs forever?

rsync is an essential tool for anyone who works with files in the UNIX world. There are other options, sometimes, but rsync is almost-always present, and does almost-everything you usually want.

Any tool can be used incorrectly, but rsync is easy and follows the same pattern as most other UNIX file copy tools. If you can use cp, you can use rsync safely.


I think there's a bug in this; the pro_on calculation doesn't account for multiple lines, it just checks whether the output of grep is greater than 1. grep -c may not be available on all systems, so a more portable way to do this is piping to 'wc -l'.


Honestly, I just use duplicity. It is easy to test backups and has a nice GUI if you're that way inclined. I'm too old for bash scripts when the stakes are a lifetime of memories or critical data at work.


I wish more of these backup solutions (even simple scripts like this) would include zero trust features.

I'd like to have a backup system that I didn't just kludge together myself over a weekend that accomplishes both delta change detection and rotation as well as allows me to keep my data encrypted with a key only I control.


Haven't read the page yet but the header image struck me as interesting.

"Molex to SATA, lose all your data" was drilled into my head at an early job and appears to be what's depicted there.

If your disks are set up in a similar way, you might have a very immediate need for a good backup script :)


Any SATA connector that is powering the drive has little pins, can be not well made, and can cause a fire. Yes, on the Molex side a big amount of amperes can go through. But you will hardly find a PSU with 24 direct outputs for HDDs.


That's a weird mantra, molex to sata is perfectly fine.


I'd never heard of it either but found this interesting youtube video: <https://www.youtube.com/watch?v=fAyy_WOSdVc>.

tl;dw: many of these adapters are so poorly made they might catch on fire. They put the (insulated) wires in and then pour in something kind of like hot-melt glue around them (to hold them in place, and/or for extra insulation, not sure). Sometimes the wires are well separated within that medium, but sometimes they're touching, and possibly the wire's own insulation is melted by the glue. There also seems to be some contaminants in the glue, and possibly corrosion from the way it was soldered.

At the end, he shows a better kind where crimped wires are put into hard plastic channels, which reliably keeps them in place without the same risk of compromising their insulation.


could you explain why it's a bad idea to use molex to sata?

did you guys use shitty injection-molded connectors that melted, or is there another reason?


I didn't really know why until I read scottlamb's comment, and yours.

I think this is the only plausible explanation - someone had a bad experience with a really crappy connector and it stuck.


The script could use --rsync-path='sudo rsync', so that your rsync backup script can access root-owned files on the remote system. It looks like you can add that to the RSYNCCONF part of "subpart of part 2".
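
Guessing at the script's variable layout, something along these lines (the remote user then needs passwordless sudo for rsync):

  # append to whatever the script already puts in the RSYNCCONF array
  RSYNCCONF+=(--rsync-path="sudo rsync")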


For security reasons the user "backup" on the Linux system acting as backup server is not part of the sudo users. Having sudo rights will, to my knowledge, not help to access root files on the remote system?


You can already solve this quite trivially with Dropbox.


I use duplicity. Far from perfect, but I hope it's better than rsync and similar.


Very useful article and thread comments, thank you all.



