
Too complicated to even try to read. Rsync is great but I've switched to Borg for backups. Borg isn't perfect but it is a higher level approach to backups, as it were. Hetzner recently dropped the price of their Storage Box backup product to about 2 euro per TB per month, and Borg works nicely with it. Borg encrypts all the backup contents and conceals the metadata on the backup server, and yet you can (with the encryption passphrase) mount the backup archive as a read-only file system through FUSE and access it through normal file navigation. It is impressive.



Make sure your borg repos are copying properly and in full!

I had a horrible realisation that my borg backups were timing out on the offsite copy, meaning the resulting offsite backup I had was non-existent. The heartbreaking error message: Inconsistency detected. Please run "borg check [repository]" - although likely this is "beyond repair".

Of course, if you follow the 3-2-1 rule you're probably good, but aye, I'm treating borg repos as delicately as I would a striped RAID now. I am also monitoring that `borg check` actually comes back successfully before considering the backup complete :)
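A minimal sketch of that "not complete until `borg check` passes" idea, in Python. The function names and the injectable `runner` are hypothetical, added only so the logic can be exercised without Borg installed; `borg create` and `borg check` are the real subcommands. It treats any non-zero exit code as a failure, which is a deliberately strict assumption.

```python
import subprocess
from typing import Callable, Sequence

# A "runner" takes a command and returns its exit code. Injecting it keeps
# the logic testable on a machine that has no Borg installed.
Runner = Callable[[Sequence[str]], int]


def run_command(cmd: Sequence[str]) -> int:
    """Default runner: execute the command and return its exit code."""
    return subprocess.run(list(cmd)).returncode


def backup_and_verify(repo: str, archive: str, paths: Sequence[str],
                      runner: Runner = run_command) -> bool:
    """Return True only if both `borg create` and `borg check` exit with 0."""
    steps = [
        ["borg", "create", f"{repo}::{archive}", *paths],
        ["borg", "check", repo],
    ]
    for cmd in steps:
        code = runner(cmd)
        if code != 0:
            # Any non-zero exit is reported loudly and stops the run.
            print(f"FAILED (exit {code}): {' '.join(cmd)}")
            return False
    return True
```

A wrapper script could call `backup_and_verify(...)` and only mark the night's backup as done (or stay quiet) when it returns `True`.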

Also, if you're in a small company, startup, etc., set yourself up a 3-2-1+ backup system. The + is the miracle backup that makes you look good.


> I had a horrible realisation that my borg backups were timing out ... meaning the ... backup I had was ...

A backup that isn't tested is not a true backup, it is a disappointment waiting to be found! This can happen with any backup tool, my hand-crafted¹ rsync based scripts included.

Testing isn't hard to set up if you don't mind the final step being manual. Snapshots have a checksum file, and each day one is picked and rechecked; any difference is an indication of bit-rot on the storage medium or of something accidentally getting deleted or modified. After the newest copy is created by the main backup script, a list of files not touched since the backup started is made and pushed up; the backup site checks those files and sends the result back so it can be compared. Any difference in these checksums results in an email to an account that makes my phone shout in a distressed manner. The manual part is occasionally checking the results by hand, because not getting an email could mean either that all is well or that something has broken to the point that the checks aren't running at all².
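The checksum-file idea can be sketched roughly as follows. This is not the commenter's script; the function names and the choice of SHA-256 are assumptions made for illustration. One function records a checksum per file in a snapshot, the other re-reads the files later and reports anything missing or changed.

```python
import hashlib
from pathlib import Path
from typing import Dict, List


def sha256_of(path: Path) -> str:
    """Hash a file in 1 MiB chunks so large files don't fill memory."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()


def build_manifest(root: Path) -> Dict[str, str]:
    """Map each file's path (relative to root) to its checksum."""
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*")) if p.is_file()
    }


def verify_manifest(root: Path, manifest: Dict[str, str]) -> List[str]:
    """Return relative paths that are missing or whose checksum changed."""
    problems = []
    for rel, expected in manifest.items():
        p = root / rel
        if not p.is_file() or sha256_of(p) != expected:
            problems.append(rel)
    return sorted(problems)
```

An empty list from `verify_manifest` means the snapshot still matches what was recorded; anything else is the cue for the distressed phone.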

For specific systems like my mail server I have a replica VM, not visible to the Internet at large, that wipes itself and restores from the latest off-site backup. I look at that occasionally to see that it is running and has the messages I've recently sent and received. As a bonus this VM could quickly be made publicly available and take over with a few firewall and DNS changes, should the main mail server or its host physically die. Even if it doesn't take over, its existence proves that the restore method is reliable should I need to restore the one in the primary location. Some extra automated checks could be added to this too, but again there comes a point where writing the checks takes more time than just doing that manually³ and I'd still do it manually out of paranoia anyway.

[1] If I'm honest, “strung together” would be much more accurate than “crafted”

[2] I could automate that a bit too, but then that automation still needs to be verified occasionally. It quickly gets to the point where it is double-checks all the way down, and making sure you check manually occasionally is a far, far more maintainable system.

[3] If these were a business thing rather than personal services, then the automated procedure vs manual checks desirability balance might change somewhat.
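On the point that silence can mean either "all is well" or "the checks stopped running": one common way to tell the two apart is a heartbeat file that the check job updates on success, with a separate watcher that complains when it goes stale. The sketch below is only an illustration of that pattern; the file name, function names, and thresholds are hypothetical, and it assumes timezone-aware datetimes throughout.

```python
from datetime import datetime, timedelta, timezone
from pathlib import Path


def utc_now() -> datetime:
    """Current time as a timezone-aware UTC datetime."""
    return datetime.now(timezone.utc)


def record_success(stamp_file: Path, now: datetime) -> None:
    """Called by the check job after a clean run."""
    stamp_file.write_text(now.isoformat())


def is_stale(stamp_file: Path, now: datetime, max_age: timedelta) -> bool:
    """True if the last recorded success is missing, unreadable, or too old."""
    try:
        last = datetime.fromisoformat(stamp_file.read_text().strip())
    except (OSError, ValueError):
        return True
    return now - last > max_age
```

The watcher would run from a different machine or scheduler than the checks themselves, for example `is_stale(Path("last_ok"), utc_now(), timedelta(days=2))`, and of course it still wants an occasional manual look, as the footnote says.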


> A backup that isn't tested is not a true backup, it is a disappointment waiting to be found

Yup. 500 gigs of disk seatfiller (a full backup in this instance would have been somewhere between 1 and 2 TB)

Live and learn.. and learn again later when you let yourself slip :)

My backup process is a LOT noisier if any of the commands fail/timeout/dirs-arent-same-size-after now!


> Live and learn.. and learn again later when you let yourself slip :)

Definitely. I'm as careful as I am due to past issues, either my own or those of others that I've witnessed. Seeing the look on someone's face when they ask “You know about these things, you can do something to get it back right? Right?!”, and having to let them down…


This is why you gotta test your backups too


Indeed. You only get burned once

.. and then apparently once again a decade later when you let your guard down


Can Borg do this? Like, something you can set up to run once a month or so?


Borg has a command ("borg check") to test backup metadata. I'm not sure exactly what it does, but it is pretty slow, about 80 minutes to check a 1.6TB backup depending on how busy the backup server is. Another approach might be to mount the backup as a FUSE filesystem and monitor it with something like tripwire. I just thought of that a minute ago and haven't looked into it, so idk if it is workable. At the end of the day you have to occasionally do a full restore to another system, and test everything.
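For the "restore to another system and test" step, a rough sketch of the comparison part: it assumes the archive has already been restored into a scratch directory (for example with `borg extract`, or read through `borg mount`) and simply compares that tree with a reference copy byte for byte. The function names are hypothetical. Files that changed on the live system after the backup ran will show up as differences, so this is best pointed at data that should be static.

```python
import filecmp
from pathlib import Path
from typing import Dict, List, Set


def list_files(root: Path) -> Set[str]:
    """Relative paths of all regular files under root."""
    return {p.relative_to(root).as_posix() for p in root.rglob("*") if p.is_file()}


def compare_trees(original: Path, restored: Path) -> Dict[str, List[str]]:
    """Report files missing from, extra in, or different in the restored tree."""
    a, b = list_files(original), list_files(restored)
    different = [
        rel for rel in sorted(a & b)
        # shallow=False forces a content comparison, not just size/mtime
        if not filecmp.cmp(original / rel, restored / rel, shallow=False)
    ]
    return {
        "missing": sorted(a - b),
        "extra": sorted(b - a),
        "different": different,
    }
```

A monthly job could run the restore, call `compare_trees`, and alert if any of the three lists is non-empty.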


Fully agree, Borg is awesome. I wrote up how I'm using it here[1]. In short, borg backup to a local machine, and that machine uses rclone to copy the backups to an S3 bucket off-site. I've had multiple occasions to restore stuff successfully, and I never have to think about whether my data is retrievable.

[1]https://opensource.com/article/17/10/backing-your-machines-b...


You need to run “borg check” once in a while to check backup integrity (fixing errors needs “borg check --repair”). S3 egress fees will kill you!


There are several targets for borg that don't charge for traffic. Get off of AWS...


I used to do something similarly complex to the OP with ssh and --link-dest to make it share inodes so I could easily keep N days of backups with file level deduplication [1].
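For readers who haven't met the `--link-dest` trick: each new snapshot is a full directory tree, but files unchanged since the previous snapshot are hard links to the old copy, so they share an inode and take no extra space. The sketch below imitates that in plain Python to show the mechanism; it is not rsync and not the linked script. It handles only regular files and directories, and it decides "unchanged" by comparing content, whereas rsync by default goes by size and modification time.

```python
import filecmp
import os
import shutil
from pathlib import Path
from typing import Optional


def snapshot(source: Path, dest: Path, previous: Optional[Path] = None) -> None:
    """Copy source into dest, hard-linking files unchanged since previous."""
    dest.mkdir(parents=True, exist_ok=True)
    for src in sorted(source.rglob("*")):
        rel = src.relative_to(source)
        target = dest / rel
        if src.is_dir():
            target.mkdir(parents=True, exist_ok=True)
            continue
        target.parent.mkdir(parents=True, exist_ok=True)
        old = previous / rel if previous is not None else None
        if old is not None and old.is_file() and filecmp.cmp(src, old, shallow=False):
            os.link(old, target)       # unchanged: share the inode with the old snapshot
        else:
            shutil.copy2(src, target)  # new or changed: store a fresh copy
```

Keeping N days then amounts to deleting the oldest snapshot directory; the data of a file disappears only when the last snapshot linking to it is removed.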

Then I moved to Borg and haven't looked back. It does the same end goal in a better way, is way faster, and easier to work with overall [2].

[1]: https://nuxx.net/blog/2009/12/06/time-machine-for-freebsd/ [2]: https://nuxx.net/blog/2019/11/10/using-borg-for-backing-up-n...


The cheapest I can find on https://www.hetzner.com/storage/storage-box is 3.49 Euro. I assume you're talking about the bigger plans with price / TB?


Yes, the bigger ones are around 2 euro per TB. They also have Storage Share at around 3 euro/TB in the bigger sizes. Those have more features (they run nextcloud) and are themselves backed up several times a day.


I've been using Time4vps. It's a little bit more expensive, but it's also a full linux box you can do anything with...

I dump my backups and also run a calibre server.


I used rsync 20 years ago, switched to Borg, then switched to Restic. It’s nice to send backups to destinations like B2 or S3 without any special requirements. But the Hetzner Storage Box is a good deal.

Now all I need is better client side software for these things, something I could use on my mom's computer. Vorta is not that good, although I commend the effort.

Until the UI side is sorted, my mom's laptop stays on Backblaze.


As a small film production we have approx. 120 TB to back up. That's a lot of money per month; if you check AWS, it is a fortune (half the value of a car) per month.
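For a rough sense of scale, here is the arithmetic using the "about 2 euro per TB per month" figure quoted earlier in the thread. It assumes that rate applies linearly at 120 TB and that the capacity can actually be bought at that size, neither of which the thread confirms; plug in any provider's real per-TB rate instead.

```python
def monthly_cost(terabytes: float, price_per_tb: float) -> float:
    """Monthly storage cost, assuming a flat per-TB rate (an assumption)."""
    return terabytes * price_per_tb


def yearly_cost(terabytes: float, price_per_tb: float) -> float:
    """Twelve months at the same flat rate."""
    return 12 * monthly_cost(terabytes, price_per_tb)
```

With those assumptions, `monthly_cost(120, 2.0)` gives 240.0 euro per month and `yearly_cost(120, 2.0)` gives 2880.0 euro per year, before any traffic or egress charges.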


> Storage Box backup product

The script here really only makes sense for servers you physically own. You wouldn't accept SSH access from a key located on a plain VPS, right? Also this script doesn't seem to encrypt the data at all!! Very dangerous on a VPS.


> You wouldn't accept SSH access from a key located on a plain VPS, right?

With Borg, I use ssh -A from my laptop to start backups going. I can think of some other schemes like using multiple user accounts on a single Storage Box (Hetzner supports that and I believe offers an API) so that different VPS can't clobber each other's backups, that old backups become read-only, etc. It might be interesting to add some finer access controls on the server side. Borg supports an append-only mode but right now, that's only a config option rather than a security setting, I believe.

I've only recently started using Borg so I'm not really familiar with its intricacies yet. There are some things I would change but it is mostly well thought out, imho.


"Hetzner recently dropped the price of their Storage Box backup product to about 2 euro per TB per month, and Borg works nicely with it."

I'm still not clear - does the Hetzner storage box have the borg binary installed on their end? As in, one could run:

  ssh user@hetzner borg --version

... or are you accessing borg over an sshfs mount, etc.?

Asking for a friend ...


You can't run arbitrary shell commands on the storage box server. They have the Borg binary (1.17 last time I looked) installed on the server, and they officially support it. They seem to specially recognize the borg commands.


The first. Hetzner officially support Borg on their storage box.


Is it possible to use it as a read/write mounted drive on macOS?


I think restic[1] does provide this.

[1] https://restic.net/


Thank you!



