Your script doesn't do the same thing as duplicity. Your script mirrors the local directory to your bucket, so it loses all history. Duplicity does backups (i.e. with history), and not just that, it does differential backups so it doesn't have to upload everything every time.
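For reference, a bare-bones duplicity run looks something like this (bucket name, paths, and the boto3+s3:// URL scheme are my assumptions; the exact scheme depends on your duplicity version and backend, and credentials come from the usual boto3 chain):

```sh
# Encrypt archives with this passphrase (GPG key setups also work).
export PASSPHRASE='correct-horse-battery-staple'

# First run does a full backup; later runs are incremental by default,
# so only changed data is packed and uploaded.
duplicity ~/documents boto3+s3://my-backup-bucket/documents

# Inspect the backup chain, or restore the latest state somewhere else.
duplicity collection-status boto3+s3://my-backup-bucket/documents
duplicity restore boto3+s3://my-backup-bucket/documents /tmp/restore
```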
S3 has bucket versioning if you want to keep multiple backups. The S3 sync command also does differential backups; if you run the script over and over, for example, it will only upload new/changed files.
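A minimal sketch of that combination (bucket name and paths are placeholders):

```sh
# Keep old object versions around so overwrites and deletes aren't fatal.
aws s3api put-bucket-versioning \
  --bucket my-backup-bucket \
  --versioning-configuration Status=Enabled

# Mirror the directory; on repeat runs only files whose size or timestamp
# has changed get uploaded again.
aws s3 sync ~/documents s3://my-backup-bucket/documents/
```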
The major issue out of the box vs. any deduping backup software is that S3 doesn't support any deduplication. If you move or rename a 15GB file you have to upload the whole thing again, and you store (and pay for) a second copy until your bucket's lifecycle policy purges the previously uploaded file you've deleted. aws s3 sync is also much slower, since it has to iterate over all of the files to check whether their size/timestamp has changed. Something like borgbackup is much faster because it uses smarter caching to skip unchanged directories and so on.
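For comparison, a typical borg setup looks roughly like this (host and repo path are placeholders; note borg wants a local or SSH-reachable repository rather than S3 directly, so people usually pair it with an SSH box or rclone):

```sh
# One-time: create an encrypted, deduplicating repository.
borg init --encryption=repokey-blake2 ssh://backup-host/./borg-repo

# Each run stores only chunks it hasn't seen before, and borg's local
# cache lets it skip unchanged files without re-reading them.
borg create --stats --compression zstd \
  ssh://backup-host/./borg-repo::'docs-{now}' ~/documents

# Thin out old archives instead of keeping (and paying for) every copy.
borg prune --keep-daily 7 --keep-weekly 4 --keep-monthly 6 \
  ssh://backup-host/./borg-repo
```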
It's possible to find probable duplicate files with the S3 CLI based on size and tags - I was working on a script to do just that but haven't finished it yet. Alternatively, if you want exact backups of your computer you can use the --delete flag, which deletes files in the bucket that aren't in the source.
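Roughly the kind of thing I mean, as an unpolished sketch (bucket name is a placeholder, and same-size objects are only *probable* duplicates, so you'd still want to compare ETags/checksums or tags before acting on them):

```sh
# List every object as "<size> <key>", sort by size, and print any
# group of keys that share a size -- candidates for a closer look.
aws s3api list-objects-v2 \
  --bucket my-backup-bucket \
  --query 'Contents[].[Size, Key]' \
  --output text |
  sort -n |
  awk '$1 == prev { if (!shown) print prevline; print; shown = 1; next }
       { shown = 0; prev = $1; prevline = $0 }'

# Exact mirror: also delete bucket objects that no longer exist locally.
aws s3 sync ~/documents s3://my-backup-bucket/documents/ --delete
```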
I agree this is not the absolute most optimized solution, but it works quite well for me and is easily extensible with other scripts and S3 CLI commands. Theoretically if Borgbackup or Duplicity are backing up to S3 they're using all the same commands as the S3 CLI/SDK.
> Theoretically if Borgbackup or Duplicity are backing up to S3 they're using all the same commands as the S3 CLI/SDK.
They are not. Both Borg and Duplicity pack files into compressed, encrypted archives before uploading them to S3; "s3 sync" literally just uploads each file as an object with no additional processing.
If I have to choose between hacking together a bunch of shell scripts to do my deduplicated, end-to-end encrypted backups, vs. using a popular, well-tested, off-the-shelf open source solution, I know which one I'm picking!
That's not differential backups; that's optimizing the mirroring algorithm.
Differential backup here means that if a file has changed, you only send the change delta, not the whole file. That's what makes it possible to run this kind of thing every hour if needed, even on large folders.
If S3 supported that, then combined with the versioning it already has you'd have a pretty kickass solution; that's basically what you can do with rsync and a ZFS filesystem, for example.
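Something along these lines, as a rough sketch (host and dataset names are made up): rsync only ships the changed parts of files, and each ZFS snapshot gives you a cheap point-in-time version to roll back to.

```sh
# Delta-transfer changes into a ZFS-backed copy; --inplace keeps rsync
# from rewriting whole files, so snapshots only retain the changed blocks.
rsync -a --delete --inplace \
  /home/me/documents/ backup-host:/tank/backup/documents/

# Freeze the current state as a read-only, nearly-free version.
ssh backup-host zfs snapshot "tank/backup@$(date +%Y-%m-%d_%H%M)"
```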