
Agreed! But rsync isn't nearly as fast over the network as something like this. rsync has to track a lot of per-file metadata, and the bigger the tree, the bigger the slowdown. Tar just screams (try it and compare):

tar -zc /somedir/ | ssh remotebox tar -xzC /somedir/

(You can play with -z vs. ssh -C. In my experience -z is faster, since the data is already compressed before it reaches ssh. Don't do both, though.)

I've also had really great luck doing this on a single filesystem (tar into tar), since it does a better job with things like /dev compared to cp.
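
For the local tar-into-tar case, here is a minimal sketch. The `copy_tree` name and the paths are made up for illustration. It uses `-C` on both sides so member names stay relative. That matters because GNU tar strips the leading `/` from absolute names, so an archive made with `tar -c /somedir/` and unpacked with `-C /somedir/` lands under `/somedir/somedir/`.

```shell
# copy_tree SRC DST: copy the contents of SRC into DST through a tar pipe.
# Hypothetical helper, not from the thread.
copy_tree() {
    src=$1
    dst=$2
    mkdir -p "$dst" || return 1
    # "-C dir ." archives relative member names, so nothing nests under DST.
    # -p on the extracting side keeps the original permissions.
    tar -cf - -C "$src" . | tar -xpf - -C "$dst"
}
```

Usage would be something like `copy_tree /somedir /mnt/copy`. For the network version, the right-hand side of the pipe becomes `ssh remotebox tar -xpf - -C /somedir`.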

rsync has a lot of other awesomeness, though, especially its hard-link capabilities, which can save a huge amount of space on remote backups kept over multiple days. (We use Glacier for backups at Userify (SSH key management), but we also use this trick for remote pull backups to the backup hosts, without server access.)
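
The hard-link trick is usually done with rsync's `--link-dest` option. A rough sketch, assuming one directory per day under a backup root; the `snapshot` name and the layout are invented here, not anything described above:

```shell
# snapshot SRC ROOT NAME [PREVIOUS]: hypothetical helper.
# Copies SRC into ROOT/NAME. If PREVIOUS is given, files unchanged since
# ROOT/PREVIOUS are hard-linked to it instead of being stored again.
# ROOT should be an absolute path, because a relative --link-dest is
# resolved against the destination directory.
snapshot() {
    src=$1
    root=$2
    name=$3
    prev=${4:-}
    mkdir -p "$root" || return 1
    if [ -n "$prev" ]; then
        rsync -a --link-dest="$root/$prev" "$src/" "$root/$name/"
    else
        rsync -a "$src/" "$root/$name/"
    fi
}
```

For example, `snapshot /somedir /backups day2 day1` would give a full-looking `day2` tree that only uses new disk space for files that changed since `day1`.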




The original posts were about progress indication on the file transfers, so I would add an invocation of pv in your example:

    tar -zc /somedir/ | pv | ssh remotebox tar -xzC /somedir/
That won't give a % done, as pv won't know the total size. If you scan the source first and split out the tar and compress stages you can get that, though it'll be a little less efficient:

    tar -c somedir | pv --size "$(du -sb somedir | cut -f1)" | ssh -C remotebox tar -xC /tmp/
And if you want to keep the compression in gzip instead of ssh (I'm not sure it'll make any difference, as ssh IIRC uses the same algorithm, though I've not benchmarked it at all):

    tar -c somedir | pv --size "$(du -sb somedir | cut -f1)" | gzip | ssh remotebox tar -xzC /somedir/
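
If you do this often, the size scan and the pv call can be wrapped up. A small sketch, with the caveats that `tar_with_progress` is a made-up name, `du -b` is GNU-specific, and the percentage is only approximate because the tar stream's size is not exactly du's byte count:

```shell
# tar_with_progress DIR: write DIR as an uncompressed tar stream to stdout.
# Hypothetical helper. When pv is installed, progress goes to stderr;
# otherwise the same stream is produced without a progress display.
tar_with_progress() {
    dir=$1
    # Apparent size in bytes, used as pv's estimate of the total.
    size=$(du -sb "$dir" | cut -f1) || return 1
    if command -v pv >/dev/null 2>&1; then
        tar -cf - "$dir" | pv --size "$size"
    else
        tar -cf - "$dir"
    fi
}
```

It would slot into the pipeline as `tar_with_progress somedir | gzip | ssh remotebox tar -xzC /tmp/`, keeping compression after pv so the byte count pv sees matches the uncompressed estimate.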



