Synchronize files with rsync

Synchronizing files from one server to another is quite awesome. You can use it for backups, for keeping web servers in sync, and much more. It's fast, and because rsync only sends the parts of files that have changed, it uses far less bandwidth than a plain copy would. And the best thing is, it can be done with a single command. Welcome to the wonderful world of rsync.

Installing rsync

On most modern Linux distributions you will find rsync comes preinstalled. If that's not the case, just install it with your package manager. On Ubuntu this would look like:

aptitude -y install rsync

Done!
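If you'd rather check first whether rsync is already there, a minimal sketch could look like this. `have_cmd` is a helper name made up for this example, and the hint in the else branch assumes a Debian/Ubuntu system with sudo available.

```bash
#!/bin/bash
# have_cmd is a hypothetical helper: it succeeds if the named command is on PATH.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

if have_cmd rsync; then
    # Print only the first line of the version banner
    rsync --version | head -n 1
else
    echo "rsync not found; on Debian/Ubuntu try: sudo apt-get install rsync"
fi
```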

Simple - one command

Let's copy our local /home/kevin/source to /home/kevin/destination which resides on the server: server.example.com:

rsync -raz --progress --size-only /home/kevin/source/ server.example.com:/home/kevin/destination/

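Explained:

  • -r
    recursive (already implied by -a, so it is harmless but redundant here)
  • -a
    archive, recurses and preserves attributes like permissions, ownership, timestamps and symlinks
  • -z
    compress, saves bandwidth but is harder on your CPU so use it for slow/expensive connections only
  • --progress
    shows you the progress of all the files that are being synced
  • --size-only
    skip files that have the same size on both sides, ignoring timestamps. By default rsync compares size and modification time (not hashes), so this is a bit faster, but a file that changed without changing size will be missed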
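A note on writing the source as `source/*` versus `source/`: the `*` is expanded by your shell before rsync even starts, and it does not match hidden files. The sketch below shows this with a throwaway directory; `demo`, `globbed` and `all` are names invented for the example, and no rsync is needed to run it.

```bash
#!/bin/bash
# Build a small throwaway tree with one visible and one hidden file
demo=$(mktemp -d)
mkdir -p "$demo/source/sub"
touch "$demo/source/visible.txt" "$demo/source/.hidden"

# The shell expands * before rsync runs, and * skips dotfiles
globbed=$(cd "$demo/source" && echo *)
echo "source/* hands rsync: $globbed"

# Listing the directory itself shows what a trailing slash would cover
all=$(ls -A "$demo/source" | tr '\n' ' ')
echo "source/  covers:      $all"

# So to include dotfiles, give rsync the directory with a trailing slash:
#   rsync -az /home/kevin/source/ server.example.com:/home/kevin/destination/
```

The trailing slash means "the contents of this directory", so the files end up directly in destination/ just like with the `*` form, only with the hidden files included.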

Well, that's it! But read on if you want to learn how to automate this!

Advanced - automatic syncing with SSH keys

Alright, so syncing files on Linux is pretty easy. But what if we want to automate this? How can we stop rsync from asking for a password every time?

There are different ways to go about this, but the one I mostly use is installing SSH keys. By installing your SSH key on the destination server, it will recognize you in the future and permit instant access. So this way we can automate the synchronization with rsync.

Easy script

To install your machine's SSH key to another machine, open up a terminal and download & install my instkey.bash script like this:

su -  # If you're going to use the keys to automate tasks, maybe become root first
mkdir -p ~/bin
wget -O- "http://kevin.vanzonneveld.net/download/instkey.bash/" > ~/bin/instkey.bash
chmod 755 ~/bin/instkey.bash

Now install the key on your server

Open a terminal and type:

~/bin/instkey.bash server.example.com

If the script asks for a pass phrase, just hit enter twice. Then it asks for the server.example.com login. Type the password, and that's it. Your current machine can now seamlessly log in to server.example.com.

Did it work?

Open a terminal and type:

ssh server.example.com

It should not ask you for any password. Great! This means we can also run rsync directly without typing a password! If you need more in-depth information on this, I wrote an article on logging in automatically with SSH keys.
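If you prefer not to download a script, roughly the same result can be had with the stock OpenSSH tools. This is only a sketch and not necessarily what instkey.bash does internally: `install_key` is a made-up function name, and the ed25519 key type and the key path are assumptions you can change.

```bash
#!/bin/bash
# install_key is a hypothetical helper, not part of instkey.bash.
# Usage: install_key user@server.example.com
install_key() {
    local target="$1"
    local key="${HOME}/.ssh/id_ed25519"   # assumed key type and path

    # Create a key pair once. -N "" sets an empty pass phrase,
    # which is what hitting enter twice gives you.
    if [ ! -f "$key" ]; then
        mkdir -p "${HOME}/.ssh" && chmod 700 "${HOME}/.ssh"
        ssh-keygen -t ed25519 -f "$key" -N "" -q
    fi

    # Append the public key to ~/.ssh/authorized_keys on the target.
    # This asks for the remote password one last time.
    ssh-copy-id -i "${key}.pub" "$target"
}
```

The function is only defined here, not called. Run `install_key server.example.com` as the user that will later run rsync.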

Let's create a sync script

So now just create a script /root/bin/syncdata.bash

nano /root/bin/syncdata.bash

that contains your rsync command:

#!/bin/bash
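# Trailing slash instead of /* so --delete also catches files removed from the top level (and dotfiles are included)
rsync -raz --delete /home/kevin/source/ server.example.com:/home/kevin/destination/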

Save the file (CTRL+O) and exit (CTRL+X) and make it executable like this:

chmod a+x /root/bin/syncdata.bash
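Once this runs unattended, two things are worth having: a log to look at, and a guard so that a slow sync doesn't overlap with the next one. Below is a sketch of a slightly more defensive version of the script. `run_sync`, `LOCK_DIR`, `LOG_FILE` and `RSYNC_BIN` are names invented for this example, and the /tmp paths are assumptions.

```bash
#!/bin/bash
# Sketch of a more defensive syncdata.bash (all variable names are made up here).
SRC="/home/kevin/source/"
DEST="server.example.com:/home/kevin/destination/"
LOCK_DIR="${LOCK_DIR:-/tmp/syncdata.lock}"   # assumed location
LOG_FILE="${LOG_FILE:-/tmp/syncdata.log}"    # assumed location
RSYNC_BIN="${RSYNC_BIN:-rsync}"

run_sync() {
    # mkdir is atomic: if the directory already exists, a previous run
    # is still busy, so skip this round instead of piling up.
    if ! mkdir "$LOCK_DIR" 2>/dev/null; then
        echo "$(date '+%F %T') previous sync still running, skipping" >> "$LOG_FILE"
        return 1
    fi

    local status=0
    "$RSYNC_BIN" -az --delete "$SRC" "$DEST" >> "$LOG_FILE" 2>&1 || status=$?
    echo "$(date '+%F %T') rsync finished with exit code $status" >> "$LOG_FILE"

    # Release the lock so the next run can start
    rmdir "$LOCK_DIR"
    return "$status"
}

# Left commented out so the sketch has no side effects; enable it in the real script:
# run_sync
```

If the script is killed halfway, the lock directory stays behind and has to be removed by hand. That is the price of keeping the example short.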

Schedule it to run every hour

And to have your data synchronized every hour, open up your crontab editor:

crontab -e

And type

0 * * * * /root/bin/syncdata.bash

(If you need more in-depth information on crontab, I've written another article on scheduling tasks on Linux using crontab.)
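If you are setting up several machines, you may want to add the crontab entry without opening an editor. Here is a sketch that does so without adding the line twice. `add_cron_line` is a hypothetical helper, and redirecting the output to /var/log/syncdata.log is an assumption, not something the script above requires.

```bash
#!/bin/bash
# The entry to install; the log redirection is an assumed extra.
CRON_LINE='0 * * * * /root/bin/syncdata.bash >> /var/log/syncdata.log 2>&1'

# add_cron_line is a hypothetical helper: it reads an existing crontab on
# stdin and prints it with CRON_LINE appended, unless that exact line is
# already present.
add_cron_line() {
    local current
    current=$(cat)
    if printf '%s\n' "$current" | grep -qxF -- "$CRON_LINE"; then
        printf '%s\n' "$current"
    elif [ -z "$current" ]; then
        printf '%s\n' "$CRON_LINE"
    else
        printf '%s\n%s\n' "$current" "$CRON_LINE"
    fi
}

# To apply it to the current user's crontab:
#   crontab -l 2>/dev/null | add_cron_line | crontab -
```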

That's it! New files are automatically synced to server.example.com:/home/kevin/destination/ every hour. Files that are deleted from /home/kevin/source/ are also deleted at the destination, thanks to the --delete parameter.

Some extra rsync command line options

Some extra arguments that might come in handy customizing your synchronization job:

  • --delete
    delete files remotely that no longer exist locally
  • --dry-run
    show what would have been transferred, but do not transfer anything
  • --max-delete=10
    don't delete more than 10 files in one run, safety precaution
  • --delay-updates
    put all updated files into place at transfer's end, very useful for live systems
  • --compress-level=9
    explicitly set compression level 9; 0 disables compression
  • --exclude-from=/root/sync_exclude
    reads exclude patterns (one per line) from the file /root/sync_exclude. Filenames matching these patterns will not be transferred
  • --bwlimit=1024
    limits the transfer rate to 1024 kilobytes per second
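To show how a few of these fit together, here is a sketch that writes a small exclude file and defines a dry-run preview next to the real sync. `preview_sync` and `real_sync` are made-up names, the patterns are only examples, and the exclude file is created with mktemp here so the sketch doesn't touch /root.

```bash
#!/bin/bash
# Example exclude file (patterns are illustrative only)
EXCLUDE_FILE="$(mktemp)"
cat > "$EXCLUDE_FILE" <<'EOF'
# A leading / anchors the pattern at the top of the transfer
# (the source directory), not at the filesystem root.
/cache/
# Without a leading / the pattern matches at any depth.
*.tmp
EOF

# Hypothetical helper: show what would change without touching anything.
# Usage: preview_sync /home/kevin/source/ server.example.com:/home/kevin/destination/
preview_sync() {
    rsync -az --delete --max-delete=10 --exclude-from="$EXCLUDE_FILE" \
        --dry-run --itemize-changes "$1" "$2"
}

# Hypothetical helper: the same job for real, limited to 1024 KB/s.
real_sync() {
    rsync -az --delete --max-delete=10 --exclude-from="$EXCLUDE_FILE" \
        --bwlimit=1024 "$1" "$2"
}
```

Running the preview first is cheap insurance whenever --delete is involved. Note that the patterns are matched relative to the directory being transferred, so an absolute filesystem path in the exclude file will usually not match anything.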

Pitfalls

  • Of course you should be really careful where and when to install SSH keys, because if one machine is compromised, it's very easy for a cracker to hop to the next system without logging in. So choose wisely when to use this technology.
  • Keys are user specific. So if you're going to run programs as root that need to automatically log in to systems, you must also install the key as root.

tags: linux, backup, synchronization, ssh, SSH key, rsync, crontab
category: How to - System

Comments

#9. bishop on 02 January 2008

rsnapshot is a great tool for simplifying typical backup scenarios using rsync. With rsnapshot, you can create incremental backups, run remote programs before/after backups, etc.
http://www.rsnapshot.org/

Also, you can drop your key on the remote server using one command. As mentioned on http://www.bytejar.com/

([ -f .ssh/id_rsa.pub ]||ssh-keygen -t rsa) && ssh user@host "([ -d .ssh ]||mkdir -m 700 .ssh) && cat>>.ssh/authorized_keys && chmod 600 .ssh/authorized_keys" < .ssh/id_rsa.pub

# NOTES:
# replace user@host with the remote user and host you want
# the first time run, this will require you authenticate as user@host.
# subsequent times will be password free.
# this requires that you use a Bourne-derivative shell and GNU mkdir

#8. Kevin (link) on 28 October 2007

@ Logan: Maybe a rights issue? Otherwise type rsync --help to see if the Mac version even supports the --delete option. Maybe it's called differently, I don't really know Mac.

#7. Logan on 28 October 2007

I'm using this command on the OSX Terminal program and everything works except the --delete parameter. I've tried with files and directories and it just doesn't want to delete them.

#6. Kevin (link) on 14 October 2007

Maybe try adding an asterisk to the path you want to exclude like this: /home/jerry/test-rsync/exclude/*

#5. Jerry on 13 October 2007

Hi Kelvin,

I am trying to synchronize

/home/jerry/test-rysnc/*
except this folder in

/home/jerry/test-rsync/exclude

which will include all its subfolders and files.

I have tried my command but still the folder "exclude" gets synchronized.

I am thinking could it be my badly written command or wrong parameter placement.

Thanks.

#4. Kevin (link) on 12 October 2007

@ Jerry: In this case you're excluding every file/dir that starts with /home/[user]/test-rsync/exclude

So if that's what you want it should work.

#3. Jerry on 11 October 2007

Hi,

Can you provide some comment on this instruction

rsync -raz --progress --size-only --exclude=/home/[user]/test-rsync/exclude /home/[user]/test-rsync/* [servername]:/home/[user]/network/administrator/test-rsync/
How can I do a correct exclusion?

Thank you.

#2. Kevin (link) on 04 October 2007

@ Andrew: Then you could use the exclude argument:
--exclude='/home/andrew/remote/movies'

Or create a text file like /root/sync_exclude with all the things you want to exclude separated by newlines and then use:
--exclude-from=/root/sync_exclude

#1. Andrew on 04 October 2007

What if I have network drives mapped in my home dir (/home/andrew/remote/movies) and don't want them being copied?