Wednesday 5 February 2014

RSYNC BACKUP

rsync:
rsync is a file transfer program for Unix systems. rsync uses the "rsync algorithm" which provides a very fast method for bringing remote files into sync. It does this by sending just the differences in the files across the link, without requiring that both sets of files are present at one of the ends of the link beforehand.

Some features of rsync include:

  • can update whole directory trees and filesystems
  • optionally preserves symbolic links, hard links, file ownership, permissions, devices and times
  • requires no special privileges to install
  • internal pipelining reduces latency for multiple files
  • can use rsh, ssh or direct sockets as the transport
  • supports anonymous rsync which is ideal for mirroring

backup to a central backup server with 7 day incremental

#!/bin/sh

# This script does personal backups to a rsync backup server. You will end up
# with a 7 day rotating incremental backup. The incrementals will go
# into subdirectories named after the day of the week, and the current
# full backup goes into a directory called "current"

# directory to backup
BDIR=/home/$USER

# excludes file - this contains a wildcard pattern per line of files to exclude
EXCLUDES=$HOME/cron/excludes

# the name of the backup machine
BSERVER=owl

# your password on the backup server
export RSYNC_PASSWORD=XXXXXX

###############################################

BACKUPDIR=`date +%A`
OPTS="--force --ignore-errors --delete-excluded --exclude-from=$EXCLUDES 
      --delete --backup --backup-dir=/$BACKUPDIR -a"

export PATH=$PATH:/bin:/usr/bin:/usr/local/bin

# the following line clears the last weeks incremental directory
[ -d $HOME/emptydir ] || mkdir $HOME/emptydir
rsync --delete -a $HOME/emptydir/ $BSERVER::$USER/$BACKUPDIR/
rmdir $HOME/emptydir

# now the actual transfer
rsync $OPTS $BDIR $BSERVER::$USER/current

backup to a spare disk
I do local backups on several of my machines using rsync. I have an extra disk installed that can hold all the contents of the main disk. I then have a nightly cron job that backs up the main disk to the backup. This is the script I use on one of those machines.

    #!/bin/sh

    export PATH=/usr/local/bin:/usr/bin:/bin

    LIST="rootfs usr data data2"

    for d in $LIST; do
mount /backup/$d
rsync -ax --exclude fstab --delete /$d/ /backup/$d/
umount /backup/$d
    done

    DAY=`date "+%A"`
    
    rsync -a --delete /usr/local/apache /data2/backups/$DAY
    rsync -a --delete /data/solid /data2/backups/$DAY
   
The first part does the backup on the spare disk. The second part backs up the critical parts to daily directories.  I also backup the critical parts using a rsync over ssh to a remote machine.

mirroring vger CVS tree
The vger.rutgers.edu cvs tree is mirrored onto cvs.samba.org via anonymous rsync using the following script.

    #!/bin/bash

    cd /var/www/cvs/vger/
    PATH=/usr/local/bin:/usr/freeware/bin:/usr/bin:/bin

    RUN=`lps x | grep rsync | grep -v grep | wc -l`
    if [ "$RUN" -gt 0 ]; then
   echo already running
   exit 1
    fi

    rsync -az vger.rutgers.edu::cvs/CVSROOT/ChangeLog $HOME/ChangeLog

    sum1=`sum $HOME/ChangeLog`
    sum2=`sum /var/www/cvs/vger/CVSROOT/ChangeLog`

    if [ "$sum1" = "$sum2" ]; then
   echo nothing to do
   exit 0
    fi

    rsync -az --delete --force vger.rutgers.edu::cvs/ /var/www/cvs/vger/
    exit 0

Note in particular the initial rsync of the ChangeLog to determine if anything has changed. This could be omitted but it would mean that the rsyncd on vger would have to build a complete listing of the cvs area at each run. As most of the time nothing will have changed I wanted to save the time on vger by only doing a full rsync if the ChangeLog has changed. This helped quite a lot because vger is low on memory and generally quite heavily loaded, so doing a listing on such a large tree every hour would have been excessive.

automated backup at home
I use rsync to backup my wifes home directory across a modem link each night. The cron job looks like this

    #!/bin/sh
    cd ~susan
    {
    echo
    date
    dest=~/backup/`date +%A`
    mkdir $dest.new
    find . -xdev -type f \( -mtime 0 -or -mtime 1 \) -exec cp -aPv "{}"
    $dest.new \;
    cnt=`find $dest.new -type f | wc -l`
    if [ $cnt -gt 0 ]; then
      rm -rf $dest
      mv $dest.new $dest
    fi
    rm -rf $dest.new
    rsync -Cavze ssh . samba:backup
    } >> ~/backup/backup.log 2>&1

note that most of this script isn't anything to do with rsync, it just creates a daily backup of Susans work in a ~susan/backup/ directory so she can retrieve any version from the last week. The last line does the rsync of her directory across the modem link to the host samba. Note that I am using the -C option which allows me to add entries to .cvsignore for stuff that doesn't need to be backed up.

Fancy footwork with remote file lists
One little known feature of rsync is the fact that when run over a remote shell (such as rsh or ssh) you can give any shell command as the remote file list. The shell command is expanded by your remote shell before rsync is called. For example, see if you can work out what this does:

rsync -avR remote:'`find /home -name "*.[ch]"`' /tmp/

note that that is backquotes enclosed by quotes (some browsers don't show that correctly).

No comments:

Post a Comment