
BLU Discuss list archive



[Discuss] On-site backups revisited - rsnapshot vs. CrashPlan



Rich Braun wrote:
> * Supplement rsnapshot with a script to make sha256sum checksums of
>   the archive contents, stored in a simple db table
> 
> I'm not sure how aggressive I have to be with the integrity checking -- I've
> actually never had a known instance of a file getting corrupt -- but I figure
> it's worthwhile for a long-term archive.  Have any of you found or developed
> tools for this part of it, in particular doing it in conjunction with
> rsnapshot or another similar tool?

So the scenario you are trying to protect against is one in which your
source files are good, but your snapshot files become corrupted while
keeping their original size and timestamp, and thus are not overwritten
by rsync?

I was going to suggest a file integrity checker, like integrit (or
Tripwire, though integrit is lighter weight and probably more easily
repurposed for this), but rsync should be able to do this for you.

Quoting the man page:

  -c, --checksum
    This changes the way rsync checks if the files have been changed and
    are in need of a transfer.  Without this option, rsync uses a "quick
    check" that (by default) checks if each file's size and time of
    last modification match between the sender and receiver. This
    option changes this to compare a 128-bit checksum for each file that
    has a matching size.

This is I/O intensive, since both sides have to read every file in full,
so it isn't something to use on every backup run, but you could do it
periodically.

One downside is that a checksum failure will be treated the same as any
normal difference from the last snapshot, and will just silently result
in a new copy of the file being made and the hardlink being broken. You
won't get any warnings from this if your backup medium is making a habit
of corrupting files.
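
One way to get a report instead of a silent re-copy might be to run the
checksum pass as a dry run. The sketch below is only an illustration:
the function names and paths are hypothetical, and combining --checksum
with --dry-run and --itemize-changes is my assumption about how to use
it, not something rsnapshot does for you. Note that it lists every file
that would be transferred, so files that legitimately changed at the
source since the last snapshot will show up alongside any corrupted
ones.

```python
import subprocess

def build_verify_command(source, snapshot):
    """Return an rsync argv that compares by checksum without copying."""
    return [
        "rsync",
        "--archive",          # recurse and compare metadata like a normal run
        "--checksum",         # compare file contents, not just size and mtime
        "--dry-run",          # report only; leave the snapshot untouched
        "--itemize-changes",  # print one line per item that differs
        source.rstrip("/") + "/",
        snapshot.rstrip("/") + "/",
    ]

def changed_files(rsync_output):
    """Extract paths from itemized lines that describe a file transfer."""
    paths = []
    for line in rsync_output.splitlines():
        # Itemized lines look like ">fc........ path"; ">f" marks a regular
        # file that rsync would send to the receiver.
        if line.startswith(">f"):
            paths.append(line.split(" ", 1)[1])
    return paths

def report(source, snapshot):
    """Run the dry-run comparison and return the paths that differ."""
    result = subprocess.run(
        build_verify_command(source, snapshot),
        capture_output=True, text=True, check=True,
    )
    return changed_files(result.stdout)
```

Something like report("/source", "/snapshot") could then be run from
cron every so often, with any non-empty result mailed out.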

If your source file system supported native snapshots, you could take a
snapshot, run your backup from that, and then do a checksum comparison
against your source snapshot, sending the log of any spotted differences
as an alert. But if your source file system supported native snapshots,
you likely wouldn't be using rsnapshot.
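
For what it's worth, the checksum comparison itself is not much code.
Here is a minimal sketch, assuming you have some frozen copy of the
source to compare against (a native snapshot, or just a quiet file
system). The function names are made up for illustration, and it uses
sha256 since that's what Rich mentioned.

```python
import hashlib
import os

def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large files don't have to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def compare_trees(frozen_source, backup):
    """Return (problem, relative_path) pairs for files that are missing
    from the backup or whose contents differ from the frozen source."""
    problems = []
    for dirpath, _dirnames, filenames in os.walk(frozen_source):
        for name in filenames:
            src = os.path.join(dirpath, name)
            # Only regular files are checked; symlinks and devices are skipped.
            if os.path.islink(src) or not os.path.isfile(src):
                continue
            rel = os.path.relpath(src, frozen_source)
            dst = os.path.join(backup, rel)
            if not os.path.isfile(dst):
                problems.append(("missing", rel))
            elif sha256_of(src) != sha256_of(dst):
                problems.append(("mismatch", rel))
    return sorted(problems)
```

A non-empty result from compare_trees() is the log you would send as an
alert.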


> * Make a tool that makes it more obvious to me whether a given local
>   directory or computer is being backed up

Always a challenge with backups. Probably best done using a tool that is
completely independent of your backup tool.

diff -rq /source /snapshot

is one possibility, but you need to deal with separating the things that
changed since the last snapshot was made from the important differences.
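
A rough way to do that separation is to ignore anything modified after
the snapshot was taken. The sketch below is a heuristic under that
assumption, not a tested tool: unexplained_gaps() is a hypothetical
name, snapshot_time is a Unix timestamp you would have to supply
yourself, and a file moved into place with an old timestamp would be
flagged even though it is legitimately new.

```python
import os

def unexplained_gaps(source, snapshot, snapshot_time):
    """List source files absent from the snapshot that predate it."""
    gaps = []
    for dirpath, _dirnames, filenames in os.walk(source):
        for name in filenames:
            src = os.path.join(dirpath, name)
            rel = os.path.relpath(src, source)
            if os.path.lexists(os.path.join(snapshot, rel)):
                continue
            # A file modified after the snapshot ran is expected to be
            # missing; one that is older should have been picked up.
            if os.lstat(src).st_mtime <= snapshot_time:
                gaps.append(rel)
    return sorted(gaps)
```

Anything this returns existed before the snapshot ran and still didn't
make it in, which usually means an exclude rule or a directory that was
never added to the backup.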

That's something you might do infrequently, while regular checks just
monitor that snapshot dates and sizes follow the statistical averages
for backups from a given machine. For example, after 5 days with no new
files, trigger an alert.
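
The "5 days with no new files" check could look something like the
following. This is a sketch only: is_stale() and newest_mtime() are
hypothetical names, and it assumes the backup preserves modification
times (as rsync -a does), so the newest mtime in the snapshot tree
reflects the most recent change that was actually backed up.

```python
import os
import time

def newest_mtime(root):
    """Return the most recent file mtime under root, or 0.0 if empty."""
    newest = 0.0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            mtime = os.lstat(os.path.join(dirpath, name)).st_mtime
            newest = max(newest, mtime)
    return newest

def is_stale(snapshot_root, max_age_days=5, now=None):
    """True if nothing in the snapshot is newer than max_age_days."""
    now = time.time() if now is None else now
    return now - newest_mtime(snapshot_root) > max_age_days * 86400
```

An empty or missing snapshot tree counts as stale, which is probably
what you want for a machine that silently dropped out of the backup.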

 -Tom

-- 
Tom Metro
Venture Logic, Newton, MA, USA
"Enterprise solutions through open source."
Professional Profile: http://tmetro.venturelogic.com/



BLU is a member of BostonUserGroups
We also thank MIT for the use of their facilities.




Boston Linux & Unix / webmaster@blu.org