I've been using rsnapshot for several years now and I'm reasonably familiar with it. It was recently suggested that I use rdiff-backup to copy files to a FAT32 file system because it is aware of FAT32 and exFAT file name restrictions. Since then I've been experimenting with rdiff-backup. Here are some of the high and low points of the two.

rsnapshot is, as the name suggests, a snapshot system. It uses a combination of GNU cp's hard link directory replication and rsync itself to maintain time-based snapshots. It functions similarly to Apple's Time Machine with one notable difference. Time Machine's snapshots run back until the disk fills up, at which point the oldest are pruned to make room. rsnapshot's snapshots are rotated at fixed intervals (hourly, daily, weekly, monthly, yearly), with pruning managed by a retention policy.

While I've repeatedly stated -- and still maintain -- that sync is not backup, maintaining many sync-based snapshots is close enough for some uses. rsnapshot shines when you have many users who want to be able to pluck single files from arbitrary times out of a backup system.

There are two big drawbacks to rsnapshot. The first is setup. It's tedious. You need to configure the increments and retention in a configuration file. You need to match up the increments with associated cron jobs. And you need to make sure that the cron jobs are staggered so that they don't step on each other. rsnapshot is smart enough not to let a collision be destructive, but it can mean missed snapshot runs and that's not good for a backup system.

The second is that it is terrible for things like databases that grow forever. A hard link can only stand in for a file that hasn't changed, so each run will copy an entire database dump or log file or whatever in full, which can lead to massively inflated disk usage.

The third -- okay, three big drawbacks -- is that it only works on Unix file systems and their network equivalents.
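
To make the hard link idea concrete, here is a minimal Python sketch of a snapshot step. It is not rsnapshot's code (rsnapshot drives cp and rsync), and the function names, file names, and the size-plus-mtime comparison are all assumptions made for illustration. An unchanged file is hard-linked to the previous snapshot and costs no extra data, while a file that changed at all, such as a growing dump, is copied in full.

```python
import os
import shutil
import tempfile


def unchanged(a, b):
    """Cheap comparison on size and whole-second mtime (hypothetical helper)."""
    sa, sb = os.stat(a), os.stat(b)
    return sa.st_size == sb.st_size and int(sa.st_mtime) == int(sb.st_mtime)


def take_snapshot(source, previous, target):
    """Populate `target` from `source`, hard-linking files unchanged since `previous`."""
    for root, _dirs, files in os.walk(source):
        rel = os.path.relpath(root, source)
        os.makedirs(os.path.normpath(os.path.join(target, rel)), exist_ok=True)
        for name in files:
            src = os.path.join(root, name)
            dst = os.path.normpath(os.path.join(target, rel, name))
            old = os.path.normpath(os.path.join(previous, rel, name)) if previous else None
            if old and os.path.isfile(old) and unchanged(src, old):
                os.link(old, dst)       # same inode as the older snapshot
            else:
                shutil.copy2(src, dst)  # new or changed: a complete copy


def demo():
    """Take two snapshots around an append to one file; report which files share an inode."""
    base = tempfile.mkdtemp()
    src = os.path.join(base, "src")
    os.makedirs(src)
    with open(os.path.join(src, "notes.txt"), "w") as f:
        f.write("static\n")
    with open(os.path.join(src, "db.dump"), "w") as f:
        f.write("row 1\n")

    older = os.path.join(base, "snap.older")
    newer = os.path.join(base, "snap.newer")
    take_snapshot(src, None, older)

    with open(os.path.join(src, "db.dump"), "a") as f:
        f.write("row 2\n")              # the "database" grows
    take_snapshot(src, older, newer)

    same_notes = os.path.samefile(os.path.join(older, "notes.txt"),
                                  os.path.join(newer, "notes.txt"))
    same_db = os.path.samefile(os.path.join(older, "db.dump"),
                               os.path.join(newer, "db.dump"))
    shutil.rmtree(base)
    return same_notes, same_db


if __name__ == "__main__":
    print(demo())
```

The whole scheme rests on os.link() succeeding on the snapshot volume, so the sketch only runs on a file system that supports hard links.
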
The hard link mechanism won't work on either NTFS or FAT*, which makes it unusable for either Windows clients (being backed up) or storage.

rdiff-backup, as the name suggests, is a backup mechanism that uses diffs. Specifically, it uses the rsync algorithm to calculate deltas (rdiff) and uses these deltas to build backup histories. Operation is more like Time Machine: each run adds new deltas to the history until you run out of space (at which point the whole thing falls apart) or you invoke a dedicated cleanup run to prune based on relative or absolute time or number of backup runs. As with rsnapshot, sync is not backup but a history of snapshots is close enough.

There is practically no setup with rdiff-backup. Everything is command line arguments or external files (e.g., exclude lists) named in the arguments. This makes a backup script literally a sequence of rdiff-backup commands.

As I noted in the introduction, rdiff-backup is smart about escaping characters that are prohibited on target file systems. It also maintains a log of file ownerships and attributes including NTFS ACLs. That's a huge win for disaster recovery.

Another win is that because it's based on deltas, and those deltas are compressed, it is vastly more efficient for continuously growing files like databases and logs and VM images. Since the rdiff algorithm is based on rsync it doesn't matter if the files are text or binary data. It's all just bits to rdiff.

Now the bad. The big one is that it isn't so obvious how to find a specific file at a specific date and time. Only the most recent backup run is in the target directory. All of the compressed deltas are stored in a subdirectory (rdiff-backup-data) under the target. Getting at those requires invoking the rdiff-backup command.

rdiff-backup runs are also slower than comparable rsnapshot runs. Calculating and compressing deltas is more CPU intensive than GNU cp and rsync runs. rdiff-backup's efficiency comes at a price.

There they are.
Two very different backup systems built on the same rsync algorithm. -- Rich P.
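
For reference, the rdiff-backup operations mentioned above (a backup run with an exclude list, listing increments, restoring a file as of a given time, and a cleanup run) might be scripted roughly as follows. This is a sketch only: the paths and the "3D" and "8W" time specs are made-up examples, and it uses the long-standing option-style command line, so check the man page for your version before relying on it. By default it only prints the commands rather than running them.

```python
import shlex
import subprocess


def backup_cmd(source, dest, exclude_list=None):
    """Back up `source` into `dest`, optionally with an exclude file list."""
    cmd = ["rdiff-backup"]
    if exclude_list:
        cmd += ["--exclude-filelist", exclude_list]
    return cmd + [source, dest]


def list_increments_cmd(dest):
    """Show which backup sessions are available in `dest`."""
    return ["rdiff-backup", "--list-increments", dest]


def restore_cmd(when, backed_up_path, restore_to):
    """Restore one file or directory as it was at `when` (e.g. "3D" = 3 days ago)."""
    return ["rdiff-backup", "--restore-as-of", when, backed_up_path, restore_to]


def prune_cmd(dest, older_than):
    """Drop increments older than `older_than`; --force allows removing several at once."""
    return ["rdiff-backup", "--remove-older-than", older_than, "--force", dest]


def run(cmd, dry_run=True):
    """Print the command; execute it only when dry_run is False."""
    print(shlex.join(cmd))
    if not dry_run:
        subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # All paths below are hypothetical.
    dest = "/mnt/backup/home"
    run(backup_cmd("/home", dest, exclude_list="/etc/backup/exclude.txt"))
    run(list_increments_cmd(dest))
    run(restore_cmd("3D", dest + "/alice/report.odt", "/tmp/report.odt"))
    run(prune_cmd(dest, "8W"))
```

The restore line is the part that has no direct file-browsing equivalent: older versions come back out through the rdiff-backup command rather than by copying from a dated directory.
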