Thanks for the descriptions, Rich.

I did not find configuring rsnapshot tedious, and it is reasonably well documented. One issue I had, which you mention, is the scheduling in cron. At work we had a WD MyBook, which is a very, very slow device. Our first backup took days. (The WD was connected to the same switch as our NAS.) I also had to schedule the rsnapshot backups along with an offsite backup to our New York office.

The one thing I really like about rsnapshot is that it is a snapshot, so if someone trashes a file, it can normally be retrieved quickly. Also, when I update my Linux system, I first make sure that my most recent 'hourly' snapshot is current.

Also, rsnapshot can be used for Windows systems. If rsync is run in a Unix-like environment such as Cygwin, you do get the advantage of hard links.

On 12/02/2013 10:26 PM, Richard Pieri wrote:
> I've been using rsnapshot for several years now and I'm reasonably
> familiar with it. It was recently suggested to me to use rdiff-backup
> to copy files to a FAT32 file system because it is aware of FAT32 and
> exFAT file name restrictions. Since then I've been experimenting with
> rdiff-backup. Here are some of the high and low points of the two.
>
>
> rsnapshot is, as the name suggests, a snapshot system. It uses a
> combination of GNU cp's hard link directory replication and rsync
> itself to maintain time-based snapshots. It functions similarly to
> Apple's Time Machine with one notable difference. Where Time Machine's
> snapshots run back forever until disk runs out then the oldest are
> pruned to make room, rsnapshot's snapshots are rotated at fixed
> points: hourly, daily, weekly, monthly, yearly with pruning managed by
> a retention policy. While I've repeatedly stated -- and still maintain
> -- that sync is not backup, maintaining many sync-based snapshots is
> close enough for some uses.
> When you have many users who want to be
> able to pluck single files from arbitrary times out of a backup system
> is when rsnapshot shines.
>
> There are two big drawbacks to rsnapshot. The first is setup. It's
> tedious. You need to configure the increments and retention in a
> configuration file. You need to match up the increments with
> associated cron jobs. And you need to make sure that the cron jobs are
> staggered so that they don't step on each other. rsnapshot is smart
> enough not to let that be destructive but it can mean missing snapshot
> runs and that's not good for a backup system.
>
> The second is that it is terrible for things like databases that grow
> forever. Each run will copy an entire database dump or log file or
> whatever which can lead to massively inflated disk usage.
>
> The third -- okay, three big drawbacks -- is that it only works on
> Unix file systems and their network equivalents. The hard link
> mechanism won't work on either NTFS or FAT* which makes it unusable
> for either Windows clients (being backed up) or storage.
>
>
> rdiff-backup, as the name suggests, is a backup mechanism that uses
> diffs. Specifically, it uses the rsync algorithm to calculate deltas
> (rdiff) and uses these deltas to build backup histories. Operation is
> more like Time Machine: each run adds new deltas to the history until
> you run out of space (at which point the whole thing falls apart) or
> you invoke a dedicated cleanup run to prune based on relative or
> absolute time or number of backup runs. As with rsnapshot, sync is not
> backup but a history of snapshots is close enough.
>
> There is practically no setup with rdiff-backup. Everything is command
> line arguments or external files (e.g., exclude lists) noted in the
> arguments. This makes a backup script literally a sequence of
> rdiff-backup commands. As I noted in the introduction, rdiff-backup is
> smart about escaping characters that are prohibited on target file
> systems.
> It also maintains a log of file ownerships and attributes
> including NTFS ACLs. That's a huge win for disaster recovery.
>
> Another win is that because it's based on deltas, and those deltas are
> compressed, it is vastly more efficient for continuously growing files
> like databases and logs and VM images. Since the rdiff algorithm is
> based on rsync it doesn't matter if the files are text or binary data.
> It's all just bits to rdiff.
>
> Now the bad. The big one is that it isn't so obvious how to find a
> specific file at a specific date and time. Only the most recent backup
> run is in the target directory. All of the compressed deltas are
> stored in a subdirectory under the target. Getting at those requires
> invoking the rdiff-backup command.
>
> rdiff-backup runs are slower than comparable rsnapshot runs.
> Calculating and compressing deltas is more CPU intensive than GNU cp
> and rsync runs. rdiff-backup's efficiency comes at a price.
>
>
> There they are. Two very different backup systems built on the same
> rsync algorithm.
>

--
Jerry Feldman <gaf at blu.org>
Boston Linux and Unix
PGP key id:3BC1EB90
PGP Key fingerprint: 49E2 C52A FC5A A31F 8D66 C0AF 7CEA 30FC 3BC1 EB90
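
The hard-link snapshot idea discussed in this thread can be illustrated with a short Python sketch. This is not rsnapshot's code, only a rough model of the mechanism under stated assumptions: `snapshot` stands in for a `cp -al` style hard-link replication, and `update_file` stands in for rsync's default behaviour of writing a new file and renaming it over the old one. The function names, the `hourly.0`/`hourly.1` directory names, and the file names are all made up for illustration. It needs a file system that supports hard links.

```python
import os
import tempfile


def snapshot(source, dest):
    """Replicate the tree `source` into `dest` using hard links, not copies."""
    for root, _dirs, files in os.walk(source):
        rel = os.path.relpath(root, source)
        target_dir = os.path.normpath(os.path.join(dest, rel))
        os.makedirs(target_dir, exist_ok=True)
        for name in files:
            # Both names now point at the same inode; no data is duplicated.
            os.link(os.path.join(root, name), os.path.join(target_dir, name))


def update_file(path, data):
    """Write a new file and rename it over `path`, leaving other links intact."""
    tmp = path + ".tmp"
    with open(tmp, "w") as fh:
        fh.write(data)
    os.replace(tmp, path)


def demo():
    with tempfile.TemporaryDirectory() as base:
        newest = os.path.join(base, "hourly.0")  # hypothetical snapshot names
        older = os.path.join(base, "hourly.1")
        os.makedirs(newest)
        for name, text in (("unchanged.txt", "same"), ("changed.txt", "old")):
            with open(os.path.join(newest, name), "w") as fh:
                fh.write(text)

        snapshot(newest, older)  # keep the previous state as hard links
        update_file(os.path.join(newest, "changed.txt"), "new")  # the "sync" step

        # Unchanged files are one inode shared by both snapshots.
        shared = os.path.samefile(
            os.path.join(newest, "unchanged.txt"),
            os.path.join(older, "unchanged.txt"),
        )
        with open(os.path.join(older, "changed.txt")) as fh:
            old_text = fh.read()
        with open(os.path.join(newest, "changed.txt")) as fh:
            new_text = fh.read()
        return shared, old_text, new_text


if __name__ == "__main__":
    print(demo())
```

The sketch also shows why a file that changes on every run, such as a growing database dump, costs a full extra copy per snapshot: a changed file gets a new inode each time, and only unchanged files are shared.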
BLU is a member of BostonUserGroups
We also thank MIT for the use of their facilities.