Backing up sparse files ... VM's and TrueCrypt ... etc

Edward Ned Harvey blu-Z8efaSeK1ezqlBn2x/YWAg at public.gmane.org
Sun Feb 21 08:51:08 EST 2010


> This hardly seems like a win to have to pass over files twice, first to
> see if they end up taking up more space than the OS reports, and a
> second time to "compact" the strings of zeros. You might as well just
> use a compression filter (gzip, bzip2), which can handle not only
> strings of zeros, but anything else that repeats, and tokenize it on
> the fly in a single pass.

>
> This suggests that the --sparse option is effectively obsolete, given
> the modern practice of almost always compressing tar archives.

 

Exactly what I was just about to say, after reading that man page.  In
short:

- Never use --sparse when creating an archive that will be compressed.
  It's pointless, and it doubles the time needed to create the archive.

- Yes, use --sparse during extraction, if the contents contain long runs
  of zeros and you want the files restored to a sparse state.

 

Thanks for looking up that detail.  That explains why I had such poor
performance with tar on Windows: not because it was Windows or Cygwin, but
because I was using --sparse and gzip together during archive creation.

 

 

> I'd say tar doesn't bring anything useful to this problem, given the
> above info. If you only have one sparse file, you can get the same
> benefit by simply doing:
>
> gzip -c --rsyncable sparse_file > desparsed_file

 

One error in your thinking:

 

The man page's claim that "using '--sparse' is not needed on extraction" is
misleading.  It's technically true that extraction succeeds without it, but
you do need it if you want the files written back to disk as sparse files.

 

This, I believe, is the value-add that tar brings to sparse-file backup and
restore, compared to straight-up gzip.
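A quick way to see the difference: round-tripping through gzip alone brings
the data back intact, but every zero byte gets physically written, so the
restored file is no longer sparse.  A sketch, again assuming GNU tools on a
filesystem with hole support, with placeholder file names:

```shell
# 20 MB apparent size, ~0 blocks allocated on disk.
truncate -s 20M sparse_file

# gzip round trip: identical data, but gunzip writes out all the zeros,
# so the restored copy really occupies ~20 MB of disk.
gzip -c sparse_file > sparse_file.gz
gzip -dc sparse_file.gz > restored_plain

# The contents match, but the allocated block counts do not.
cmp sparse_file restored_plain
du -k sparse_file restored_plain
```

Only tar's extract-time --sparse handling (or a post-processing step such as
"cp --sparse=always") gets the holes back.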
