Backing up sparse files ... VM's and TrueCrypt ... etc
Edward Ned Harvey
blu-Z8efaSeK1ezqlBn2x/YWAg at public.gmane.org
Sun Feb 21 08:51:08 EST 2010
> This hardly seems like a win to have to pass over files twice, first to
> see if they end up taking up more space than the OS reports, and a
> second time to "compact" the strings of zeros. You might as well just
> use a compression filter (gzip, bzip2), which can handle not only
> strings of zeros, but anything else that repeats, and tokenize it on
> the fly in a single pass.
>
> This suggests that the --sparse option is effectively obsolete, given
> the modern practice of almost always compressing tar archives.
Exactly what I was just about to say ... after reading that man page. In
short:
  * Never use --sparse when creating an archive that is compressed.
    It's pointless, and it doubles the time needed to create the archive.
  * Do use --sparse during extraction if the contents contain long runs
    of zeros and you want the files restored to a sparse state.
Thanks for looking up that detail. That explains why I had such poor
performance with tar on Windows: not because it was Windows or Cygwin, but
because I was using --sparse and gzip together during archive creation.
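
For what it's worth, here is a rough sketch of why the compression filter alone already deals with long runs of zeros in one pass. It pipes a stream of zeros through gzip and counts the bytes that come out. The function name zero_run_gzip_size is made up for this illustration, and it assumes a head that accepts -c (GNU and BSD both do).

```shell
# Print how many bytes gzip emits for a run of N zero bytes.
# zero_run_gzip_size is a hypothetical helper used only for this sketch.
zero_run_gzip_size() {
    # head streams N zero bytes, gzip compresses them as they arrive,
    # and wc counts the compressed bytes. Nothing is read twice.
    head -c "$1" /dev/zero | gzip -c | wc -c
}

# 16 MiB of zeros in; the output is typically well under 1% of that.
# The exact number depends on the gzip build and compression level.
zero_run_gzip_size 16777216
```

The same thing happens to the zero-filled regions of a VM image or container file when the whole file is piped through gzip, which is why a separate sparseness scan adds nothing to the archive size.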
> I'd say tar doesn't bring anything useful to this problem, given the
> above info. If you only have one sparse file, you can get the same
> benefit by simply doing:
>
> gzip -c --rsyncable sparse_file > desparsed_filed
One error in your thinking:
The man page's statement that "using '--sparse' is not needed on extraction"
is misleading. It's technically true, in that you don't need it to extract,
but you do need it if you want the files to be extracted sparsely.
This, I believe, is the value-add that tar brings to sparse-file backup and
restore, compared to straight-up gzip.
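
Whichever tool does the restore, it's worth checking whether the file really came back sparse. A minimal sketch, assuming GNU coreutils stat (the -c format flags differ on BSD): compare the file's apparent length with the space actually allocated to it. The helper name size_report is hypothetical.

```shell
# Report a file's apparent size, its allocated size, and whether it looks sparse.
# size_report is a hypothetical helper; it assumes GNU coreutils stat.
size_report() {
    apparent=$(stat -c %s "$1")      # file length in bytes
    blocks=$(stat -c %b "$1")        # number of blocks allocated
    blksz=$(stat -c %B "$1")         # size in bytes of each reported block
    allocated=$((blocks * blksz))
    if [ "$allocated" -lt "$apparent" ]; then
        state=sparse
    else
        state=dense
    fi
    echo "apparent=$apparent allocated=$allocated $state"
}

# Demo: a file with an 8 MiB apparent size and no data written to it.
# Whether it is reported as sparse depends on the filesystem.
demo=$(mktemp)
truncate -s 8388608 "$demo"
size_report "$demo"
rm -f "$demo"
```

Running size_report on a restored VM image or TrueCrypt container shows right away whether the holes survived the round trip.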