bzip vs. gzip vs. zip, was Re: security through obscurity

On Sat, 14 Feb 2004, Gregory Boyce wrote:

> Big assumption there.  From what I've seen, bzipped tarballs seem to
> have a much greater compression ratio than zip files.

Depends on the data being compressed -- you really have to test it on
your own data. In the case of the Windows source tree, I can only
speculate based on what I've seen for source code in the past.

For some data sets, bzip compression does do significantly better than
gzip or zip files. In other cases, the gain is negligible, or the bzip
version can actually be bigger. 

The one consistent thing seems to be that bzip *always* takes longer to
both compress & decompress. Often, much longer. 

For a typical example of compressing source files, my tests suggest that
bzip might compress to a file 10% smaller than gzip, but it might
take as much as 50% longer to do the compression, and a similar amount of
additional time to decompress. I don't mean to suggest that these figures
are universal or anything, but in my experience these numbers are a decent
rule of thumb. 
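
If you want to check this against your own data, something like the
following rough Python sketch would do it. It is only an illustration:
the names measure() and compare() are made up for this example, the
sample input is a stand-in for real files, and it times the gzip and bz2
modules from the Python standard library in memory rather than the
command-line tools, so the absolute numbers will not match what you see
from the shell.

```python
import bz2
import gzip
import time


def measure(name, compress, decompress, data):
    """Return size and timing figures for one compressor on one input."""
    start = time.perf_counter()
    packed = compress(data)
    compress_secs = time.perf_counter() - start

    start = time.perf_counter()
    unpacked = decompress(packed)
    decompress_secs = time.perf_counter() - start

    # A comparison is meaningless if the round trip does not restore the input.
    if unpacked != data:
        raise ValueError(f"{name}: round trip did not restore the input")

    return {
        "name": name,
        "size": len(packed),
        "ratio": len(packed) / len(data) if data else 0.0,
        "compress_secs": compress_secs,
        "decompress_secs": decompress_secs,
    }


def compare(data):
    """Run gzip and bzip2 over the same bytes and return both results."""
    return [
        measure("gzip", gzip.compress, gzip.decompress, data),
        measure("bzip2", bz2.compress, bz2.decompress, data),
    ]


# Stand-in input (an assumption); replace with bytes read from your own files.
sample = b"int main(void) { return 0; }\n" * 2000

for result in compare(sample):
    print(
        f"{result['name']:6s} {result['size']:8d} bytes "
        f"({result['ratio']:.1%} of original), "
        f"compress {result['compress_secs']:.4f}s, "
        f"decompress {result['decompress_secs']:.4f}s"
    )
```

A single run on a small input is noisy, so for anything you care about,
feed it a realistic amount of data and repeat the measurement a few times.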

The question then is whether possibly marginally better compression is
worth the extra time. It might be, but it's a tradeoff either way.

Chris Devers

