security through obscurity

James R. Van Zandt jrvz at comcast.net
Fri Feb 13 21:45:28 EST 2004


Derek Atkins <derek at ihtfp.com> writes:
>
>I have to wonder where they get the 40GB number -- that just doesn't
>make sense to me.  Honestly I think they are off by an order of
>magnitude somewhere.
>
>Let's assume they are correct in the estimation of 40 million lines
>of code.  Let's further assume that each line is fully 80 characters
>long.  40MM * 80 == 3.2E9 which is just over 3GB of storage (there's
>that order of magnitude).  Considering source code compresses fairly
>easily, I can certainly imagine a compression ratio of 5:1 to get
>down to a CD-rom sized 650MB.

Consider the Linux 2.6.1 kernel sources.  The .bz2 file length is
33240033 bytes, which uncompresses to 174358105 bytes of files (a
compression ratio of 5.2:1) or 5919671 lines (about 29.5 characters,
including the newline, per line).

Assuming the same ratios hold for Microsoft sources, 660 MB would
uncompress to about 3.4 GB, or 116 million lines.  That's
substantially more than the "entire 40 million lines of code in the
Windows operating system".
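
For anyone who wants to check the arithmetic, here is a minimal sketch
in Python.  The byte and line counts are the ones quoted above for the
2.6.1 kernel; the 660 MB archive size (taken as 660e6 bytes) and the
function and variable names are assumptions made for illustration only.
Carrying the unrounded ratios through gives about 3.46 GB and roughly
117.5 million lines, in line with the rounded estimate.

```python
# Figures quoted in the message for the Linux 2.6.1 kernel sources.
BZ2_BYTES = 33_240_033       # size of the .bz2 archive
SOURCE_BYTES = 174_358_105   # size of the uncompressed files
SOURCE_LINES = 5_919_671     # total line count


def extrapolate(archive_bytes):
    """Scale the kernel's ratios to an archive of a different size.

    Assumes the other source tree compresses like the kernel sources
    and has the same average line length.
    """
    ratio = SOURCE_BYTES / BZ2_BYTES             # compression ratio
    bytes_per_line = SOURCE_BYTES / SOURCE_LINES
    uncompressed = archive_bytes * ratio
    lines = uncompressed / bytes_per_line
    return ratio, bytes_per_line, uncompressed, lines


# Assumption: "660 MB" means 660 million bytes.
ratio, bytes_per_line, uncompressed, lines = extrapolate(660e6)

print(f"compression ratio : {ratio:.2f}:1")
print(f"bytes per line    : {bytes_per_line:.2f}")
print(f"uncompressed size : {uncompressed / 1e9:.2f} GB")
print(f"estimated lines   : {lines / 1e6:.1f} million")
```

Either way the estimate lands well above 40 million lines.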

>So this could very well be the full source code in a compressed
>tarfile or a zipfile.

I guess it's big enough for two operating systems (NT + 2000) even if
they don't share any code.

	 - Jim Van Zandt


