Home
| Calendar
| Mail Lists
| List Archives
| Desktop SIG
| Hardware Hacking SIG
Wiki | Flickr | PicasaWeb | Video | Maps & Directions | Installfests | Keysignings Linux Cafe | Meeting Notes | Linux Links | Bling | About BLU |
Hi Mark, On Sun, 2004-08-15 at 15:23, markw at mohawksoft.com wrote: > Having read the link you submitted, I think there is a common > misunderstanding between the people who think the repeatable compression > is possible and those who don't. This comment reminds me of the old joke, "There are 3 kinds of people in this world, those who understand math and those who don't." > Something like this: > A segment is a run within the stream that can be represented by a compressor. > The stream is a data stream. > > while(stream) > { > best_compressor = NULL; > best_seed = NULL; > best_count = 0; > foreach(compressor) > { > foreach(seed) > { > bytes_represented = compare(stream, compressor, seed); > if(bytes_represented > best_count) > { > best_compressor = compressor; > best_seed = seed; > best_count = bytes_represented; > } > } > } > if(best_count > segment_overhead) > { > put(best_compressor, best_seed, best_count); > stream += best_count; > } > else > { > put(NULL, stream, 1); > stream++; > } > } > > As you can see, with any interesting number of compressors and interesting > range of seeds, this can be HUGELY expensive. It is an interesting theory. > Numerically, it sounds sort of fishy, but if you can get lucky enough, it > works. Since its not relevant to this discussion, lets ignore the origin (whether its a PRNG [1] or some other source) of your n-bit streams and concentrate on the important aspect which is the encoding. Lets imagine that you have enough of these bit streams (in your case, PRNGs) to represent every possible combination of n bits (which is 2^n streams). You then need a way of *remembering* which one of the 2^n streams should be used when you decompress. And to uniquely represent each of the 2^n streams, you will need, at an absolute minimum, an n-bit index. So, in the general case of trying to compress every possible input stream, what have you gained? Absolutely nothing. The above argument can be extended to variable-length streams and I leave that as an exercise for you to work out. Its interesting to note that essentially all of the great (fundamental) discoveries in math and science tend to be "limiting" in the sense that they place some constraint on what can or can't be achieved. For instance: - speed of light (the propagation of information) - 2nd law of thermodynamics (the "arrow of time") - Cramer-Rao Lower Bound (statistics) - Planck constant (Heisenberg and the ability to resolve space-time) - Information Theory (Shannon and maximum transmission rates) These are not the nonsense limits ("couldn't travel faster than 60 MPH") that you mention. In particular, the last (Shannon) is one that you really ought to look into since it should be required reading for anyone playing with compression techniques. Ed [1] PRNG = pseudo-random number generator -- Edward H. Hill III, PhD office: MIT Dept. of EAPS; Rm 54-1424; 77 Massachusetts Ave. Cambridge, MA 02139-4307 emails: eh3 at mit.edu ed at eh3.com URLs: http://web.mit.edu/eh3/ http://eh3.com/ phone: 617-253-0098 fax: 617-253-4464
BLU is a member of BostonUserGroups | |
We also thank MIT for the use of their facilities. |