Boston Linux & UNIX was originally founded in 1994 as part of The Boston Computer Society. We meet on the third Wednesday of each month at the Massachusetts Institute of Technology, in Building E51.

BLU Discuss list archive


[Discuss] vnc



On Thu, Aug 28, 2014 at 6:58 PM, Dan Ritter <dsr at randomstring.org> wrote:
> Suppose we play the game, and I think of a phrase, and you say
> "the magic word is squeamish ossifrage", and purely by chance,
> that is what I was thinking of. Is the entropy zero?

If the game is played among cryptographers, that's such an obvious
guess that it's maybe 2 bits -- Phrase might have been
   "squeamish ossifrage"
   (with or without the "the magic word is" prefix)
or
   (anything else)
so two forks in the tree to get there.  2 bits.

(An unspoken assumption: in spoken guessing,
spelling/caps/punctuation don't count.)

In the more general world of Google suggestions, it's a bit more than 2
bits, as it's still only the second choice even after typing the s:
    the magic word is s...orry not please
and 'the magic ... word' isn't even top four
   the magic ...(school bus, flute, tree house, of ordinary days),
nor is 'the m..agic' top four
    the m..(m|ountain, m|oth, m|aze runner, m|ountain and the viper).
so each of those choices is > 2 bits, 5 or maybe even 6 bits combined.
I count 9-10 bits?

(before orthographic diddles at +1b each *yawn*.)

And Google now lists as its top choice
    battery ho..rse staple
Publicity has ruined that one, but most other combinations randomly
drawn from GSL or Up Goer Five dialects retain their pitiful entropy.

Use a bigger wordlist than GSL/NGSL, e.g. /usr/share/dict/words,
and get a bit more entropy per word:
   prospectus ampoule rajahs battlefields
   warpaths upraising recheck's shimmed
   pearl's larger begin Learjet
   ventilate vacationing ponder copilots
   rubric's centigram pineapples outpouring's
   borderlands rebound's demagogy Delawareans
   parka lancet rulings lollypops
For my words file with 99171 lines,
log2 of #lines is 16.6 bits.
That's more than 48 bits for 3 words, more than 64 bits for 4 words.
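
A minimal sketch of that arithmetic in Python, assuming each word is
drawn uniformly and independently from the list. The function names are
made up for illustration, and 99171 is just the line count quoted above;
duplicates or near-duplicates in a real file would lower the true figure
a little.

```python
import math
import secrets


def bits_per_word(wordlist_size):
    """Entropy in bits of one uniform draw from a list of this size."""
    return math.log2(wordlist_size)


def passphrase_bits(wordlist_size, n_words):
    """Entropy in bits of n_words independent uniform draws."""
    return n_words * bits_per_word(wordlist_size)


def random_passphrase(words, n_words=4):
    """Pick n_words at random using the OS CSPRNG (secrets), not random."""
    return " ".join(secrets.choice(words) for _ in range(n_words))


print(round(bits_per_word(99171), 1))        # 16.6
print(round(passphrase_bits(99171, 3), 1))   # 49.8
print(round(passphrase_bits(99171, 4), 1))   # 66.4
```

To draw an actual phrase, something like
random_passphrase(open("/usr/share/dict/words").read().split())
should work wherever that file exists.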

Although rising word length will drag entropy per letter back to the
same range as before!
Mean word size is 8.5 chars, not counting the newlines;
16.6/8.5 is back to about 1.9 bits per char!
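
The per-character figure can be checked the same way. A rough sketch,
assuming one word per line with newlines already stripped, and counting
characters rather than bytes; the function name is hypothetical.

```python
import math


def bits_per_char(words):
    """log2(number of words) divided by mean word length in characters."""
    mean_len = sum(len(w) for w in words) / len(words)
    return math.log2(len(words)) / mean_len


# Toy list: 4 words of 2 chars each gives log2(4) / 2 = 1.0 bit per char.
print(bits_per_char(["ab", "cd", "ef", "gh"]))   # 1.0
```

Feeding it the stripped lines of a real words file should land near the
16.6/8.5 ratio above, give or take how the file handles possessives and
non-ASCII entries.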

(On this Ubuntu system, gzip compresses the 'words' file to 20.6 bits
per 'word', which suggests that even though gzip doesn't know English,
it gets pretty close.)
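
For anyone wanting to repeat the gzip comparison, here is a rough sketch
using Python's gzip module in place of the command-line tool. It assumes
the list is joined one word per line, as in the file; compression level
and header overhead may differ slightly from a command-line run, so the
exact figure probably will too. The function name is hypothetical.

```python
import gzip


def gzip_bits_per_word(words):
    """Compressed size of the newline-joined list, in bits, per word."""
    raw = ("\n".join(words) + "\n").encode("utf-8")
    return 8 * len(gzip.compress(raw)) / len(words)
```

Note this measures how compactly the whole list can be stored, which is
not quite the same quantity as the log2(#lines) cost of picking one
word, so the comparison is only a rough one.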

-- 
Bill Ricker
bill.n1vux at gmail.com
https://www.linkedin.com/in/n1vux



BLU is a member of BostonUserGroups
We also thank MIT for the use of their facilities.



Boston Linux & Unix / webmaster@blu.org