[Discuss] vnc

Bill Ricker bill.n1vux at gmail.com
Thu Aug 28 20:00:38 EDT 2014


On Thu, Aug 28, 2014 at 6:58 PM, Dan Ritter <dsr at randomstring.org> wrote:
> Suppose we play the game, and I think of a phrase, and you say
> "the magic word is squeamish ossifrage", and purely by chance,
> that is what I was thinking of. Is the entropy zero?

If the game is played among cryptographers, that's such an obvious
guess that it's maybe 2 bits -- Phrase might have been
   "squeamish ossifrage"
   (with or without the "the magic word is " prefix)
or
   (anything else)
so two forks in the tree to get there.  2 bits.

(An unspoken assumption: in spoken guessing,
spelling/caps/punctuation don't count.)

In the more general world of Google suggestions, it's a bit more than 2
bits, as it's still second choice after even the s:
    the magic word is s...orry not please
and 'the magic ... word' isn't even top four
   the magic ...(school bus, flute, tree house, of ordinary days),
nor is 'the m..agic' top four
    the m..(m|ountain, m|oth, m|aze runner, m|ountain and the viper).
so each of those choices is > 2 bits, 5 or maybe even 6 bits combined.
I count 9-10 bits?

(before orthographic diddles at +1b each *yawn*.)

And google now lists top choice
    battery ho..rse staple
Publicity has ruined that one, but most other combinations randomly
drawn from GSL or Up Goer Five dialects retain their pitiful entropy.

Use a bigger wordlist than GSL/NGSL eg  /usr/share/dict/words
and get a bit more entropy per word,
   prospectus ampoule rajahs battlefields
   warpaths upraising recheck's shimmed
   pearl's larger begin Learjet
   ventilate vacationing ponder copilots
   rubric's centigram pineapples outpouring's
   borderlands rebound's demagogy Delawareans
   parka lancet rulings lollypops
For my words file with 99171 lines,
log2 of #lines is 16.6 bits.
That gives more than 48 bits for 3 words and more than 64 bits for 4 words.
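
The arithmetic above can be sketched in Python. This is a minimal
illustration, assuming each word is drawn uniformly and independently
from the list; the function names are hypothetical, and only the 99171
line count comes from the words file mentioned above.

```python
import math
import secrets

def bits_per_word(wordlist_size):
    """Entropy in bits of one word drawn uniformly from the list."""
    return math.log2(wordlist_size)

def passphrase_bits(wordlist_size, n_words):
    """Entropy of n_words independent uniform draws."""
    return n_words * bits_per_word(wordlist_size)

def make_passphrase(words, n_words=4):
    """Pick n_words with a cryptographically strong RNG (hypothetical helper)."""
    return " ".join(secrets.choice(words) for _ in range(n_words))

size = 99171  # line count of the words file quoted above
print(round(bits_per_word(size), 1))       # 16.6
print(passphrase_bits(size, 3) > 48)       # True
print(passphrase_bits(size, 4) > 64)       # True
```

The figures only hold if the selection really is random; a phrase a
human picked from the same list would carry less.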

Although rising word length will drag entropy per letter back to the
same range as before!
Mean word size is 8.5 chars, after subtracting #Newlines;
16.6/8.5 is back to 1.9 bits per char!
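
A rough way to redo that per-character estimate for any wordlist,
sketched in Python. The helper names are hypothetical, and len()
counts characters rather than bytes, so the mean may differ slightly
from a byte count on lists with non-ASCII words.

```python
import math

def mean_word_length(words):
    """Average characters per word, newlines already stripped."""
    return sum(len(w) for w in words) / len(words)

def bits_per_char(words):
    """Bits per uniformly drawn word, spread over the mean word length."""
    return math.log2(len(words)) / mean_word_length(words)

def load_words(path="/usr/share/dict/words"):
    """Read one word per line, skipping blank lines (hypothetical helper)."""
    with open(path, encoding="utf-8", errors="replace") as fh:
        return [line.rstrip("\n") for line in fh if line.strip()]

# Tiny stand-in list: 4 words is 2 bits, mean length is 5 characters.
demo = ["ab", "abcd", "abcdef", "abcdefgh"]
print(bits_per_char(demo))    # 0.4
print(round(16.6 / 8.5, 2))   # 1.95, the figure quoted above
```

On a real system, bits_per_char(load_words()) should land near the
number above, depending on which dictionary package is installed.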

(On this Ubuntu, gzip compresses 'words' file to 20.6 bits per 'word',
which says even though gzip doesn't know English it gets pretty
close.)
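
One way to get a comparable gzip figure is sketched below, assuming
Python's gzip module at its default level, which may not match the
gzip command-line output byte for byte. The function name is
hypothetical.

```python
import gzip

def gzip_bits_per_word(data):
    """Compressed size in bits divided by the number of lines in data."""
    n_lines = data.count(b"\n")
    if n_lines == 0:
        raise ValueError("no newline-terminated lines in input")
    return 8 * len(gzip.compress(data)) / n_lines

def gzip_bits_per_word_file(path="/usr/share/dict/words"):
    """Same measurement on a file read as raw bytes."""
    with open(path, "rb") as fh:
        return gzip_bits_per_word(fh.read())

# Highly repetitive input compresses far below its raw 48 bits per line.
sample = b"hello\n" * 1000
print(gzip_bits_per_word(sample) < 48)   # True
```

Comparing gzip_bits_per_word_file() with log2 of the line count shows
how far a general-purpose compressor is from the uniform-draw figure.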

-- 
Bill Ricker
bill.n1vux at gmail.com
https://www.linkedin.com/in/n1vux


