Boston Linux & UNIX was originally founded in 1994 as part of The Boston Computer Society. We meet on the third Wednesday of each month at the Massachusetts Institute of Technology, in Building E51.

BLU Discuss list archive


[Discuss] vnc => passphrase entropy

[changing subject line in case this continues further]

On Fri, Aug 29, 2014 at 11:14 AM, Edward Ned Harvey (blu)
<blu at> wrote:
> This would mean that each word in a sentence is 0.67 times as random as a perfectly random word.
> I don't buy it.
> I swear that measurement is grossly overestimated.


A sentence *making sense* necessarily means it lacks entropy compared
to a random draw from the poetry magnets / GSL / 'words' file of the same
number of words (or a draw of words of the same total characters).
     As a result, XKCD's infamous nonsense 'CORRECT HORSE BATTERY STAPLE'
had many more bits than a sentence of as many words or characters,
simply because it is NOT a sentence ... I say had, until it became a
trope; now that singular nonsense phrase has FEWER bits than any
non-cliché sentence. (But that doesn't affect unpublished random draws
from the same lexicon.)
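
For the random-draw case, the arithmetic can be sketched as follows. This
is only an illustration: it assumes each word is picked uniformly and
independently, and the 2048-word lexicon size is my assumption (chosen
because it works out to 11 bits per word). The function name is
hypothetical.

```python
import math

def random_draw_bits(lexicon_size: int, n_words: int) -> float:
    """Entropy in bits of n_words drawn uniformly and independently
    from a lexicon of lexicon_size words."""
    return n_words * math.log2(lexicon_size)

# Assumed lexicon of 2048 words, i.e. 11 bits per word.
print(random_draw_bits(2048, 4))  # 44.0
```

A sentence that makes sense can only come in under this figure, since
sense rules out most of the equally likely draws.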

Sentential lack of entropy can be demonstrated with Google search
prediction (in an incognito browser, to avoid the search-history
'bubble'), as in my previous 'contributions' to this thread, where I was
getting 10-16 bits for the whole sentence
"yo(ur)? (ma|mama|mom|momma|mother) wears (army|combat) boots"
and 9-11 bits for
"(the magic word( is|s are))? squeamish ossifrage":
hardly 11 bits per word!

A more rigorous demonstration would require instrumenting a
'Dissociated Press' Markov-chain class of Travesty generator to
dump its matrix after scanning a corpus for "(start|word):(word|end)"
successor frequencies, to determine the conditional probability of each
word's appearance in *local* context.  A lower (better, tighter)
upper bound on the sentence entropy is the negative log2 of the product
of the Markov transition probabilities of that sentence.
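
A minimal sketch of that measurement, assuming a first-order (bigram)
model and a four-sentence toy corpus that I made up for illustration; a
real run would need a large corpus and some smoothing for unseen
transitions. The names train_bigrams and sentence_bits are hypothetical,
not taken from any existing travesty generator.

```python
import math
from collections import Counter, defaultdict

START, END = "<s>", "</s>"

def train_bigrams(corpus):
    """Count (start|word) -> (word|end) successor frequencies."""
    counts = defaultdict(Counter)
    for sentence in corpus:
        tokens = [START] + sentence.lower().split() + [END]
        for prev, nxt in zip(tokens, tokens[1:]):
            counts[prev][nxt] += 1
    return counts

def sentence_bits(counts, sentence):
    """Sum of -log2 p(next | prev) over the sentence, i.e. the negative
    log2 of the product of transition probabilities.
    Returns infinity if any transition was never seen."""
    tokens = [START] + sentence.lower().split() + [END]
    bits = 0.0
    for prev, nxt in zip(tokens, tokens[1:]):
        seen = counts[prev][nxt]
        if seen == 0:
            return math.inf
        total = sum(counts[prev].values())
        bits -= math.log2(seen / total)
    return bits

# Toy corpus (an assumption, for illustration only).
toy_corpus = [
    "your mother wears army boots",
    "your mother wears combat boots",
    "your father reads army regulations",
    "army regulations are long",
]
counts = train_bigrams(toy_corpus)
bits = sentence_bits(counts, "your mother wears army boots")
print(round(bits, 2))  # 3.58, i.e. log2(12)
```

Even on this tiny corpus the five-word sentence costs under 4 bits in
total, nowhere near 11 bits per word.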

This Markov entropy would still vastly over-estimate the entropy of
any (sensible, true) sentence, as the likelihood of X='boots'
depends not *just* on the p(X='boots' | 'army X') context, but on the
total context of 'Your mother wears army X'. The sequence 'army
regulations' may appear very commonly in the corpus and have high
probability / low entropy, but 'Your mother wears army regulations'
makes no sense.
    Context for sense even extends beyond the sentence to require
agreement with the topic and tone of larger discourse before *and
after* this one sentence (boots <=> insulting/hostile,
fatigues <=> factual, dress-uni, medals, ink <=> admiring).  Eliminating all
Markov transitions that destroy sense is classic Linguistic AI, and
would ruin the humor of the travesty generator, but would better
estimate the actual entropy ... it would be very low.
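
The 'army regulations' point can be sketched by conditioning on a longer
context. This is a toy illustration under my own assumptions: the
four-sentence corpus is made up, the function names are hypothetical, and
counting exact prefixes is only a crude stand-in for real sense-checking.

```python
import math
from collections import Counter

def successor_counts(corpus, context):
    """Count the words that directly follow the token sequence `context`."""
    ctx = context.lower().split()
    n = len(ctx)
    counts = Counter()
    for sentence in corpus:
        tokens = sentence.lower().split()
        for i in range(len(tokens) - n):
            if tokens[i:i + n] == ctx:
                counts[tokens[i + n]] += 1
    return counts

def surprisal_bits(corpus, context, word):
    """-log2 p(word | context); infinity if never seen after that context."""
    counts = successor_counts(corpus, context)
    if counts[word] == 0:
        return math.inf
    return -math.log2(counts[word] / sum(counts.values()))

# Toy corpus (an assumption, for illustration only).
corpus = [
    "your mother wears army boots",
    "your mother wears combat boots",
    "your father reads army regulations",
    "army regulations are long",
]

# Local context: 'regulations' is the likeliest successor of 'army'.
print(round(surprisal_bits(corpus, "army", "regulations"), 2))      # 0.58
# Full context: 'regulations' never follows 'your mother wears army'.
print(surprisal_bits(corpus, "your mother wears army", "regulations"))  # inf
```

With the longer context the only continuation ever seen is 'boots', so
that word contributes essentially zero bits.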

NET NET -- Sentences are poor secrets, even if not clichés /
quotations that are easily harvestable.
("11Saps,einc/qtaeh" would be at least as good a password as the above
sentence with same mnemonic.)

Question: do any of the offline password cracker suites have a Markov
sentence generator?
(If not, why not?)
(I'm pretty sure they already have a list of clichés/quotes for
pass-phrases. They should have harvested IMDB and Bartlett's quotes at
the very least!)

Bill Ricker
bill.n1vux at

BLU is a member of BostonUserGroups
We also thank MIT for the use of their facilities.
