From: Paul Rubin on
unruh <unruh(a)wormhole.physics.ubc.ca> writes:
> The user's best strategy is to choose the least likely password from the
> attacker's distribution, while the attacker's is to choose the most
> likely from the user's distribution. But neither actually does that.

That is a reasonable observation: the user may have more knowledge of
the attacker's distribution than vice versa. The relevant point is
that brute-force searching works best against lousy passwords that
weren't selected by any reasonable strategy.

We've been conflating two different attacks in this thread: 1) attacker
wants to guess Alice's password, where Alice is some specific user; 2)
attacker wants to get one or more valid passwords for system X (which
has thousands of users) and doesn't care whose passwords they get. Joseph
Ashwood's suggestion of a composite distribution has some validity for
attack 2, where it's enough to find the weakest passwords in the system.
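
To make attack 2 concrete, here is a rough simulation sketch (all the
figures are invented purely for illustration): even if most users pick
strong passwords, trying only a handful of common passwords against
every account harvests the weak ones cheaply.

import random

random.seed(1)
common = ["123456", "password", "qwerty", "letmein", "abc123"]
common_set = set(common)

# Hypothetical system: 10,000 accounts; 5% of users pick a common
# password, the rest pick something effectively unguessable.
accounts = [random.choice(common) if random.random() < 0.05
            else "strong%016x" % random.getrandbits(64)
            for _ in range(10_000)]

# Attack 2: try just the five common passwords against every account.
cracked = sum(pw in common_set for pw in accounts)
print("cracked %d of %d accounts with 5 guesses each"
      % (cracked, len(accounts)))
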
From: Bryan on
unruh wrote:
> It has been widely debated. There is a definition-- the log of the
> shortest program required to produce that sentence as output.

The log of the shortest program? There's the Shannon definition that a
message's entropy is the log of one over its probability, and there's
the Kolmogorov definition that the complexity of a string is the
length of the shortest program that outputs it (with no input). "The
log of the shortest program" sounds like an accidental conflation of
the two.

The entropy at issue here is Shannon's. Less likely pass-phrases have
more entropy.
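
As a quick illustration (a minimal sketch; the word-list figures are
just examples): the self-information of one message, in bits, is -log2
of its probability under whatever distribution the phrase was drawn
from, so a less likely phrase scores higher.

import math

def self_information_bits(p):
    """Shannon self-information of one message: -log2(p) bits."""
    return -math.log2(p)

# Five words drawn uniformly from a 7776-word list, versus a phrase
# the attacker would rank at probability 1/1000.
print(self_information_bits((1.0 / 7776) ** 5))   # about 64.6 bits
print(self_information_bits(1.0 / 1000))          # about 10.0 bits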

Kolmogorov complexity has come up in quite a few sci.crypt threads,
but I've never seen a cryptologic result from it. Are there any?


--
--Bryan Olson
From: Bryan on
Mok-Kong Shen wrote:
[...]
> > BTW, I remember to have seen (maybe in another group) the claim that
> > the measure of entropy only applies to a source of randomness and
> > hence any bit sequence of finite length couldn't be ascribed a value
> > of entropy. Is that o.k.? Any short argument to refute it?
[...]
>
> I would appreciate an answer to the 2nd question.

I've lost count of how many times you've asked trivial variations of
that same question over the years. The answer is still that a
message's entropy is the negative of the log of the message's
probability in the space of all possible messages. You can still find
it explained in Shannon's "Communication Theory of Secrecy Systems".


--
--Bryan
From: Mok-Kong Shen on
Bryan wrote:
> Mok-Kong Shen wrote:
> [...]
>>> BTW, I remember to have seen (maybe in another group) the claim that
>>> the measure of entropy only applies to a source of randomness and
>>> hence any bit sequence of finite length couldn't be ascribed a value
>>> of entropy. Is that o.k.? Any short argument to refute it?
> [...]
>>
>> I would appreciate an answer to the 2nd question.
>
> I've lost count of how many times you've asked trivial variations of
> that same question over the years. The answer is still that a
> message's entropy is the negative of the log of the message's
> probability in the space of all possible messages. You can still find
> it explained in Shannon's "Communication Theory of Secrecy Systems".

I am, as you already know, always ready to acknowledge that I am a
layman with poor knowledge/memory (clueless, etc.), but I am not sure
that the argument I remember having encountered is clearly countered
by what you wrote above. So let me try to reconstruct it (the
reconstruction might not match the original, but that doesn't matter
for discussion purposes):

Consider tossing a coin twice and getting the result 00. If one knows
(assumes) that the coin is fair, then one can say that the probability
of getting that result is 1/4. If one knows (assumes) that the coin is
heavily loaded towards 0, then one would say that the probability of
getting that result is close to 1. So the result "as such" (without
reference to the properties of the coin) doesn't uniquely correspond
to any probability value. On the other hand, the coin (the source of
randomness) does have a definite probability distribution.

To make a similar argument in the present case, one would argue like
this: Suppose the password is to be four lower-case characters and a
user gives 'mary'. Compare the two cases: (1) He uses a perfectly
random mechanism giving each letter a probability of 1/26, in which
case this password has a probability of occurring of (1/26)^4. (2) He
is a rather careless and lazy person and his new girlfriend is Mary,
in which case this password understandably has a fairly large
probability. So the password 'mary' "alone" cannot be ascribed any
probability value.
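
Putting rough numbers on that (a small sketch; the 1/20 figure in
case (2) is simply invented for illustration): the same string gets
very different surprisal values depending on which source model one
assumes.

import math

# Model (1): each of the four letters chosen uniformly from 26.
p_uniform = (1.0 / 26) ** 4
# Model (2): a careless-user model that, say, puts probability 1/20
# on "mary" outright.
p_careless = 1.0 / 20

print(-math.log2(p_uniform))   # about 18.8 bits -- 'mary' looks strong
print(-math.log2(p_careless))  # about 4.3 bits  -- 'mary' looks weak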

All this seems to argue for what I referred to previously, namely that
the entropy concept can only be applied to the "source" of randomness
and not to a finite string that one has at hand. As I said, I don't
claim that this is correct (it's not my argument!). I only want to
know how to clearly refute that argument, if it is indeed wrong.

M. K. Shen


From: Mok-Kong Shen on
jbriggs444 wrote:

> Suppose your strategy is to filter out all 26 passwords where the
> same character is repeated 10 times. Suppose the password
> generator otherwise has a 26 character alphabet and is generating
> 10 character passwords.
>
>
> By filtering the password generator's output, you reduce its entropy
> [very slightly] and improve the odds that an attacker who understands
> your password generation algorithm will be able to mount a brute
> force attack.
>
> You ease this attacker's work: the average number of guesses drops
> from (26^10+1)/2 down to (26^10-25)/2.
>
>
> Against an attacker who is starting at aaaaaaaaaa and working up
> in alphabetic order, the filter achieves no net change in the
> attacker's work factor!!! [you eliminated "zzzzzzzzzz" from the
> end of the list at the same time you eliminated "aaaaaaaaaa" from
> the front -- average linear search time is unchanged]
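
To make the quoted arithmetic concrete, here is a small sketch (just
the averages and the entropy loss, nothing beyond what the quote
states):

import math

ALPHABET = 26
LENGTH = 10
N = ALPHABET ** LENGTH          # all 10-character lower-case passwords

def rejected(pw):
    """The quoted filter: reject a password that is one character repeated."""
    return len(set(pw)) == 1

print(rejected("aaaaaaaaaa"), rejected("abcdefghij"))   # True False

removed = ALPHABET              # exactly 26 such passwords exist
remaining = N - removed

# Entropy of a uniform choice before and after filtering (bits):
print(math.log2(N), math.log2(remaining))   # loss is roughly 3e-13 bits

# Average guesses for an attacker enumerating candidates in a fixed order:
print((N + 1) / 2, (remaining + 1) / 2)     # differ by only 13 guesses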

This leads to the question of how to build a good filter, i.e. which
kinds of patterns are to be disallowed. Has that been well studied?

Thanks.

M. K. Shen