From: asli on
Hello all,

I want to calculate the strength of a password, but my question is
really about the entropy of its characters. I have a program that
calculates symbol frequencies: single characters, bigrams, and
word-starting and word-ending characters.

I want to calculate the entropy of a given password based on these
character probabilities.

I know that the entropy is defined as:
H(X) = - Sum_i [ P(x_i) * log P(x_i) ]
for a random variable X with n outcomes { x_i : i = 1, ..., n }.

If I want to calculate the entropy of a single character, how do I
use this formula? Can I use it to calculate the entropy of the
password "password" as:
H(password) = - (P(p)*log P(p) + P(a)*log P(a) + ... + P(d)*log P(d))
which would be equivalent to:
H(password) = H(p) + H(a) + ... + H(d)
That would mean I can create a table that stores the entropy of each
character, and whenever the user enters a password I would just look
up its characters in the table and sum their entropies to measure its
strength.

Is this the right way to calculate the entropy of a password?

But if I take conditional probabilities into account, the formula
gets more complicated:
H(password) = H(p) + H(a|p) + H(s|pa) + ... + H(d|passwor)
where H(y|x) is the conditional entropy of y given x.
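
To make the two options concrete, here is a minimal sketch. It assumes the quantity wanted for one specific password is its surprisal, -log2 P(password), under a character model, which drops the P(x_i) weights of the averaged formula above. The function names, the `floor` fallback for unseen characters, and the use of bigrams in place of the full prefix condition are assumptions made for illustration, not something established in this thread.

```javascript
// Surprisal (in bits) of an outcome that has probability p.
const surprisalBits = (p) => -Math.log2(p);

// Independent-character model: sum of -log2 P(c) over the characters.
// `probs` maps a character to its probability; `floor` is a hypothetical
// fallback probability for characters missing from the table.
function bitsIndependent(password, probs, floor = 1e-6) {
  let bits = 0;
  for (const c of password) bits += surprisalBits(probs[c] ?? floor);
  return bits;
}

// First-order approximation of the chain rule: the full condition
// P(c_i | c_0 ... c_{i-1}) is replaced by the bigram P(c_i | c_{i-1}).
// `firstProbs` maps a word-starting character to its probability and
// `condProbs[prev][next]` holds P(next | prev).
function bitsBigram(password, firstProbs, condProbs, floor = 1e-6) {
  if (password.length === 0) return 0;
  let bits = surprisalBits(firstProbs[password[0]] ?? floor);
  for (let i = 1; i < password.length; i++) {
    const row = condProbs[password[i - 1]] ?? {};
    bits += surprisalBits(row[password[i]] ?? floor);
  }
  return bits;
}

// Made-up probabilities, for illustration only.
console.log(bitsIndependent("aa", { a: 0.25 }));
console.log(bitsBigram("ab", { a: 0.5 }, { a: { b: 0.25 } }));
```

Under the independent model a lookup table of per-character values is enough; under the bigram model the table has to be indexed by the previous character as well.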

So which formula should I use to calculate the entropy?

I hope I am clear enough.

Thanks a lot in advance.

ASLI
From: unruh on
On 2009-12-27, asli <koksal.a(a)gmail.com> wrote:
> Hello all,
>
> I want to calculate the strength of a password, but my question is
> really about the entropy of its characters. I have a program that
> calculates symbol frequencies: single characters, bigrams, and
> word-starting and word-ending characters.
>
> I want to calculate the entropy of a given password based on these
> character probabilities.
>
> I know that the entropy is defined as:
> H(X) = - Sum_i [ P(x_i) * log P(x_i) ]
> for a random variable X with n outcomes { x_i : i = 1, ..., n }.
>
> If I want to calculate the entropy of a single character, how do I
> use this formula? Can I use it to calculate the entropy of the
> password "password" as:
> H(password) = - (P(p)*log P(p) + P(a)*log P(a) + ... + P(d)*log P(d))
> which would be equivalent to:
> H(password) = H(p) + H(a) + ... + H(d)
> That would mean I can create a table that stores the entropy of each
> character, and whenever the user enters a password I would just look
> up its characters in the table and sum their entropies to measure its
> strength.
>
> Is this the right way to calculate the entropy of a password?
>
> But if I take conditional probabilities into account, the formula
> gets more complicated:
> H(password) = H(p) + H(a|p) + H(s|pa) + ... + H(d|passwor)
> where H(y|x) is the conditional entropy of y given x.
>
> So which formula should I use to calculate the entropy?

Neither. The question has no unique, or even well-defined, answer.
The summed formula is only useful if the characters are independent of
one another. The entropy is also relative to the "search procedure".
The searcher may have a special love for the phrase
"ajkl&T)(Salkelkap7uy70 " and try it early, in which case the entropy
of that phrase would be very low for him. For another searcher it
would not be. A common Russian word might have high entropy for an
English speaker, but clearly not for a Russian.
I.e., your question is ill-defined. Given a search method, you could
make an estimate of the "entropy". E.g., for an exhaustive search that
runs through the alphabet (all strings of fewer than 15 letters,
starting with "a", then "b", etc.), the word "zoo" would have a very
high entropy, while if the search tried all 1-letter strings, then
2-letter, then 3-letter, etc., it would be low.
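
The "zoo" comparison can be put in numbers with a minimal sketch. It assumes a lowercase a-z alphabet, a maximum length of 14 letters, and that the "entropy" relative to a search is log2 of the number of guesses made up to and including the word. The function names are made up for illustration.

```javascript
const ALPHABET_SIZE = 26n; // lowercase a-z only (assumption)

// Number of strings in the subtree below a prefix when k more letters are
// allowed, counting the prefix itself: 1 + 26 + ... + 26^k.
function subtreeSize(k) {
  let size = 0n;
  for (let i = 0; i <= k; i++) size += ALPHABET_SIZE ** BigInt(i);
  return size;
}

const letterIndex = (ch) => BigInt(ch.charCodeAt(0) - 97);

// 1-based position of `word` when every string of up to maxLen letters is
// tried in dictionary order: "a", "aa", "aaa", ..., "b", "ba", ...
function dictionaryOrderRank(word, maxLen) {
  let rank = BigInt(word.length); // the word itself plus its proper prefixes
  for (let i = 0; i < word.length; i++) {
    rank += letterIndex(word[i]) * subtreeSize(maxLen - i - 1);
  }
  return rank;
}

// 1-based position when all 1-letter strings are tried first, then all
// 2-letter strings, and so on.
function lengthFirstRank(word) {
  let rank = 1n;
  for (let len = 1; len < word.length; len++) rank += ALPHABET_SIZE ** BigInt(len);
  let value = 0n;
  for (const ch of word) value = value * ALPHABET_SIZE + letterIndex(ch);
  return rank + value;
}

// log2 of the number of guesses; the BigInt is rounded to a double first.
const guessBits = (rank) => Math.log2(Number(rank));

console.log("dictionary order:", guessBits(dictionaryOrderRank("zoo", 14)).toFixed(1), "bits");
console.log("length first:    ", guessBits(lengthFirstRank("zoo")).toFixed(1), "bits");
```

Under these assumptions "zoo" is guess number 17981 in the length-first search (about 14 bits), while the dictionary-order search only reaches it after roughly 2^66 guesses.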

From: rossum on
On Sat, 26 Dec 2009 18:52:53 -0800 (PST), asli <koksal.a(a)gmail.com>
wrote:

>Hello all,
>
>I want to calculate the strength of the password. But my question is
>related to the entropy of the characters.
You might find it easier to pick the strength that you require and
then generate a password/passphrase with that amount of entropy.

I would suggest Diceware:

http://world.std.com/~reinhold/diceware.html

as one possibility.
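
As a rough sketch of "pick the strength first": a Diceware word is chosen by five dice rolls, so each word is one of 6^5 = 7776 equally likely entries and contributes log2(7776), about 12.9 bits. The function names below are made up for illustration, and the figures hold only if the words really are chosen at random.

```javascript
// Five dice rolls per word: 6^5 = 7776 equally likely words.
const DICEWARE_LIST_SIZE = 6 ** 5;
const BITS_PER_WORD = Math.log2(DICEWARE_LIST_SIZE); // about 12.9 bits

// Smallest number of words whose combined entropy reaches targetBits.
function wordsNeeded(targetBits) {
  return Math.ceil(targetBits / BITS_PER_WORD);
}

// Entropy of a passphrase of wordCount independently chosen words.
function passphraseBits(wordCount) {
  return wordCount * BITS_PER_WORD;
}

for (const target of [64, 80, 128]) {
  const n = wordsNeeded(target);
  console.log(`${target} bits -> ${n} words (${passphraseBits(n).toFixed(1)} bits)`);
}
```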

rossum

From: asli on
On Dec 27, 5:12 am, unruh <un...(a)wormhole.physics.ubc.ca> wrote:
> [...]

Thanks a lot for your reply. That is exactly why everything gets
complicated. The link below has a password strength checker; the
part that matters to me is the field that shows the entropy.

http://www.certainkey.com/demos/password/

I really wonder how they calculate it. The code is:

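// Entropy, in bits per character, of the character distribution found
// inside the password itself, returned as a string with two decimals.
function calcEntropy(pswd) {
    // Count how often each character code occurs (sparse array).
    var ai = [];
    for (var i = 0; i < pswd.length; i++) {
        var c = pswd.charCodeAt(i);
        if (ai[c] === undefined)
            ai[c] = 0;
        ai[c]++;
    }
    // Sum d * log(1/d), where d is the relative frequency of a character.
    var entropy = 0; // declared with var; the original leaked a global
    for (var j = 0; j < ai.length; j++) {
        if (ai[j] !== undefined && ai[j] !== 0) {
            var d = ai[j] / pswd.length;
            entropy += d * Math.log(1.0 / d);
        }
    }
    entropy /= Math.log(2); // convert natural log to bits
    // Build "v.dd" by peeling off digits; this truncates, it does not round.
    var p = entropy, v = 0;
    var ret = "";
    p -= v = Math.floor(p);
    p *= 10;
    ret += v + ".";
    p -= v = Math.floor(p);
    p *= 10;
    ret += v;
    p -= v = Math.floor(p);
    p *= 10;
    ret += v;
    return ret;
}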
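
Reading the function above, it appears to measure only how evenly the characters are spread within the password itself, per character, without using any language frequencies. A compact restatement, as a sketch (the name `perCharEntropyBits` is made up here, and this is not the site's code), makes that easy to try:

```javascript
// Entropy in bits per character of the password's own character
// distribution: sum over distinct characters of d * log2(1/d).
function perCharEntropyBits(pswd) {
  const counts = new Map();
  for (const ch of pswd) counts.set(ch, (counts.get(ch) || 0) + 1);
  const length = [...pswd].length;
  let bits = 0;
  for (const n of counts.values()) {
    const d = n / length; // relative frequency of this character
    bits += d * Math.log2(1 / d);
  }
  return bits;
}

for (const pw of ["aaaaaaaa", "password", "abcdefgh"]) {
  console.log(pw, perCharEntropyBits(pw).toFixed(2));
}
```

A password made of one repeated character scores 0, "password" scores about 2.75, and any eight distinct characters score 3, whatever those characters are.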


Thanks a lot "rossum". I will check Diceware and comment as soon as
possible.

Greets,
ASLI
From: unruh on
On 2009-12-27, rossum <rossum48(a)coldmail.com> wrote:
> On Sat, 26 Dec 2009 18:52:53 -0800 (PST), asli <koksal.a(a)gmail.com>
> wrote:
>
>>Hello all,
>>
>>I want to calculate the strength of the password. But my question is
>>related to the entropy of the characters.
> You might find it easier to pick the strength that you require and
> then generate a password/passphrase with that amount of entropy.

Or you could try wgen (www.theory.physics.ubc.ca/wgen/wgen.c), a crypto
password generator that generates "English" (or whatever language you
choose) type words, i.e. words that seem to follow the pronunciation
style of English, together with an entropy estimate. It uses a
dictionary to derive the trigram and quadrigram frequencies of the
letters in the words, and then randomly generates strings of letters
with the same frequencies, together with an estimate of the probability
of getting that particular string of letters if one generated those
lists many, many times.
By default it uses /usr/share/dict/words on Linux.
Any large English word list would do.
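
The idea can be sketched in a few lines. This is not wgen's code (wgen is a C program); it is a minimal illustration that uses trigrams only, a tiny made-up word list in place of a real dictionary, and hypothetical function names. `Math.random` is only a placeholder default: a real password generator needs a cryptographically secure random source.

```javascript
// Hypothetical tiny word list; a real run would read a large dictionary.
const WORDS = ["password", "passage", "message", "wording",
               "sword", "massage", "order", "border"];

// Count trigrams: two-letter context -> Map(next letter -> count).
// "^" pads the start of a word and "$" marks its end.
function buildTrigramModel(words) {
  const model = new Map();
  for (const w of words) {
    const padded = "^^" + w.toLowerCase() + "$";
    for (let i = 2; i < padded.length; i++) {
      const ctx = padded.slice(i - 2, i);
      if (!model.has(ctx)) model.set(ctx, new Map());
      const row = model.get(ctx);
      row.set(padded[i], (row.get(padded[i]) || 0) + 1);
    }
  }
  return model;
}

// Draw letters in proportion to the trigram counts and accumulate
// -log2 of the probability of every choice made along the way.
// If maxLen cuts the word short, `bits` covers the letters drawn so far.
function generateWord(model, rng = Math.random, maxLen = 12) {
  let ctx = "^^", word = "", bits = 0;
  while (word.length < maxLen) {
    const row = model.get(ctx);
    let total = 0;
    for (const n of row.values()) total += n;
    let r = rng() * total;
    let chosen, count;
    for (const [ch, n] of row) {
      chosen = ch;
      count = n;
      r -= n;
      if (r < 0) break;
    }
    bits += Math.log2(total / count);
    if (chosen === "$") break;
    word += chosen;
    ctx = ctx[1] + chosen;
  }
  return { word, bits };
}

const model = buildTrigramModel(WORDS);
console.log(generateWord(model));
```

The returned `bits` value is the surprisal of that particular string under the trigram model, which plays the role of the probability estimate described above.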


>
> I would suggest Diceware:
>
> http://world.std.com/~reinhold/diceware.html
>
> as one possibility.
>
> rossum
>