From: Paul Rubin on
Lie Ryan <lie.1296(a)gmail.com> writes:
> If an attacker knows the that the random number generator have an
> extreme skew and he knows the distribution of the letters, how much
> advantage would it give the attacker? My initial guess is that the more
> skewed the letters are, the better the advantage, since an attacker
> using brute-force can write his program to prefer the most likely letters?

A useable (conservative) estimate is that the attacker's workload is 1/p
where p is the probability of the most likely password. That basically
says the password strength can be measured by the min-entropy.
Cryptographers often use that approach. If you want to be more precise,
you can do a conditional probability calculation assuming the attacker
works down the list of possible passwords in order of decreasing
probability, stopping when they hit the right one.

More generally still, passwords regardless of their entropy content are
a sucky way to encapsulate cryptographic secrets. We keep using them
because every alternative has drawbacks of its own.
From: Lawrence D'Oliveiro on
In message <7xwry39tpi.fsf(a)ruckus.brouhaha.com>, Paul Rubin wrote:

> More generally still, passwords regardless of their entropy content are
> a sucky way to encapsulate cryptographic secrets.

They're a shared secret. How else would you represent a shared secret, if
not with a shared secret?
From: Steven D'Aprano on
On Tue, 23 Feb 2010 18:39:53 -0800, Paul Rubin wrote:

> Steven D'Aprano <steven(a)REMOVE.THIS.cybersource.com.au> writes:
>> Paul, if you were anyone else, I'd be sneering uncontrollably about
>> now, but you're not clueless about cryptography, so what have I missed?
>> Why is reducing the number of distinct letters by more than 50%
>> anything but a disaster? This makes the task of brute-forcing the
>> password exponentially easier.
>
> Reducing the number of distinct letters by 50% decreases the entropy per
> character by 1 bit.

You say that as if 1 bit of entropy isn't much :)

Given a random six character password taken out of an alphabet of 52
characters, it takes over nine billion attempts to brute force it.
Reducing the alphabet by 50% cuts that down to less than 200 million. To
make up for that loss of 1 bit of entropy, you need two extra characters
in your password.


> That stuff about mixing letters and digits and
> funny symbols just makes the password a worse nuisance to remember and
> type, for a small gain in entropy that you can compute and make up for.

Well, everybody has their own ways of remembering passwords, and I'd much
prefer to remember an eight character password with "funny symbols" that
I chose myself, than a six character password with nothing but letters
that was chosen for me.

Of course, I flatter myself that I know how to choose good passwords, and
I hate remembering long random strings even from a reduced alphabet (e.g.
I hate memorizing eight digit phone numbers, and am completely incapable
of remembering ten digit mobile phone numbers).



--
Steven
From: Paul Rubin on
Steven D'Aprano <steven(a)REMOVE.THIS.cybersource.com.au> writes:
> Given a random six character password taken out of an alphabet of 52
> characters, it takes over nine billion attempts to brute force it.
> Reducing the alphabet by 50% cuts that down to less than 200 million. To
> make up for that loss of 1 bit of entropy, you need two extra characters
> in your password.

One extra character comes pretty close (within 1.3 bits). Even two
extra chars is probably (subjective) easier for a user to deal with than
a completely random mixture of upper/lower case. You don't get the
extra bit per character if that distribution is anything other than
random, of course.

For something like a web password (each guess takes a server hit), where
the resource guarded is not very valuable, 5 chars is probably enough
for most purposes. For something like an encryption key subject to
offline attacks, 6 mixed-case characters will barely slow a real
attacker down.

As before, my suggestion is still diceware. I've used random
alphanumerics in the past but they're too big a hassle, they have to be
written down, etc.

And of course, if you're doing something serious, use a hardware token.
From: mk on
On 2010-02-24 03:50, Paul Rubin wrote:
> The stuff about converting 4 random bytes to a decimal string and then
> peeling off 2 digits at a time is pretty awful, and notice that since
> 2**32 is 4294967296, in the cases where you get 10 digits, the first
> 2-digit pair is never higher than 42.

Yikes! I didn't think about that. This is probably where (some part of)
probability skewing comes from.

Anyway, the passwords for authorized users will be copied and pasted
from email into in the application GUI which will remember it for them,
so they will not have to remember and type them in. So I have little in
the way of limitations of password length - even though in *some* cases
somebody might have to (or be ignorant enough) to retype the password
instead of pasting it in.

In that case the "diceware" approach is not necessary, even though I
will certainly remember this approach for a case when users will have to
remember & type the passwords in.

The main application will access the data using HTTP (probably), so the
main point is that an attacker is not able to guess passwords using
brute force.

Using A-z with 10-char password seems to provide 3 orders of magnitude
more combinations than a-z:

>>> 57 ** 10
362033331456891249L
>>> 25 ** 10
95367431640625L

Even though I'm not sure it is worth it, assuming 1000 brute-force
guesses per second (which over the web would amount pretty much to DOS),
this would take # days:

>>> 57 ** 10 / (1000 * 3600 * 24)
4190200595L
>>> 25 ** 10 / (1000 * 3600 * 24)
1103789L

Even then I'm not getting completely uniform distribution for some reason:

d 39411
l 39376
f 39288
a 39275
s 39225
r 39172
p 39159
t 39073
k 39071
u 39064
e 39005
o 39005
n 38995
j 38993
h 38975
q 38958
c 38938
b 38906
g 38894
i 38847
m 38819
v 38712
z 35321
y 35228
w 35189
x 35075

Code:

import operator

def gen_rand_word(n):
with open('/dev/urandom') as f:
return ''.join([chr(ord('a') + ord(x) % 26) for x in f.read(n)])

def count_chars(chardict, word):
for c in word:
try:
chardict[c] += 1
except KeyError:
chardict[c] = 0

if __name__ == "__main__":
chardict = {}
for i in range(100000):
w = gen_rand_word(10)
count_chars(chardict, w)
counts = list(chardict.items())
counts.sort(key = operator.itemgetter(1), reverse = True)
for char, count in counts:
print char, count

> I'd write your code something like this:
>
> nletters = 5
>
> def randomword(n):
> with open('/dev/urandom') as f:
> return ''.join([chr(ord('a')+ord(c)%26) for c in f.read(n)])
>
> print randomword(nletters)

Aw shucks when will I learn to do the stuff in 3 lines well instead of
20, poorly. :-/

Regards,
mk



First  |  Prev  |  Next  |  Last
Pages: 1 2 3 4 5 6 7 8 9 10 11
Prev: ANN: Leo 4.7 final released
Next: AKKA vs Python