Is this secure? [Python]

From: Lawrence D'Oliveiro on 23 Feb 2010 18:18

In message <mailman.110.1266935711.4577.python-list(a)python.org>, mk wrote:

> I need to generate passwords and I think that pseudo-random generator is
> not good enough, frankly. So I wrote this function:

Much simpler:

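import subprocess

# Ask the external pwgen tool (it must be installed) for passwords;
# -n requires at least one digit, -c at least one capital letter.
data, _ = subprocess.Popen(
    args=("pwgen", "-nc"),
    stdout=subprocess.PIPE,
    universal_newlines=True,  # return text rather than bytes on Python 3
).communicate()
print(data)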

From: Steven D'Aprano on 23 Feb 2010 21:07

On Tue, 23 Feb 2010 15:36:02 +0100, mk wrote:

> Hello,
>
> I need to generate passwords and I think that pseudo-random generator is
> not good enough, frankly. So I wrote this function:
[snip]
> (yes I know that this way generated string will not contain 'z' because
> 99/4 + 97 = 121 which is 'y')

You're worried about the security of the PRNG but then generate a TWO to
FIVE character lowercase password with no digits, punctuation or the
letter 'z'? That's priceless!

Python's PRNG is not suitable for producing cryptographically strong
streams of random bytes, but it is perfectly strong enough for generating
good passwords.

> The question is: is this secure?

No.

You are wasting your time trying to fix something which isn't a problem,
and introducing a much bigger problem instead. You are MUCH MUCH MUCH
better off with a six- or ten-character password taken from upper- and
lowercase letters, plus digits, plus punctuation, than with a four-character
password taken from lowercase letters only. Even if the first case has
some subtle statistical deviation from uniformity, and the second is
"truly random" (whatever that means), it doesn't matter.

Nobody is going to crack your password because the password generator is
0.01% more likely to generate a "G" than a "q". But they *will*
brute-force your password if you have a four-character password taken from
a-y only.
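
For illustration only, here is a minimal sketch of the kind of generator recommended above: ten characters drawn from upper- and lowercase letters, digits and punctuation. It is not code from the thread. The names ALPHABET and make_password are made up for this example, and it assumes Python 3.6 or later, where the standard library's secrets module draws from the operating system's random source. Swapping in random.choice would keep the same shape.

```python
import secrets
import string

# Upper- and lowercase letters, digits and ASCII punctuation: 94 symbols.
ALPHABET = string.ascii_letters + string.digits + string.punctuation


def make_password(length=10):
    """Return `length` characters, each chosen uniformly from ALPHABET."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


print(make_password())
```

Because every position is picked from the whole alphabet, the result is always the requested length; the length default of 10 is only a suggestion taken from the paragraph above.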

> That is, can the string generated this
> way be considered truly random?

Define truly random.

--
Steven

From: Steven D'Aprano on 23 Feb 2010 21:19

On Tue, 23 Feb 2010 11:19:59 -0800, Paul Rubin wrote:

> mk <mrkafk(a)gmail.com> writes:
>> I need to generate passwords and I think that pseudo-random generator
>> is not good enough, frankly. So I wrote this function:... The question
>> is: is this secure? That is, can the string generated this way be
>> considered truly random? (I abstract from not-quite-perfect nature of
>> /dev/urandom at the moment; I can always switch to /dev/random which is
>> better)
>
> urandom is fine and the entropy loss from the numeric conversions and
> eliminating 'z' in that code before you get letters out is not too bad.

What?

You're going from a possible alphabet of 62 (excluding punctuation) or 94
(inc punctuation available on an American keyboard) distinct letters down
to 25, and you say that's "not too bad"?

Paul, if you were anyone else, I'd be sneering uncontrollably about now,
but you're not clueless about cryptography, so what have I missed? Why is
reducing the number of distinct letters by more than 50% anything but a
disaster? This makes the task of brute-forcing the password exponentially
easier.

Add the fact that the passwords are so short (as few as two characters
in my tests) and this is about as far from secure as it is possible to be.
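
To put rough numbers on the brute-force argument, the sketch below counts the candidate passwords for a given alphabet size and length. The alphabet sizes (25, 62, 94) come from the post above and the lengths (4 and 10) from the earlier reply in this thread. The helper name keyspace is invented for the example, and the figures assume every character is chosen independently and uniformly, which is the best case for the defender.

```python
import math


def keyspace(alphabet_size, length):
    """Number of distinct passwords of exactly `length` symbols."""
    return alphabet_size ** length


for size, length in [(25, 4), (62, 10), (94, 10)]:
    n = keyspace(size, length)
    bits = math.log2(n)
    print("%2d symbols, %2d chars: %d candidates (about %.1f bits)"
          % (size, length, n, bits))
```

Four characters from a-y give 25**4 = 390625 candidates, which an attacker can enumerate almost instantly.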

--
Steven

From: Steven D'Aprano on 23 Feb 2010 21:40

On Tue, 23 Feb 2010 15:36:02 +0100, mk wrote:

> The question is: is this secure? That is, can the string generated this
> way be considered truly random?

Putting aside the philosophical question of what "truly random" means, I
presume you mean that the letters are uniformly distributed. The answer
to that is, they don't like uniformly distributed.

This isn't a sophisticated statistical test, it's the equivalent of a
back-of-the-envelope calculation: I generated 100,000 random strings with
your code, and counted how often each letter appears.

If the letters are uniformly distributed, you would expect all the
numbers to be quite close, but instead they range from 15063 to 25679:

{'a': 15063, 'c': 20105, 'b': 15100, 'e': 25465, 'd': 25458, 'g': 25597,
'f': 25589, 'i': 25045, 'h': 25679, 'k': 22945, 'j': 25531, 'm': 16187,
'l': 16252, 'o': 16076, 'n': 16012, 'q': 16069, 'p': 16119, 's': 16088,
'r': 16087, 'u': 15951, 't': 16081, 'w': 16236, 'v': 15893, 'y': 15834,
'x': 15956}
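
A tally like the one above could be produced along the following lines. This is only a sketch: the function being tested was snipped from the thread, so gen_rand_string here is a hypothetical stand-in, not mk's code. It assumes strings of two to five letters and reuses the quoted n/4 + 97 mapping, so it never emits 'z', but it will not reproduce the bias shown in the posted counts.

```python
import random
from collections import Counter

_rng = random.SystemRandom()  # OS-backed source, e.g. /dev/urandom on Linux


def gen_rand_string():
    """Hypothetical stand-in for the function under test (NOT mk's code)."""
    length = _rng.randint(2, 5)  # assumed length range
    # Map a number in 0..99 to a letter with n // 4 + 97, so 'z' never appears.
    return "".join(chr(_rng.randrange(100) // 4 + 97) for _ in range(length))


def tally_letters(trials=100000):
    """Count how often each letter appears across `trials` generated strings."""
    counts = Counter()
    for _ in range(trials):
        counts.update(gen_rand_string())
    return counts


counts = tally_letters()
for letter in sorted(counts):
    print(letter, counts[letter])
```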

Eye-balling it, it looks vaguely two-humped, one hump around 15-16K, the
second around 22-25K. Sure enough, here's a quick-and-dirty graph:

a | ***********************************
b | ***********************************
c | ***********************************************
d | ***********************************************************
e | ***********************************************************
f | ************************************************************
g | ************************************************************
h | ************************************************************
i | ***********************************************************
j | ************************************************************
k | ******************************************************
l | **************************************
m | **************************************
n | *************************************
o | **************************************
p | **************************************
q | **************************************
r | **************************************
s | **************************************
t | **************************************
u | *************************************
v | *************************************
w | **************************************
x | *************************************
y | *************************************

The mean of the counts is 19056.72, and the mean deviation is 3992.28.
While none of this is statistically sophisticated, it does indicate to me
that your function is nowhere even close to uniform. It has a very strong
bias.
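
The summary figures can be rechecked from the posted counts. The sketch below copies the tally from this post and computes the mean and the mean absolute deviation; reading "mean deviation" as the mean absolute deviation is an assumption, though it reproduces the number quoted above.

```python
# Letter counts copied from the tally in this post.
counts = {
    'a': 15063, 'b': 15100, 'c': 20105, 'd': 25458, 'e': 25465,
    'f': 25589, 'g': 25597, 'h': 25679, 'i': 25045, 'j': 25531,
    'k': 22945, 'l': 16252, 'm': 16187, 'n': 16012, 'o': 16076,
    'p': 16119, 'q': 16069, 'r': 16087, 's': 16088, 't': 16081,
    'u': 15951, 'v': 15893, 'w': 16236, 'x': 15956, 'y': 15834,
}

values = list(counts.values())
mean = sum(values) / float(len(values))
# Mean absolute deviation: average distance of each count from the mean.
mean_dev = sum(abs(v - mean) for v in values) / len(values)

print("min %d, max %d" % (min(values), max(values)))
print("mean %.2f, mean deviation %.2f" % (mean, mean_dev))
# prints:
# min 15063, max 25679
# mean 19056.72, mean deviation 3992.28
```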

--
Steven

From: Steven D'Aprano on 23 Feb 2010 21:43

On Wed, 24 Feb 2010 02:40:13 +0000, Steven D'Aprano wrote:

> On Tue, 23 Feb 2010 15:36:02 +0100, mk wrote:
>
>> The question is: is this secure? That is, can the string generated this
>> way be considered truly random?
>
> Putting aside the philosophical question of what "truly random" means, I
> presume you mean that the letters are uniformly distributed. The answer
> to that is, they don't like uniformly distributed.

Er, they don't *look* uniformly distributed.

(Of course, being random, perhaps they are and I just got unlucky.)

--
Steven
