From: Paul Rubin on 23 Feb 2010 22:51 Lie Ryan writes:> If an attacker knows the that the random number generator have an > extreme skew and he knows the distribution of the letters, how much > advantage would it give the attacker? My initial guess is that the more > skewed the letters are, the better the advantage, since an attacker > using brute-force can write his program to prefer the most likely letters? A useable (conservative) estimate is that the attacker's workload is 1/p where p is the probability of the most likely password. That basically says the password strength can be measured by the min-entropy. Cryptographers often use that approach. If you want to be more precise, you can do a conditional probability calculation assuming the attacker works down the list of possible passwords in order of decreasing probability, stopping when they hit the right one. More generally still, passwords regardless of their entropy content are a sucky way to encapsulate cryptographic secrets. We keep using them because every alternative has drawbacks of its own. From: Lawrence D'Oliveiro on 23 Feb 2010 23:48 In message <7xwry39tpi.fsf(a)ruckus.brouhaha.com>, Paul Rubin wrote: > More generally still, passwords regardless of their entropy content are > a sucky way to encapsulate cryptographic secrets. They're a shared secret. How else would you represent a shared secret, if not with a shared secret? From: Steven D'Aprano on 24 Feb 2010 02:11 On Tue, 23 Feb 2010 18:39:53 -0800, Paul Rubin wrote: > Steven D'Aprano writes: >> Paul, if you were anyone else, I'd be sneering uncontrollably about >> now, but you're not clueless about cryptography, so what have I missed? >> Why is reducing the number of distinct letters by more than 50% >> anything but a disaster? This makes the task of brute-forcing the >> password exponentially easier. > > Reducing the number of distinct letters by 50% decreases the entropy per > character by 1 bit. You say that as if 1 bit of entropy isn't much :) Given a random six character password taken out of an alphabet of 52 characters, it takes over nine billion attempts to brute force it. Reducing the alphabet by 50% cuts that down to less than 200 million. To make up for that loss of 1 bit of entropy, you need two extra characters in your password. > That stuff about mixing letters and digits and > funny symbols just makes the password a worse nuisance to remember and > type, for a small gain in entropy that you can compute and make up for. Well, everybody has their own ways of remembering passwords, and I'd much prefer to remember an eight character password with "funny symbols" that I chose myself, than a six character password with nothing but letters that was chosen for me. Of course, I flatter myself that I know how to choose good passwords, and I hate remembering long random strings even from a reduced alphabet (e.g. I hate memorizing eight digit phone numbers, and am completely incapable of remembering ten digit mobile phone numbers). -- Steven From: Paul Rubin on 24 Feb 2010 02:58 Steven D'Aprano writes:> Given a random six character password taken out of an alphabet of 52 > characters, it takes over nine billion attempts to brute force it. > Reducing the alphabet by 50% cuts that down to less than 200 million. To > make up for that loss of 1 bit of entropy, you need two extra characters > in your password. One extra character comes pretty close (within 1.3 bits). Even two extra chars is probably (subjective) easier for a user to deal with than a completely random mixture of upper/lower case. You don't get the extra bit per character if that distribution is anything other than random, of course. For something like a web password (each guess takes a server hit), where the resource guarded is not very valuable, 5 chars is probably enough for most purposes. For something like an encryption key subject to offline attacks, 6 mixed-case characters will barely slow a real attacker down. As before, my suggestion is still diceware. I've used random alphanumerics in the past but they're too big a hassle, they have to be written down, etc. And of course, if you're doing something serious, use a hardware token. From: mk on 24 Feb 2010 12:23 On 2010-02-24 03:50, Paul Rubin wrote:> The stuff about converting 4 random bytes to a decimal string and then > peeling off 2 digits at a time is pretty awful, and notice that since > 2**32 is 4294967296, in the cases where you get 10 digits, the first > 2-digit pair is never higher than 42. Yikes! I didn't think about that. This is probably where (some part of) probability skewing comes from. Anyway, the passwords for authorized users will be copied and pasted from email into in the application GUI which will remember it for them, so they will not have to remember and type them in. So I have little in the way of limitations of password length - even though in *some* cases somebody might have to (or be ignorant enough) to retype the password instead of pasting it in. In that case the "diceware" approach is not necessary, even though I will certainly remember this approach for a case when users will have to remember & type the passwords in. The main application will access the data using HTTP (probably), so the main point is that an attacker is not able to guess passwords using brute force. Using A-z with 10-char password seems to provide 3 orders of magnitude more combinations than a-z: >>> 57 ** 10 362033331456891249L>>> 25 ** 10 95367431640625L Even though I'm not sure it is worth it, assuming 1000 brute-force guesses per second (which over the web would amount pretty much to DOS), this would take # days: >>> 57 ** 10 / (1000 * 3600 * 24) 4190200595L>>> 25 ** 10 / (1000 * 3600 * 24) 1103789L Even then I'm not getting completely uniform distribution for some reason: d 39411 l 39376 f 39288 a 39275 s 39225 r 39172 p 39159 t 39073 k 39071 u 39064 e 39005 o 39005 n 38995 j 38993 h 38975 q 38958 c 38938 b 38906 g 38894 i 38847 m 38819 v 38712 z 35321 y 35228 w 35189 x 35075 Code: import operator def gen_rand_word(n): with open('/dev/urandom') as f: return ''.join([chr(ord('a') + ord(x) % 26) for x in f.read(n)]) def count_chars(chardict, word): for c in word: try: chardict[c] += 1 except KeyError: chardict[c] = 0 if __name__ == "__main__": chardict = {} for i in range(100000): w = gen_rand_word(10) count_chars(chardict, w) counts = list(chardict.items()) counts.sort(key = operator.itemgetter(1), reverse = True) for char, count in counts: print char, count > I'd write your code something like this: > > nletters = 5 > > def randomword(n): > with open('/dev/urandom') as f: > return ''.join([chr(ord('a')+ord(c)%26) for c in f.read(n)]) > > print randomword(nletters) Aw shucks when will I learn to do the stuff in 3 lines well instead of 20, poorly. :-/ Regards, mk First  |  Prev  |  Next  |  Last Pages: 1 2 3 4 5 6 7 8 9 10 11 Prev: ANN: Leo 4.7 final releasedNext: AKKA vs Python