From: J.D. on
On Apr 6, 10:48 pm, "Scott Fluhrer" <sfluh...(a)ix.netcom.com> wrote:
> "David T. Ashley" <dash...(a)gmail.com> wrote in messagenews:98WdnZxwQIpmOibWnZ2dnUVZ_jSdnZ2d(a)giganews.com...
>
>
>
> > Assume that I have an SHA-512 hash of a file (512 bits).  Assume that I
> > want to find a second file with the same hash and can't attack the hash
> > function (i.e. I can only repeatedly try new files).
>
> > This is harder than a birthday attack, right?  The mathematics of a
> > birthday attack
>
> >http://en.wikipedia.org/wiki/Birthday_attack
>
> > are results for finding a collision in a set of attempts.  Having a single
> > hash already and trying to match it is harder, right (you are looking for
> > a SPECIFIC collision rather than any collision)?
>
> > And what is the expected value of the number of attempts until a match is
> > found?  I want to say approximately 2^511, but that would be too simple
> > ....
>
> Actually, if we model SHA-512 as a random oracle, the expected number of
> attempts is actually 2^512.  This is because SHA-512 is a function, not a
> permutation, and so finding one value that doesn't hash to the right value
> doesn't make any other value apriori more likely to succeed (as opposed to,
> say, searching for a plaintext block that AES encrypts with an unknown key
> to a specific ciphertext block; we know that we'll come across the right one
> after a maximum of 2^128 attempts, and so any failed attempt makes the next
> attempt more likely to happen to be value you're searching for).
>
> --
> poncho

@Scott Fluhrer

Well, my calculator doesn't have the sufficient degree of precision
for finding the odds of collisions using 512 bit hash functions, but
for 256 bit hash functions I get a slightly different result from what
you describe:

Assumption: For any randomly selected message, mx, the probability
that h(mx) will collide with a specific 256 bit digest, h(ma), is 1 in
2^256. Additional assumption: the number of possible messages that
can be hashed is infinite, or so gargantuan as to be effectively
infinite, such that failing to produce a collision with one random
message does not increase the collision odds of any other randomly
selected message.

Under these assumptions, the odds that a randomly chosen message will
NOT collide with a specific digest is thus 1-(2^-256). And the odds
that _not even one_ out of y many randomly chosen messages will
collide with the target digest is {1-(2^-256)}^y. So, if my math is
right (and it's one o'clock in the morning, so that's a pretty big
if), then there is a 63% chance with 2^256 randomly chosen messages
that at least one of them will collide with the target digest. With
just 2^255 randomly chosen messages there is only a 39% chance of a
collision. So to have a 50% chance of finding a collision with a
specific digest you need to hash slightly less than 2^256 random
messages (approximately 2^255.5). Presumably there is a similar
result for 512-bit hash functions, though I cannot confirm that with
my crappy equipment. Also I may be completely off in my assumptions
and/or calculations -- it wouldn't be the first time, and certainly
won't be the last.
From: unruh on
On 2010-04-07, Jonathan Degolyer <degolyer181(a)gmail.com> wrote:
> On Apr 6, 10:48?pm, "Scott Fluhrer" <sfluh...(a)ix.netcom.com> wrote:
>> "David T. Ashley" <dash...(a)gmail.com> wrote in messagenews:98WdnZxwQIpmOibWnZ2dnUVZ_jSdnZ2d(a)giganews.com...
>>
>>
>>
>> > Assume that I have an SHA-512 hash of a file (512 bits). ?Assume that I
>> > want to find a second file with the same hash and can't attack the hash
>> > function (i.e. I can only repeatedly try new files).
>>
>> > This is harder than a birthday attack, right? ?The mathematics of a
>> > birthday attack
>>
>> >http://en.wikipedia.org/wiki/Birthday_attack
>>
>> > are results for finding a collision in a set of attempts. ?Having a single
>> > hash already and trying to match it is harder, right (you are looking for
>> > a SPECIFIC collision rather than any collision)?
>>
>> > And what is the expected value of the number of attempts until a match is
>> > found? ?I want to say approximately 2^511, but that would be too simple
>> > ....
>>
>> Actually, if we model SHA-512 as a random oracle, the expected number of
>> attempts is actually 2^512. ?This is because SHA-512 is a function, not a
>> permutation, and so finding one value that doesn't hash to the right value
>> doesn't make any other value apriori more likely to succeed (as opposed to,
>> say, searching for a plaintext block that AES encrypts with an unknown key
>> to a specific ciphertext block; we know that we'll come across the right one
>> after a maximum of 2^128 attempts, and so any failed attempt makes the next
>> attempt more likely to happen to be value you're searching for).
>>
>> --
>> poncho
> @Scott Fluhrer
>
> Well, my calculator doesn't have the sufficient degree of precision
> for finding the odds of collisions using 512 bit hash functions, but
> for 256 bit hash functions I get a slightly different result from what
> you describe:
>
> Assumption: For any randomly selected message, mx, the probability
> that h(mx) will collide with a specific 256 bit digest, h(ma), is 1 in
> 2^256. Additional assumption: the number of possible messages that
> can be hashed is infinite, or so gargantuan as to be effectively
> infinite, such that failing to produce a collision with one random
> message does not increase the collision odds of any other randomly
> selected message.
>
> Under these assumptions, the odds that a randomly chosen message will
> NOT collide with a specific digest is thus 1-(2^-256). And the odds
> that _not even one_ out of y many randomly chosen messages will
> collide with the target digest is {1-(2^-256)}^y. So, if my math is
> right (and it's one o'clock in the morning, so that's a pretty big
> if), then there is a 63% chance with 2^256 randomly chosen messages
> that at least one of them will collide with the target digest. With
> just 2^255 randomly chosen messages there is only a 39% chance of a
> collision. So to have a 50% chance of finding a collision with a
> specific digest you need to hash slightly less than 2^256 random
> messages (approximately 2^255.5). Presumably there is a similar
> result for 512-bit hash functions, though I cannot confirm that with
> my crappy equipment. Also I may be completely off in my assumptions
> and/or calculations -- it wouldn't be the first time, and certainly
> won't be the last.

The question was, what is the mean number of attempts you would have to
do to find a collision.
As you say the probability of not having the first N not collide is
(1-1/2^512)^N, but the probability that one will then get a collision is
1/2^512, so the probability of not colliding in N-1 attempts and then
colliding on the Nth is [(1-1/2^512)^(N-1)]/2^512. Thus the mean number is
SUM_N N (1-1/2^512)^(N-1) 1/2^512 =SUM_N -(d/dx) [(1-x/2^512)]^N|_(x=1) =
-d/dx (1/(1-(1-x/2^512)))|_(x=1) =-d/dx(2^512/x)|_(x=1) =2^512

While it is true that the probability of finding a collision by 2^512
tries is 2/e=.7, that was not the question.

>
From: Thomas Pornin on
According to robertwessel2(a)yahoo.com <robertwessel2(a)yahoo.com>:
> (IIRC, an MD5 collision is now possible in less than 2**52
> operations).

Your statement is technically correct but your estimation is larger than
necessary. A "raw" collision is achieved in an average of 2^26.5
operations (where "one operation" is "one evaluation of the MD5
compression function over a single block"). This translates to about
14.7 seconds on a 2.4 GHz Intel Core2 PC, in 64-bit mode (using a single
core). This was measured (by me) over 210000 collisions, using this
code:
http://www.crypto-hash.fr/modules/wfdownloads/singlefile.php?cid=13&lid=14

That's a rewrite of Vlastimil Klima's code, made portable [in
particular to 64-bit architectures] and slightly more optimized, with
comments.


The cost for the "colliding certificates" was substantially higher than
those 2^26.5, because they had to work within constraints: the data had
to match the format of a certificate, with a RSA public key for which
they knew the corresponding private key.


--Thomas Pornin
From: J.D. on
On Apr 7, 2:49 am, unruh <un...(a)wormhole.physics.ubc.ca> wrote:
> On 2010-04-07, Jonathan Degolyer <degolyer...(a)gmail.com> wrote:
>
>
>
> > On Apr 6, 10:48?pm, "Scott Fluhrer" <sfluh...(a)ix.netcom.com> wrote:
> >> "David T. Ashley" <dash...(a)gmail.com> wrote in messagenews:98WdnZxwQIpmOibWnZ2dnUVZ_jSdnZ2d(a)giganews.com...
>
> >> > Assume that I have an SHA-512 hash of a file (512 bits). ?Assume that I
> >> > want to find a second file with the same hash and can't attack the hash
> >> > function (i.e. I can only repeatedly try new files).
>
> >> > This is harder than a birthday attack, right? ?The mathematics of a
> >> > birthday attack
>
> >> >http://en.wikipedia.org/wiki/Birthday_attack
>
> >> > are results for finding a collision in a set of attempts. ?Having a single
> >> > hash already and trying to match it is harder, right (you are looking for
> >> > a SPECIFIC collision rather than any collision)?
>
> >> > And what is the expected value of the number of attempts until a match is
> >> > found? ?I want to say approximately 2^511, but that would be too simple
> >> > ....
>
> >> Actually, if we model SHA-512 as a random oracle, the expected number of
> >> attempts is actually 2^512. ?This is because SHA-512 is a function, not a
> >> permutation, and so finding one value that doesn't hash to the right value
> >> doesn't make any other value apriori more likely to succeed (as opposed to,
> >> say, searching for a plaintext block that AES encrypts with an unknown key
> >> to a specific ciphertext block; we know that we'll come across the right one
> >> after a maximum of 2^128 attempts, and so any failed attempt makes the next
> >> attempt more likely to happen to be value you're searching for).
>
> >> --
> >> poncho
> > @Scott Fluhrer
>
> > Well, my calculator doesn't have the sufficient degree of precision
> > for finding the odds of collisions using 512 bit hash functions, but
> > for 256 bit hash functions I get a slightly different result from what
> > you describe:
>
> > Assumption: For any randomly selected message, mx, the probability
> > that h(mx) will collide with a specific 256 bit digest, h(ma), is 1 in
> > 2^256.  Additional assumption: the number of possible messages that
> > can be hashed is infinite, or so gargantuan as to be effectively
> > infinite, such that failing to produce a collision with one random
> > message does not increase the collision odds of any other randomly
> > selected message.
>
> > Under these assumptions, the odds that a randomly chosen message will
> > NOT collide with a specific digest is thus 1-(2^-256).  And the odds
> > that _not even one_ out of y many randomly chosen messages will
> > collide with the target digest is {1-(2^-256)}^y.  So, if my math is
> > right (and it's one o'clock in the morning, so that's a pretty big
> > if), then there is a 63% chance with 2^256 randomly chosen messages
> > that at least one of them will collide with the target digest.  With
> > just 2^255 randomly chosen messages there is only a 39% chance of a
> > collision.  So to have a 50% chance of finding a collision with a
> > specific digest you need to hash slightly less than 2^256 random
> > messages (approximately 2^255.5).  Presumably there is a similar
> > result for 512-bit hash functions, though I cannot confirm that with
> > my crappy equipment.  Also I may be completely off in my assumptions
> > and/or calculations -- it wouldn't be the first time, and certainly
> > won't be the last.
>
> The question was, what is the mean number of attempts you would have to
> do to find a collision.
>
> While it is true that the probability of finding a collision by 2^512
> tries is 2/e=.7, that was not the question.
>

OK, I guess I misunderstood the question...

> As you say the probability of not having the first N not collide is
> (1-1/2^512)^N, but the probability that one will then get a collision is
> 1/2^512, so the probability of not colliding in N-1 attempts and then
> colliding on the Nth is [(1-1/2^512)^(N-1)]/2^512. Thus the mean number is
> SUM_N N (1-1/2^512)^(N-1) 1/2^512 =SUM_N -(d/dx) [(1-x/2^512)]^N|_(x=1) =
> -d/dx (1/(1-(1-x/2^512)))|_(x=1) =-d/dx(2^512/x)|_(x=1)  =2^512

I am having a hard time following your calculations (I am not used to
mathematical notation expressed in ASCII). However, are you
essentially saying that if we think of a brute force preimage attack
on a hash function as conducting a series of independent Bernoulli
trials, then in order to have an expected value of 1 (i.e. np = 1,
where n = no. of trials and p = probability of success for each trial)
then if p = 2^-512 then n must = 2^512. It that correct? Or am I off
on the wrong track again?
From: Scott Fluhrer on

"J.D." <degolyer181(a)yahoo.com> wrote in message
news:3a2ed776-bd41-4cbd-b010-2d3affe00f3e(a)20g2000vbr.googlegroups.com...
On Apr 7, 2:49 am, unruh <un...(a)wormhole.physics.ubc.ca> wrote:

>> As you say the probability of not having the first N not collide is
>> (1-1/2^512)^N, but the probability that one will then get a collision is
>> 1/2^512, so the probability of not colliding in N-1 attempts and then
>> colliding on the Nth is [(1-1/2^512)^(N-1)]/2^512. Thus the mean number
>> is
>> SUM_N N (1-1/2^512)^(N-1) 1/2^512 =SUM_N -(d/dx) [(1-x/2^512)]^N|_(x=1) =
>> -d/dx (1/(1-(1-x/2^512)))|_(x=1) =-d/dx(2^512/x)|_(x=1) =2^512
>
> I am having a hard time following your calculations (I am not used to
> mathematical notation expressed in ASCII). However, are you
> essentially saying that if we think of a brute force preimage attack
> on a hash function as conducting a series of independent Bernoulli
> trials, then in order to have an expected value of 1 (i.e. np = 1,
> where n = no. of trials and p = probability of success for each trial)
> then if p = 2^-512 then n must = 2^512. It that correct? Or am I off
> on the wrong track again?

Well, I believe you're getting closer, but you're not quite there yet.
You're actually answering a different question that just happens to have the
same answer.

For what 'expected value' is, see
http://en.wikipedia.org/wiki/Expected_Value (which explains it better than I
can hope to, and especially look in the Examples section).

In this case, well, if we call the probability of a single query succeeding
p=2^-512, and q=1-p, then the probability of us finding a hit (that is,
something hashed to the value we're looking for) on the k-th query is:

q^(k-1) * p

We have a factor of q^(k-1) there because if we got a hit before the k'th
query, we would have stopped then, and so the k'th query would never have
happened; hence what is neccessary and sufficient is that the first k-1
queries failed, and the k'th query succeeded

The result we get if the k'th query hit is 'k' (because, in this case, we
did exactly k queries).

So, by the definition of expected value, this expected value is the infinite
sum over integer k > 0 (> 0 because we always do at least one query) of:

k * q^(k-1) * p

This is precisely the infinite sum that unruh gave, and if you evaluate it,
turns out to be 1/p, which (given our definition of p above) is the value
2^512.


--
poncho