From: Maaartin on
On Jun 14, 2:50 am, g...(a)nope.ucsd.edu (Greg Rose) wrote:
> Not correct I'm afraid. The input block to MD5 is
> 64 bytes.

For whatever reason I assumed it was 16 bytes, even while I was
looking at the source working with 16 ints. What I said obviously
can't apply to 16 bytes inputs.

> The padding consists of a byte 0x80
> (actually that's really just one bit), then a
> bunch of zeros, until the last 64 bits, which will
> have the length (128) in bits of the input, in
> some order which I can't remember at the moment.
> There is only one call to the compression
> function.

Sure. What I said would apply to fixed input size of 64 bytes, right?

> >I know there was a similar optimization for short messages in Luffa,
> >which was removed when advanced to the second round of SHA3 (see
> >Reason4Mod.pdf in
> >http://www.sdl.hitachi.co.jp/crypto/luffa/Luffa_v2_Round2Package_2009...).
>
> I can't comment on Luffa without going to look at
> it in more detail. Clearly something is different.

It probably doesn't matter at all, since many things are different. It
was just an example for me that such an optimization may fail.


On Jun 13, 11:04 pm, Maaartin <grajc...(a)seznam.cz> wrote:
> > > 3. Compute x ^ aes(k, x), where k is a fixed key. This is probably
> > > faster then computing md5(x) and I have no idea how secure it is.
>
> > It would be a blowout to AES if that was breakable. On machines with
> > hardware AES support (AESENC..) this is going to be pretty fast.
>
> Again, I'd love to see a proof. I'd suppose the previous case is way
> easier to prove, but even that is too hard for me now.

Isn't it just like the Matyas–Meyer–Oseas schema (with H[i-1] = k)?
However, I haven't found any proof for it.
From: Francois Grieu on
On 13/06/2010 23:04, Maaartin wrote
> On Jun 13, 7:12 pm, Francois Grieu<fgrieu(a)gmail.com> wrote:
>> On 13/06/2010 18:09, Maaartin wrote:
>>> 2. Compute x ^ aes(x, x). I don't know if this is faster then
>>> computing md5(x) and have no idea how secure it is.
>>
>> It would be a blowout to AES if that was breakable, even with x ^ removed.
> Could you prove it or give me good hints how to do it?
>
>>> 3. Compute x ^ aes(k, x), where k is a fixed key. This is probably
>>> faster then computing md5(x) and I have no idea how secure it is.
>>
>> It would be a blowout to AES if that was breakable. On machines with
>> hardware AES support (AESENC..) this is going to be pretty fast.
>
> Again, I'd love to see a proof. I'd suppose the previous case is way
> easier to prove, but even that is too hard for me now.

I think these can be proven rigorously under the random oracle model.

The idea is that you model AES for a fixed key as like a "random oracle"; that is, a resource which, given a 128-bit input plus one bit for cipher/decipher, returns a pair (plaintext,ciphertext) which is:
- for cipher, either a previously returned pair if there is one which plaintext matches the input, else a pair (input,output) with output picked at random among the 128-bit values not previously returned as the ciphertext in a returned pair;
- for decipher, either a previously returned pair if there is one which ciphertext matches the input, else a pair (output,input) with output picked at random among the 128-bit values not previously returned as the plaintext in a returned pair.

The odds that after n queries to such oracle an adversary with unlimited memory and computing power can have obtained from the oracle a (plaintext,ciphertext) pair such that plaintext^ciphertext = y for some arbitrary y can be rigorously computed. Any strategy that submits input such that neither input was a plaintext not input^y was a ciphertext is optimal. I'm too lazy to do the math now, but guess that odds that a match is found in 2^j queries is less than 2^(j-126).

An adversary using aes (cipher and decipher) as a black box (not considering the cipher's internal), even if she has otherwise unlimited memory and computing power, is bound to odds of success worse than 2^(j-126) if she uses that blackbox 2^j times to solve y = x^aes(k,x)


Francois Grieu
From: Francois Grieu on
[reposted with yet another correction]
On 13/06/2010 23:04, Maaartin wrote
> On Jun 13, 7:12 pm, Francois Grieu<fgrieu(a)gmail.com> wrote:
>> On 13/06/2010 18:09, Maaartin wrote:
>>> 2. Compute x ^ aes(x, x). I don't know if this is faster then
>>> computing md5(x) and have no idea how secure it is.
>>
>> It would be a blowout to AES if that was breakable, even with x ^ removed.
> Could you prove it or give me good hints how to do it?
>
>>> 3. Compute x ^ aes(k, x), where k is a fixed key. This is probably
>>> faster then computing md5(x) and I have no idea how secure it is.
>>
>> It would be a blowout to AES if that was breakable. On machines with
>> hardware AES support (AESENC..) this is going to be pretty fast.
>
> Again, I'd love to see a proof. I'd suppose the previous case is way
> easier to prove, but even that is too hard for me now.

I think these can be proven rigorously under the random oracle model.

The idea is that you model AES for a fixed key as like a "random
oracle"; that is, a resource which, given a 128-bit input plus one bit
for cipher/decipher, returns a pair (plaintext,ciphertext) which is:
- for cipher, either a previously returned pair if there is one which
plaintext matches the input, else a pair (input,output) with output
picked at random among the 128-bit values not previously returned as the
ciphertext in a returned pair;
- for decipher, either a previously returned pair if there is one which
ciphertext matches the input, else a pair (output,input) with output
picked at random among the 128-bit values not previously returned as the
plaintext in a returned pair.

The odds that after some number of queries to such oracle an adversary
with unlimited memory and computing power can have obtained from the
oracle a (plaintext,ciphertext) pair such that plaintext^ciphertext = y
for some arbitrary y can be rigorously computed. A strategy is optimal
iff it submits input+cipher such that neither input was a plaintext nor
input^y was a ciphertext, and/or input+decipher such that neither
input^y was a plaintext nor input was a ciphertext. I'm too lazy to do
the math now, but guess that odds that a match is found in 2^j queries
is less than 2^(j-126).

With a minor complication, this translates to similarly low odds of
success for an adversary using aes (cipher and decipher) as a black box
(not considering the cipher's internal), even if she has otherwise
unlimited memory and computing power, when trying to somve y = x
^aes(k,x) for known k. The complication is that we must restrict to y
such that at least one solution exists.


Francois Grieu
From: Mok-Kong Shen on
Maaartin wrote:
> I wonder if there's a more efficient way to compute a hash when I
> know, that all messages are of given length, say 16 bytes.[snip]

Certainly I don't understand your goal. Normally one hashes something
to something shorter. You have very short length to start with. To how
many bytes you want to hash or do you want a bijective mapping (in
which case any established block cipher would naturally provide such a
one but there might be other possibilities, I suppose)?

M. K. Shen

From: Tom St Denis on
On Jun 13, 9:43 pm, Maaartin <grajc...(a)seznam.cz> wrote:
> On Jun 14, 2:50 am, g...(a)nope.ucsd.edu (Greg Rose) wrote:
>
> > Not correct I'm afraid. The input block to MD5 is
> > 64 bytes.
>
> For whatever reason I assumed it was 16 bytes, even while I was
> looking at the source working with 16 ints. What I said obviously
> can't apply to 16 bytes inputs.

MD2 works on 16 byte blocks iirc.

MD5 has no expansion btw. It works on blocks of 64 bytes [16 32-bit
words] where it just permutes the selection of words for each round.
You actually have 55 bytes of useful payload before MD5 requires a 2nd
block.

You can also optimize MD5 quite a bit if you have fixed size inputs
[specially so small]. 10 out of the 16 reads from the traditional W[]
array [array of words for the block] are zero, so you can remove them
(and their related ops in the code). You also know the final padding
block so you don't need to compute/store/etc it. You also don't need
to present a traditional hash interface (init/process/done) since the
message is fixed length, so there is no overhead in gathering bytes
and/or double buffering, etc.

I wouldn't be surprised if a dedicated 16-byte version of MD5 ran
considerably faster than your standard off the shelf MD5 code.

But back to the AES route ...

H(x) = k ^ aes(x, k)

Where 'x' is the key and k is some fixed value is actually what these
class of hashes are (sha1, md5, etc...). You can create a generic
hash by saying

S_{-1} = some_known_value
S_{i} = S_{i-1} xor AES(M_i, S_{i-1})

Where you MD pad M appropriately (e.g. 1 bit followed by enough bits
to get block + length encoded).

So if you are hashing fixed 16-byte strings you could just pick some
random 128-bit value call that S_{-1} and ignore padding [as Greg
pointed out] and just compute S_0. I don't know if it's secure to use
a fixed key (for a hash that is), I certainly wouldn't recommend it.

One caveat though, the AES key schedule is actually vulnerable to
differential attacks [related keys] which actually makes AES bad for
this style of hashing.

Tom