Hashing of short fixed length messages [Cryptography]

Prev: JSH: Why being right isn't enough
Next: Double encryption method

From: Maaartin on 15 Jun 2010 02:07

On Jun 14, 8:47 am, Francois Grieu <fgr...(a)gmail.com> wrote:
> I think these can be proven rigorously under the random oracle model.

Your proof looks simple, but I need some time for it...

On Jun 14, 12:24 pm, Mok-Kong Shen <mok-kong.s...(a)t-online.de> wrote:
> Maaartin wrote:
> > I wonder if there's a more efficient way to compute a hash when I
> > know, that all messages are of given length, say 16 bytes.[snip]
>
> Certainly I don't understand your goal. Normally one hashes something
> to something shorter. You have very short length to start with. To how
> many bytes you want to hash or do you want a bijective mapping (in
> which case any established block cipher would naturally provide such a
> one but there might be other possibilities, I suppose)?

The input is a secret key used to encrypt some data. From the key I
need to obtain something usable as an id for the data. That's why I
only need the fixed sized input and why I'm only interested in first
preimage resistance.

On Jun 14, 1:47 pm, Tom St Denis <t...(a)iahu.ca> wrote:
> > For whatever reason I assumed it was 16 bytes, even while I was
> > looking at the source working with 16 ints. What I said obviously
> > can't apply to 16 bytes inputs.
>
> MD2 works on 16 byte blocks iirc.

This was not what confused me. I really don't know how did I get it
wrong, but it happens again and again.

> MD5 has no expansion btw. It works on blocks of 64 bytes [16 32-bit
> words] where it just permutes the selection of words for each round.
> You actually have 55 bytes of useful payload before MD5 requires a 2nd
> block.

Agreed (1 byte 0x80 and 8 bytes length eats from the input block).

> You can also optimize MD5 quite a bit if you have fixed size inputs
> [specially so small]. 10 out of the 16 reads from the traditional W[]
> array [array of words for the block] are zero, so you can remove them
> (and their related ops in the code). You also know the final padding
> block so you don't need to compute/store/etc it. You also don't need
> to present a traditional hash interface (init/process/done) since the
> message is fixed length, so there is no overhead in gathering bytes
> and/or double buffering, etc.
>
> I wouldn't be surprised if a dedicated 16-byte version of MD5 ran
> considerably faster than your standard off the shelf MD5 code.

The nice thing is I can verify the result easily. The bad thing is
that it eliminates quite a few instructions but not much cycles on a
superscalar processor.

> But back to the AES route ...
>
> H(x) = k ^ aes(x, k)
>
> Where 'x' is the key and k is some fixed value is actually what these
> class of hashes are (sha1, md5, etc...). You can create a generic
> hash by saying
>
> S_{-1} = some_known_value
> S_{i} = S_{i-1} xor AES(M_i, S_{i-1})
>
> Where you MD pad M appropriately (e.g. 1 bit followed by enough bits
> to get block + length encoded).
>
> So if you are hashing fixed 16-byte strings you could just pick some
> random 128-bit value call that S_{-1} and ignore padding [as Greg
> pointed out] and just compute S_0. I don't know if it's secure to use
> a fixed key (for a hash that is), I certainly wouldn't recommend it.
>
> One caveat though, the AES key schedule is actually vulnerable to
> differential attacks [related keys] which actually makes AES bad for
> this style of hashing.

The input itself is a hash too, so I can ignore related key attack,
right?

On Jun 14, 7:24 pm, Paul Rubin <no.em...(a)nospam.invalid> wrote:
> Can I ask what the application is, where speed is so important? If it's
> for passwords or key diversification, slow speed may even be a virtue.

This operation will be very common in the application, but I'm not
sure, if speed is important here. I'll know, when I get it running.
Slow speed can't be good here, it's nothing like password
strengthening. Most probably, I'll use some stronger and slower hash,
but I'm curious what can be done here.

> 0. Compute md5(x). AFAIK, MD5 is still secure w.r.t. the preimage,
> isn't it?
>
> Probably ok in practice, but MD5 is deprecated and if it's for a
> high-security application, I'd stick to something standard. Note MD5
> does have a theoretical preimage attack (Sasaki and Aoki, Eurocrypt
> 2009) which is not practical, but attacks only get better as they say.

The paper has got stolen by spinger/acm.

> 1. Take an algorithm like MD5 and remove the padding step. This leads
> to a speed-up factor of two, but I don't think it's secure.
>
> You mean just use the compression function once. Given MD5's problems I
> can't help wonder if the second compression isn't more useful than it
> it would be if the compression function worked as intended.

The compression function gets used only once for input sizes up to 55
bytes, anyway. I confused the numbers.

> 2. Compute x ^ aes(x, x). I don't know if this is faster then
> computing md5(x) and have no idea how secure it is.
>
> If AES is an ideal cipher, this should be ok. Assuming only that AES is
> a PRP, though I don't see a way to turn a break against it into a break
> against AES.
>
> 3. Compute x ^ aes(k, x), where k is a fixed key. This is probably
> faster then computing md5(x) and I have no idea how secure it is.
>
> If k is secret, then fine, but then it's not a hash function, since you
> have to keep the key secret. With reasonable constraints on the amount
> of hashes available to the attacker, You don't even need the (x ^)
> because of the PRP-PRF switching lemma. If k is public, you're in
> situation 2 again.

I can't keep k secret, it's to be shared among a group of people.

On Jun 14, 7:40 pm, Tom St Denis <t...(a)iahu.ca> wrote:
> > Found it:
>
> >http://www.cosic.esat.kuleuven.be/publications/article-48.ps
>
> > Turns out #7 from Table 3 is
>
> > H_i = H_{i-1} xor M_i xor E_{M_i}(H_{i-1})
>
> > Which is close to the OPs model. Where H_{i-1} might as well be zero,
> > the hash H = x xor AES(x, x) should in theory be secure provided x is
> > <= 16 bytes in length.
>
> Actually sorry #9 is closest (multitasking fail) where H_{i-1} can't
> be zero (but can be anything else).
>
> So
>
> H = x xor AES(x xor 0xDEADBEEF, x);
>
> Would be fine. Or for the more sensitive in the crowd
>
> H = x xor AES(x xor 0xEA71EAF, x);
>
> Should do :-)

I understand "dead beef", but what is 0xEA71EAF?

Unique mapping my original model onto schemas from the paper is not
possible, as I have no H[i-1].

Taking H[i-1]=0, my version 2 (x ^ aes(x, x)) is like schema 4, 9, 11,
29, 31, 37, or 39. All the schemata are susceptible to forward,
backward, permutation, or fixed point attack, but none of the attacks
makes sense when doing only a single step.

I don't think that setting H[i-1] to a non-zero constant makes any
difference, why should
H(x) = x ^ aes(x ^ 0xDEADBEEF, x);
be any more secure than
H(x) = x ^ aes(x, x);
? IMHO, what's important in the classificiation in the paper is how
the chaining value gets handelt; without the chaining value many
schemata became equivalent.

However, my version 3 (x ^ aes(k, x)) is like schema 3 (Matyas-Meyer-
Oseas) from the paper, right? According the the table it's secure and
it's faster then version 2.

According to what's written in paragraph "permutation attack" on page
3, x ^ encrypt(k, x) is a one-way function; no reason for this is
given.

From: Francois Grieu on 15 Jun 2010 02:54

On 14/06/2010 19:32, Tom St Denis wrote:
> On Jun 14, 1:24 pm, Paul Rubin wrote:
>> Maaartin <grajcar1(a)seznam.cz> writes:
>>> I wonder if there's a more efficient way to compute a hash when I
>>> know, that all messages are of given length, say 16 bytes. I need
>>> only
>>> (first) preimage resistance, nothing more. I'd like somebody to
>>> comment on the following possibilities:
(..)
>>> 2. Compute x ^ aes(x, x). I don't know if this is faster then
>>> computing md5(x) and have no idea how secure it is.
>>
>> If AES is an ideal cipher, this should be ok. Assuming only that
>> AES is a PRP, though I don't see a way to turn a break against it
>> into a break against AES.
>
> There was a eurocrypt paper that surveyed 64 permutations of turning
> a cipher into a hash. Of all of them but 16 are trivially weak, of
> those I think only a few are ideally strong [assuming a good PRP]. I
> totally forget the year or authors or title ... but it's in the 90s
> somewhere iirc.
>
> AFAIK S = x xor cipher_x(x) is not one of the secure hash functions.

then

> Found it:
<http://www.cosic.esat.kuleuven.be/publications/article-48.ps>

"Hash functions based on block ciphers: a synthetic approach"
Bart Preneel, Ren� Govaerts, and Joos Vandewalle, CRYPTO 1993.

> Turns out #7 from Table 3..
> actually sorry #9 is closest (multitasking fail) where H_{i-1}
> can't be zero (but can be anything else).

I disagree on "can't be zero". The above article consider a long
message, while the the thread is about "short fixed length messages", of
size no more than the key or block size of the block cipher.

> So
>
> H = x xor AES(x xor 0xDEADBEEF, x);
>
> Would be fine. Or for the more sensitive in the crowd
>
> H = x xor AES(x xor 0xEA71EAF, x);
>
> Should do :-)

In the random oracle model,
H = AES(x , x) and
H = x XOR AES(x XOR 0xEA71EAF, x)
have equivalent security: the left "x XOR" applies a key-dependent
reversible mapping to the ciphertext, and the right "XOR 0xEA71EAF"
applies a reversible mapping on the key; such changes turns a perfect
random oracle into another equally perfect random oracle.

These XOR may improve or worsen the security of an imperfect cipher,
depending on the imperfection. For example, one could slightly
strengthen a weak cipher by
BetterCipher(k,x) = WeakCipher(k XOR 0xEA71EAF,x) XOR k XOR 0xEA71EAF
but then
H = x XOR BetterCipher(x XOR 0xEA71EAF, x)
is actually
H = WeakCipher(x,x)

The security of
H = x XOR AES(k, x) with k a fixed constant (e.g. zero)
may be like one binary order of magnitude suboptimal, but it is still
high and provable in the random oracle model. If someone has a doubt,
I'll get a true proof.

Francois Grieu

From: Francois Grieu on 15 Jun 2010 05:10

[reposted with additions]
On 14/06/2010 19:32, Tom St Denis wrote:
> On Jun 14, 1:24 pm, Paul Rubin wrote:
>> Maaartin <grajcar1(a)seznam.cz> writes:
>>> I wonder if there's a more efficient way to compute a hash when I
>>> know, that all messages are of given length, say 16 bytes. I need
>>> only
>>> (first) preimage resistance, nothing more. I'd like somebody to
>>> comment on the following possibilities:
(..)
>>> 2. Compute x ^ aes(x, x). I don't know if this is faster then
>>> computing md5(x) and have no idea how secure it is.
>>
>> If AES is an ideal cipher, this should be ok. Assuming only that
>> AES is a PRP, though I don't see a way to turn a break against it
>> into a break against AES.
>
> There was a eurocrypt paper that surveyed 64 permutations of turning
> a cipher into a hash. Of all of them but 16 are trivially weak, of
> those I think only a few are ideally strong [assuming a good PRP]. I
> totally forget the year or authors or title ... but it's in the 90s
> somewhere iirc.
>
> AFAIK S = x xor cipher_x(x) is not one of the secure hash functions.

then

> Found it:
<http://www.cosic.esat.kuleuven.be/publications/article-48.ps>

"Hash functions based on block ciphers: a synthetic approach"
Bart Preneel, Ren� Govaerts, and Joos Vandewalle, CRYPTO 1993.

> Turns out #7 from Table 3..
> actually sorry #9 is closest (multitasking fail) where H_{i-1}
> can't be zero (but can be anything else).

I disagree on "can't be zero". The above article considers a long
message, while the thread is about "short fixed length messages",
of size no more than the key or block size of the block cipher.

> So
>
> H = x xor AES(x xor 0xDEADBEEF, x);
>
> Would be fine. Or for the more sensitive in the crowd
>
> H = x xor AES(x xor 0xEA71EAF, x);
>
> Should do :-)

In the random oracle model, for ciphertext = AES(key, plaintext),
the three functions x -> H(x):
H = AES(x , 0)
H = AES(x , x)
H = x XOR AES(x XOR 0xEA71EAF, x)
have equivalent security: the left "x XOR" applies a key-dependent
reversible mapping to the ciphertext, the middle "XOR 0xEA71EAF"
applies a reversible mapping to the key, and the rightmost x can be
regarded as key-dependent reversible mapping of plaintext 0.
Such changes turns a perfect random oracle into another equally
perfect random oracle.
When your consider the values H(x) for a number of distinct x, they
are random and independent.

These XOR may improve or worsen the security of an imperfect cipher,
depending on the imperfection. For example, one perhaps could slightly
strengthen a weak cipher by defining
BetterCipher(k,x) = k XOR 0xEA71EAF XOR
WeakCipher(k XOR 0xEA71EAF, x XOR k XOR 0xEA71EAF)
but then
H = x XOR BetterCipher(x XOR 0xEA71EAF, x)
is actually
H = WeakCipher(x,0)

The security of
H = x XOR AES(k, x) with k a fixed constant (e.g. zero)
may be like one binary order of magnitude suboptimal, but it is still
high and provable in the random oracle model. If someone has a doubt,
I'll get a true proof.

Francois Grieu

From: Tom St Denis on 15 Jun 2010 06:54

On Jun 15, 2:54 am, Francois Grieu <fgr...(a)gmail.com> wrote:
> > Turns out #7 from Table 3..
> > actually sorry #9 is closest (multitasking fail) where H_{i-1}
> > can't be zero (but can be anything else).
>
> I disagree on "can't be zero". The above article consider a long
> message, while the the thread is about "short fixed length messages", of
> size no more than the key or block size of the block cipher.

I think the presumption is that H_{-1} is not zero, so in #9 you can't
have an unmodified key.

> In the random oracle model,
> H = AES(x , x) and
> H = x XOR AES(x XOR 0xEA71EAF, x)
> have equivalent security: the left "x XOR" applies a key-dependent
> reversible mapping to the ciphertext, and the right "XOR 0xEA71EAF"
> applies a reversible mapping on the key; such changes turns a perfect
> random oracle into another equally perfect random oracle.

I'd think for a single block it's probably ok, the point I was making
is that #9 doesn't have a non-zero key modifier.

Tom

From: Greg Rose on 15 Jun 2010 13:40

In article <485a0ba3-f470-4689-b6c1-2b80e3f51abf(a)w31g2000yqb.googlegroups.com>,
Maaartin <grajcar1(a)seznam.cz> wrote:
>> H = x xor AES(x xor 0xDEADBEEF, x);
>>
>> Would be fine. Or for the more sensitive in the crowd
>>
>> H = x xor AES(x xor 0xEA71EAF, x);
>>
>> Should do :-)
>
>I understand "dead beef", but what is 0xEA71EAF?

"eat leaf" -- vegetarian version.

Greg.
--

First | Prev | Next | Last
Pages: 1 2 3 4 5 6 7 8
Prev: JSH: Why being right isn't enough
Next: Double encryption method