From: Mr. B on
Ryan Chan wrote:

> http://en.wikipedia.org/wiki/Message_authentication_code
>
> Most Linux ISO download sites seem to give an MD5 checksum of the ISO
> file as a way to validate the integrity of the file. Why can't people
> call a hash function (e.g. MD5, no key, no salt) a MAC?

It is not a MAC because it does not authenticate anything -- you have no
guarantee that the ISO you downloaded was not tampered with by a third
party, who could easily have computed the MD5 sum of their modified ISO.
The hash is only given to you so that you can verify that your download was
not corrupted while being transported, but alone it cannot be used to verify
that the ISO came from the people you expected it to come from.
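
To make that concrete, here is a minimal Python sketch. The byte strings and
the shared key are made-up stand-ins, and HMAC-SHA256 is only one possible
choice of MAC. It shows that anyone can recompute an unkeyed checksum for a
modified file, while a keyed tag cannot be reproduced without the key.

```python
import hashlib
import hmac

# Hypothetical stand-ins for the real and the tampered ISO contents.
original = b"contents of the real ISO"
tampered = b"contents of a trojaned ISO"

# Unkeyed hash: an attacker who can replace the file can also replace
# the published checksum, because MD5 needs no secret to compute.
published_sum = hashlib.md5(tampered).hexdigest()  # attacker's doing
hash_check_passes = hashlib.md5(tampered).hexdigest() == published_sum

# Keyed MAC: the tag depends on a key the attacker does not know
# (assumed here to be shared in advance by publisher and user).
key = b"secret shared by publisher and user"
attacker_tag = hmac.new(b"attacker's guess", tampered, hashlib.sha256).digest()
expected_tag = hmac.new(key, tampered, hashlib.sha256).digest()
mac_check_passes = hmac.compare_digest(expected_tag, attacker_tag)

print("unkeyed hash accepts tampered file:", hash_check_passes)
print("MAC accepts tampered file:", mac_check_passes)
```

The hash check accepts the tampered file because the attacker supplied both
halves of the comparison. The MAC check fails because the attacker could not
compute the right tag.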

-- B
From: Tom St Denis on
On Jun 8, 8:25 am, Bryan <bryanjugglercryptograp...(a)yahoo.com> wrote:
> > If you know with 100% certainty that the hash was delivered intact
> > then you have guaranteed integrity, it's still not authenticity.
>
> That's not what jbriggs444 said. If you trust the hash is *authentic*,
> then you can use it to authenticate the message.

This is akin to transmitting a hash over, say, a TLS connection. You
know that the hash made it there intact. The thing is, the real
authenticity comes from the MAC provided by TLS, not the hash itself.
It's one of those Zen moment things.

> I'm kind of uneasy about the integrity/authenticity distinction. It
> seems to mislead people more than it clarifies anything. The important
> factor is the adversarial environment. If we ignore the adversarial
> element, the goal is simply robustness in the face of communication
> errors and authenticity does not come up.

The point of the distinction is the type of security it provides.
Integrity protects against unintentional errors, whereas authenticity
protects against intentional errors [attempted forgeries]. The point
of an algorithm that provides integrity is to protect against
[usually] bit and burst errors. That's why things like Reed-Solomon
(RS) and CRC codes are good enough. The point of an algorithm that
provides authenticity is to protect against people who are trying to
forge messages, so things like linear codes just won't do.
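
A small sketch of why linearity is fatal against a forger, using Python's
zlib.crc32 as the example code (the messages and the helper names are made up
for illustration). CRC-32 is affine over GF(2), so for equal-length inputs
crc(m XOR d) = crc(m) XOR crc(d) XOR crc(zeros). Someone who knows which bits
they want to flip can fix up the checksum without knowing any secret.

```python
import zlib

def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))

def forge_crc(known_crc: int, delta: bytes) -> int:
    """CRC-32 of (message XOR delta), computed from the old CRC alone."""
    zeros = bytes(len(delta))
    return known_crc ^ zlib.crc32(delta) ^ zlib.crc32(zeros)

# Hypothetical message and the bit flips the forger wants to apply.
message = b"PAY 0100 TO ALICE"
delta = xor_bytes(b"PAY 0100 TO ALICE", b"PAY 9999 TO MALLO")

forged_message = xor_bytes(message, delta)
forged_crc = forge_crc(zlib.crc32(message), delta)

# The forged checksum is exactly what an honest receiver would compute.
crc_forgery_works = zlib.crc32(forged_message) == forged_crc
print(forged_message, crc_forgery_works)
```

A CRC still catches random bit and burst errors well. It just offers nothing
against someone choosing the errors on purpose.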

The way I keep it distinct in my head is to think "intact vs.
authentic."

> In the face of possible enemy action, integrity and authenticity are
> inseparable. I don't much care that my friend sent a message if the
> version I receive was changed by the enemy. I don't care if a message
> was delivered intact if it might actually have been forged by the enemy.

Authentic usually implies intact, but not the reverse. You can have a
perfectly well-maintained forgery of an antique lamp, for example. It
is intact, has fidelity, etc., whereas a good authentic lamp has to be
both from the supposed designer and intact.

> That's not how I read it. The OP seemed to be citing the use of
> cryptographic hashes in open-source file distribution, and asking why
> other applications do not use a hash in place of a MAC. I think
> there's a serious misconception there.

I read it the other way around. Hmm.

> Why do sites such as SourceForge provide cryptographic hash digests of
> the distribution files they host? I know of two good reasons: One is
> that the hash can check against random communication errors. The other
> is to allow users to download the distribution files from less
> trustworthy sites, then use the digest from SourceForge to verify that
> they got the same thing.
>
> Here's the scary part: Many otherwise smart people download the
> distribution files and the cryptographic digests from *the same* site,
> and when they check the digests they think they're doing something to
> verify security. I suspect the OP question was based on just such a
> misconception.

Many distros also have GPG signatures. The MD5SUM files are [as you
pointed out] only there to catch transmission errors.
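
For the second use quoted above (fetch the file from a mirror, take the digest
from a site you trust), the check might look roughly like this in Python. The
function names are made up, SHA-256 is my choice rather than anything a
particular distro mandates, and the in-memory "files" stand in for real
downloads. The whole thing is only as good as the channel the trusted digest
arrived over.

```python
import hashlib
import hmac
import io

def stream_digest(fileobj, algorithm="sha256", chunk_size=1 << 20):
    """Hash a binary file-like object in chunks and return the hex digest."""
    h = hashlib.new(algorithm)
    for chunk in iter(lambda: fileobj.read(chunk_size), b""):
        h.update(chunk)
    return h.hexdigest()

def matches_trusted_digest(fileobj, trusted_hex, algorithm="sha256"):
    """Compare the computed digest with one obtained from a trusted source."""
    computed = stream_digest(fileobj, algorithm)
    return hmac.compare_digest(computed, trusted_hex.strip().lower())

# Stand-in for the digest copied from the trusted site.
trusted = hashlib.sha256(b"release tarball bytes").hexdigest()

# Stand-ins for files fetched from an untrusted mirror.
good = matches_trusted_digest(io.BytesIO(b"release tarball bytes"), trusted)
bad = matches_trusted_digest(io.BytesIO(b"release tarball bytes, modified"), trusted)
print(good, bad)
```

If the digest and the file come from the same untrusted place, this check
degrades to the transmission-error case.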

> Also, this thread should probably note that in 2004 Xiaoyun Wang,
> Dengguo Feng, Xuejia Lai, and Hongbo Yu broke MD5. The attacks have
> gotten better since.

Those attacks are totally irrelevant when the hash only serves as a
highly effective integrity check. Heck, MD4 would do as well for that
purpose [and it's faster].

Oddly enough, MD4 would do for HMAC too, but you won't see many
people rushing off to promote that idea ... :-)

Tom