From: Paul Rubin on
Paul Rubin <no.email(a)nospam.invalid> writes:
> I have an idea for an improvement, that I"ll try to work out and write
> up later.

OK, how does this sound:

You have n accounts, numbered 1...n. You want to be able to verify
logins, to let the user change their own password, and to lock the
account if there are too many unsuccessful login attempts. You want
validation to depend on a hardware token (successful validation can
issue a credential signed by the token), and validation and password
change should both resist database replay attacks (i.e. if someone
changes their password, then restoring an old copy of the database
shouldn't re-enable the old password). The unsuccessful login lockout
should also resist database replay. That means that every password
change or unsuccessful login should change some internal state in the
hardware token, which has limited internal memory.

So let's store the hashes in a database, with rows 1...n. Each row will
(conceptually) contain a unique key for that row (generated by the
token), a password hash with IV, and a login failure count, plus
possible timestamps and things like that if desired, with the whole row
encrypted and MAC'd by the token. The encryption and MAC keys are
derived from the unique key for the row (the "row key"). The row key is
derived as explained further down. The database must be able to
atomically update O(log n) rows in a single transaction. That can be
done with a typical SQL database, or with a functional data structure
like an AVL tree serialized to disk, or whatever. The db also has to be
able to undo its most recent committed transaction (just keep a record
of the values before the update).

We can view the encrypted row keys as a binary tree, with uid 1 at the
root and uid's 2k-1 and 2k as the children of uid k. uid k's key
encrypts the keys of both its children. uid 1's key is encrypted by a
master key M inside the hardware token. So decrypting uid k's key
requires using the complete chain of keys starting at the token's master
key down to uid k, i.e. it can be encapsulated inside the token and
requires log2(k) decryptions. Updating the row key for uid k (replacing
the key with a new random one) requires updating its sibling (uid (k xor
1)) and the complete chain of parent keys and their siblings,
i.e. 2*log2(k) operations. Also, k's two children have to be rewritten
to migrate the encryption of their (unchanged) row keys to k's new key.

When a person wants to log in to a given uid U and presents a password
to check, do the following:

1. The application reads the chain of db entries for the uid and its
children, parent, grandparent, etc. up to the root (db[2*U], db[2*U-1],
db[U], db[U//2], db[U//4], ..., db[1]) and the siblings of each of these
nodes. It sends this chain to the hardware token (in the case of tokens
with small memory like cheap smart cards, this can be sent and processed
as multiple smaller messages--they don't have to all be in memory at
once, just a few chaining keys).

2. The token uses its master key M to decrypt the chain of row keys
starting at the root. It generates new row keys for each row in the
chain, re-encrypting and MAC'ing all the requisite fields and making an
updated version of each row (call this new chain the "update list").
The token examines (and verifies the MAC) of the failure count for uid
U, and if it's already too high, the login attempt fails. If it's not
too high, the token checks the password, and uses the password to
generate the new row for U. If the password is incorrect, the token
increments the failure count by 1 in U's slot of the update list. If
the password is correct, the failure count is set to zero. Row 1 of the
update list's row key is encrypted by a newly generated key M2, created
and saved in the token. The token then sends the update list back to
the host and asks it to update the db by atomically replacing all those
rows. The host db is now in a state indicating whether U's password
check has failed or not, but (because U's row is encrypted by the token)
the host can't yet tell which of these conditions applies. (The token
may also emit an encrypted audit record saying what has happened).

3. The host now tells the token that it has committed the update. After
receiving this notice, the token does its own commit by internally
replacing M with M2. In the success case, the token sends the host a
login credential; and in the failure case, it informs the host of the
failure. Either way, the host doesn't learn of the password's success
or failure, until after the failure count is committed and synchronized
in the db and token.

4. If the application fails to commit (such as by crashing) it can ask
the token to roll back its side of the transaction. The rollback is ok
since the token has not yet given a credential, so the host hasn't
learned anything, and it can try again without penalty. (There is a
possible issue that the host may have given the -correct- password and
gotten the failure count reset without getting a credential, which might
evade some auditing generated when issuing credentials--I don't know if
it's worth making some adjustments to deal with this, especially if the
token is also generating some other audit trail).

5. If the application commits but the token crashes before it can do its
own commit, the application may have to undo its last transaction, which
it can do by keeping the old data around temporarily.

6. Password change is basically similar to the above.

7. In a high security (e.g. financial) application, the version of M
that the token uses for normal login has no capabilities other than
described above. Resetting a lost password can require a separate copy
of M (made within the token) with extra capabilities, accessible only
through a blob secured by another key, e.g. on a smart card held by a
customer service representative, or by a special role on the token, etc.

8. It may be possible to save some space in the rows by not storing the
uid in them, since the token always knows which row it is examining or
encrypting. The row number can instead be included in the key
derivation process. That means if someone tries to substitute one row
for another, the MAC will still fail even though there is no uid under
the MAC.

I could imagine this being useful in something like an OpenID server,
where the password check and the relying service are run by separate
entities, and a password break could affect many such services. I may
try coding it. It is similar to (and somewhat inspired by) something I
did for secure erasure a while back:

http://groups.google.com/group/sci.crypt/browse_thread/thread/56bee118fd127d01/8ce7eb156bc6085d
From: Kristian Gj�steen on
Paul Rubin <no.email(a)nospam.invalid> wrote:
>Paul Rubin <no.email(a)nospam.invalid> writes:
>> I have an idea for an improvement, that I"ll try to work out and write
>> up later.
>
>OK, how does this sound:

>[secure external storage used for password storage]

Could you store the password hashes only as leaf nodes in your tree?
That might reduce the token's cryptographic workload, especially if your
per-password record are largish (several times the key size).

>I could imagine this being useful in something like an OpenID server,
>where the password check and the relying service are run by separate
>entities, and a password break could affect many such services.

Is the following correct? The idea is to protect the users against
compromise of the OpenID server, by essentially removing the password
database from the OpenID server and placing it in secure hardware.
When the OpenID server is compromised, anyone logging in will have their
password compromised, but anyone who doesn't log in will not have their
password compromised.

Also, the credential is important. The relying party should be able
to verify the credential, but not forge it. If this holds, the relying
party can use the credential as evidence that the token was involved in
the login process.

--
Kristian Gj�steen
From: Paul Rubin on
Kristian Gjøsteen <kristiag+news(a)math.ntnu.no> writes:
> Could you store the password hashes only as leaf nodes in your tree?
> That might reduce the token's cryptographic workload, especially if your
> per-password record are largish (several times the key size).

Sure, that is reasonable. The hashes don't have to be in the tree at
all. Only the row keys really have to be there. But, I think that
except for the smallest tokens, the cryptographic workload is not that
large to begin with. 2 * lg n * the record size = perhaps a few hundred
primitive crypto operations.

> Is the following correct? The idea is to protect the users against
> compromise of the OpenID server, by essentially removing the password
> database from the OpenID server and placing it in secure hardware.
> When the OpenID server is compromised, anyone logging in will have their
> password compromised, but anyone who doesn't log in will not have their
> password compromised.

Yes, good point, I was thinking more about offline attacks using a
captured host, than compromising the host but keeping it running. To
protect passwords of people logging in, the token could have its own SSL
stack and handle the login/password dialog internally, using the host as
encrypted external storage.

For that matter, though, maybe tokens now available have enough internal
persistent storage (multi-GB flash chips are cheap) to just hold the
whole database. Then it doesn't need the tree structure or anything
like that. I don't know what's happening with hardware tokens these
days. Last time I had anything to do with them, the fancier ones were
getting to be almost like PC's, which didn't seem like such a good
thing. Maybe they're even more like that now.

> Also, the credential is important. The relying party should be able
> to verify the credential, but not forge it. If this holds, the relying
> party can use the credential as evidence that the token was involved in
> the login process.

Also a good point. If the credential has a public-key signature from
the token, thogh, that's probably more cryptographic workload than all
the symmetric stuff described earlier.
From: Kristian Gj�steen on
Paul Rubin <no.email(a)nospam.invalid> wrote:
>If the credential has a public-key signature from
>the token, thogh, that's probably more cryptographic workload than all
>the symmetric stuff described earlier.

For sure.

I find that when you start playing with a light-weight idea like this,
you start adding new features and suddenly you need an HSM rather than
a light-weight token.

--
Kristian Gj�steen