From: Christian Baer on
On 28 Jan 2010 13:27:33 GMT Thomas Pornin wrote:

> CBC, XTS... and pretty much all other modes begin to have trouble,
> security wise, when the amount of data to encrypt reaches about
> 2^(n/2) blocks, where blocks have n bits. If you use a 64-bit block
> cipher, then that threshold is 64*2^32 bits, i.e. about 32 GB. If
> you want to encrypt data which can be counted in gigabytes, then
> you want to use a block cipher with 128-bit blocks. This means AES,
> but not Blowfish.

Would you care to explain where this threshold comes from? If it's rather
lengthy (which I kind of expect), just point me to a source where I can
read up this gap in my education. :-)

Two little annotations, maybe... :-)

1.
I chose Blowfish due to the recommendation of a cryptographer at a
CCC convention. He said he distrusted AES because it is a very
mathematical cipher and this characteristic would open doors for attack.
He explicitly recommended Blowfish if what you wanted to do didn't have to
be compatible with anything else. I know that no worthwhile attacks have
been made on Blowfish - although it is pretty old (for a cipher) and
pretty widely deployed. The size limit you stated was not mentioned.

2.
The article on disc encryption theory in the Wikipedia states that XTS
should only be considered secure when the amount of data encrypted
(meaning the file system, I guess) is not much bigger than 1 TB. The
limit you mentioned seems to be much bigger (something like 2.7E11 GB) for
128-bit ciphers.

> Apart from that, a properly used CBC mode will not be weak. But in some
> setups, it is difficult to use CBC "properly". The same goes for any
> mode.

That is a little cryptic. :-)
What do you mean with "properly" and what could make it difficult to use
CBC properly? Are we talking software or human limitations here?

> XTS was designed for problems which look like disk encryption,
> where you want to be able to process each sector independently of each
> other; with CBC, this raises the problem of IV management.

This is a bit cryptic again. Sorry. :-)

I know what IVs are and why they are needed for CBC (which is basically
the difference between ECB and CBC). I just don't quite get the connection
between CBC and XTS here.

> Then again, a proper encryption layer does not qualitatively change your
> problem. It changes it quantitatively: instead of worrying about
> confidentiality of the complete data set, you just have to worry about
> confidentiality of the secret key. The key is much shorter, and also
> more "stable" (it can be kept unchanged even if some of the data is
> modified). But the main issue is still there. Encryption has just
> "concentrated" the problem into secrecy of the key.

I know about this problem and I have thought about it too. :-)
The point is: You have to start at some end. If I have a secure system (in
terms of hardware and software), I have to worry about human failure. This
is something I can assess and work with. Assessing the security of the
hardware and software is much harder (for me), and if this part is insecure,
I don't really have to bother with securing the keys and limiting access to
them.

In general I'd say that a problem which is "concentrated" (using your word
which I think really hit the spot) is much easier to deal with than a
problem that is blurred (and with that out of focus) and spread all over
the place.

> Yet this opens some new ways. For instance, I assume that your server
> not only stores the data, but may also feed it back (in cleartext) to
> allowed clients. Therefore the "live" server must have access to the
> data. If encryption is used, this means that the server must have access
> to the secret key. Since the key is short, you may keep it in RAM only
> (if you write down the key on the server disk then you have gained
> nothing in the process). Keeping the key in RAM only protects the data
> from someone who can steal the machine but not keep it "up": apparently,
> data stored in RAM gradually loses accuracy when power is shut off, and
> about 5 minutes later very little remains of it. This may thwart _some_
> categories of attackers (those who may have only very transient physical
> access to the machine -- less than a few minutes -- and thus do not have
> time to either divert power to a portable battery, or uncover the
> machine and dump the RAM contents).

Most encryption software allows for wiping the key from RAM. This defeats
attacks with a power source or coolant spray - provided the intrusion is
detected early enough.

I'm not really sure how dangerous these coolant attacks really are. I
have read articles about how the information in RAM deteriorates rather
slowly when the power is cut off. Photos can still be made out quite well
if about 50% of the information is still intact. But does this also apply
to the keys we are talking about here? If the key in use has 256 bits,
then 50% would be 128 bits - which is still somewhat of a challenge. But
let's assume that 75% of these 256 bits are still intact, leaving 64 bits
that are now unknown. This is a challenge that could be met by today's
means. However, knowing that 75% of the key is intact is a statistical
value which doesn't say which part of the key is intact and which part
isn't. Finding that out could be a daunting task in itself.
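
Just to put a number on that last point (my own back-of-the-envelope
sketch, assuming the attacker only knows that at most 64 of the 256 key
bits decayed, but not which ones):

    from math import comb, log2

    # Candidate keys within Hamming distance 64 of the decayed 256-bit
    # key image read from RAM, when the wrong bit positions are unknown.
    candidates = sum(comb(256, i) for i in range(65))
    print(log2(candidates))   # roughly 204 bits of work - far beyond 2^64

So unless the attacker can tell which bits are still trustworthy, the
"75% intact" figure barely helps him.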

> On the other hand, having the key in RAM only means that a trusted
> operator must be present at each reboot.

This is the way it would be in my plans.

> Any weakness in the cipher, including using a 64-bit block cipher with
> too much data, is very unlikely to be the most serious problem in your
> setup.

I know that - well, I guessed it. :-)
My goal is just to secure the machine as well as I can, so I don't have to
worry about it and (only/mainly) have to worry about the human part of the
equation.

BTW. This thread is actually getting quite interesting. I hope it will
still live for a while. :-)

Regards,
Chris
From: Kristian Gjøsteen on
Christian Baer <christian.baer(a)uni-dortmund.de> wrote:
>On Thu, 28 Jan 2010 12:01:08 +0000 (UTC) Kristian Gjøsteen wrote:
>>>I have to build and install a server that will be used to store very
>>>sensitive information - medical data included. Securing this server from
>>>hackers isn't my concern here. I have to make sure that it is as hard as
>>>possible to get at the information if the whole thing is stolen.
>> AES-CBC with some minor twists will do for this threat model.
>
>What do you mean with "some minor twists"? Secret IVs? That should be a
>normal feature. Or is there something else that I missed?

You need to ensure unpredictable per-sector IVs. As far as I can
remember, geli does that. gbde also does this by storing random keys.
You could also do it by storing random IVs.

I sketched a security proof for the cryptography underlying gbde in a
2005 paper, "Security notions for disk encryption".

--
Kristian Gjøsteen
From: Kristian Gjøsteen on
Christian Baer <christian.baer(a)uni-dortmund.de> wrote:
>On 28 Jan 2010 13:27:33 GMT Thomas Pornin wrote:
>
>> CBC, XTS... and pretty much all other modes begin to have trouble,
>> security wise, when the amount of data to encrypt reaches about
>> 2^(n/2) blocks, where blocks have n bits. If you use a 64-bit block
>> cipher, then that threshold is 64*2^32 bits, i.e. about 32 GB. If
>> you want to encrypt data which can be counted in gigabytes, then
>> you want to use a block cipher with 128-bit blocks. This means AES,
>> but not Blowfish.
>
>Would you care to explain where this threshold comes from? If it's rather
>lengthy (which I kind of expect), just point me to a source where I can
>read up this gap in my education. :-)

If you get collisions, security is lost. You'll get collisions because
of the birthday paradox.

>Two little annotations, maybe... :-)
>
>1.
>I chose Blowfish due to the recommendation of a cryptographer at a
>CCC convention. He said he distrusted AES because it is a very
>mathematical cipher and this characteristic would open doors for attack.
>He explicitly recommended Blowfish if what you wanted to do didn't have to
>be compatible with anything else. I know that no worthwhile attacks have
>been made on Blowfish - although it is pretty old (for a cipher) and
>pretty widely deployed. The size limit you stated was not mentioned.

Don't use Blowfish. Don't listen to people who prefer Blowfish over AES.
And don't listen to people who regurgitate vague feelings people had ten
years ago when AES was new.

The problem with Blowfish is that it has a 64 bit block size, and that
is too small. You run into security problems after encrypting mere
gigabytes.

>2.
>The article on disc encryption theory in the Wikipedia states that XTS
>should only be considered secure when the amount of data encrypted
>(meaning the file system, I guess) is not much bigger than 1 TB. The
>limit you mentioned seems to be much bigger (something like 2.7E11 GB) for
>128-bit ciphers.

With 128 bit blocks, you have a probability of collisions of about 2^-48
after encrypting 2^40 blocks. If you encrypt as much as 2^64 blocks,
you will most likely have collisions and security will be lost.
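
(For the curious, a quick back-of-the-envelope check of those numbers,
using the usual birthday-bound approximation p ~ q^2 / 2^(n+1) for q
blocks of n bits - my own sketch, not taken from any standard:

    # Approximate probability of at least one colliding ciphertext block
    def collision_probability(q_blocks, block_bits):
        return q_blocks**2 / 2**(block_bits + 1)

    print(collision_probability(2**32,  64))   # ~0.5   : 32 GB with Blowfish
    print(collision_probability(2**40, 128))   # ~2^-49 : 16 TB with AES

which is where the "about 2^-48 after 2^40 blocks" figure comes from.)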

>> Apart from that, a properly used CBC mode will not be weak. But in some
>> setups, it is difficult to use CBC "properly". The same goes for any
>> mode.
>
>That is a little cryptic. :-)
>What do you mean with "properly" and what could make it difficult to use
>CBC properly? Are we talking software or human limitations here?
>
>> XTS was designed for problems which look like disk encryption,
>> where you want to be able to process each sector independently of each
>> other; with CBC, this raises the problem of IV management.
>
>This is a bit cryptic again. Sorry. :-)
>
>I know what IVs are and why they are needed for CBC (which is basically
>the difference between ECB and CBC). I just don't quite get the connection
>between CBC and XTS here.

CBC needs unpredictable IVs. XTS can use predictable tweaks. So for
CBC, you derive IVs from some secret and the sector number. For XTS,
you use the sector number and index within the sector.
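
A minimal sketch of the difference (my own illustration in Python, not
how geli or gbde actually implements it; the names are made up):

    import hmac, hashlib, struct

    def cbc_sector_iv(iv_secret, sector_number):
        # CBC: the IV must be unpredictable, so derive it with a keyed
        # PRF from a secret and the sector number.
        tag = hmac.new(iv_secret, struct.pack(">Q", sector_number),
                       hashlib.sha256).digest()
        return tag[:16]                       # one 128-bit block

    def xts_sector_tweak(sector_number):
        # XTS: the tweak only has to be unique, so the sector number
        # itself, padded to 16 bytes, is enough.
        return struct.pack("<Q", sector_number) + b"\x00" * 8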

--
Kristian Gjøsteen
From: Thomas Pornin on
According to Christian Baer <christian.baer(a)uni-dortmund.de>:
> Would you care to explain where this threshold comes from?

A block cipher is deterministic: with the same key and the same input
block, you get the same output block. This can leak information (with
more or less obvious consequences, often more deadly than assumed at
first glance). It so happens that "practical" data has redundancy
(unless your data is already pure random, but, let's face it, one rarely
stores gigabytes of pure random). Without such redundancy, ECB would be
fine.

CBC is a chaining mode which XORs the input block with the output of the
previous block encryption as an effort to "randomize" data. This works
well, but only up to the theoretical limit of 2^(n/2) blocks for a n-bit
block cipher. This comes from the so-called "birthday paradox": if you
take random values in a space of size N, the probability of a collision
raises gradually, then quite sharply around sqrt(N) values. CBC does a
good job at approaching that theoretical limit, but it cannot exceed it.
For N = 2^64 (64-bit blocks), the limit is around sqrt(N) = 2^32 blocks,
which means 32 GB. To make the problem less likely, one should strive to
remain much below that limit (at 32 GB, the issue becomes "frequent",
but for proper security you usually want issues to be "extremely rare"
and not merely "rather infrequent").
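
(The arithmetic, if you want to check it yourself - a small sketch of
mine, not anything from a standard:

    # Birthday threshold, expressed in bytes of plaintext
    def birthday_threshold_bytes(block_bits):
        blocks = 2 ** (block_bits // 2)      # ~sqrt of the block space
        return blocks * (block_bits // 8)    # times the block size in bytes

    print(birthday_threshold_bytes(64))      # 2^35 bytes  =  32 GB
    print(birthday_threshold_bytes(128))     # 2^68 bytes, ~3*10^20 bytes

The second figure is roughly the 2.7E11 GB you computed for 128-bit
blocks.)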

Just about all chaining modes have such issues, because the limit is
fundamental. Another way of stating it is that a block cipher is a
pseudo-random _permutation_, which is only an approximation of a
pseudo-random _function_, and the difference becomes distinguishable
when the amount of data exceeds the square root of the input space.

This is the main reason why, when the AES competition was launched,
candidates were expected to have 128-bit blocks. Otherwise, we would
still use Triple-DES.


CBC has the additional requirement that the IV must be uniformly random.
You shall not reuse an IV, but you must not use a predictable regular
sequence either, because otherwise the IV would not do a good job at
"randomizing" the first data block. Having a random IV is cumbersome in
some setups, most notably disk encryption: for disk encryption, you want
to encrypt a 512-byte block into a 512-byte block, and there is no extra
room to store the IV. This implies generating IVs from an additional
keyed PRNG, and the complexity rises. Some more recent modes, such as
XTS, can work with a non-random IV (as long as IVs are not repeated),
which becomes convenient for disk encryption, where each sector has an
address.
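
As an illustration of that convenience, encrypting one sector with
AES-XTS could look like this in Python, assuming the pyca "cryptography"
package is installed (a sketch under that assumption, not a statement
about any particular disk-encryption product):

    import os, struct
    from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes

    key = os.urandom(64)          # AES-256-XTS takes a double-length key

    def encrypt_sector(plaintext, sector_number):
        # The tweak is just the 128-bit sector address: there is no
        # per-sector IV that would have to be stored somewhere on disk.
        tweak = struct.pack("<Q", sector_number) + b"\x00" * 8
        enc = Cipher(algorithms.AES(key), modes.XTS(tweak)).encryptor()
        return enc.update(plaintext) + enc.finalize()

    ciphertext = encrypt_sector(b"\x00" * 512, 1234)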


> 1. I chose Blowfish due to recomendations of a cryptographer on a
> CCC-convention. He said he distrusted AES because it is a very
> mathematical cipher and this characteristic would open doors for
> attack. He explicitly recommended Blowfish if what you wanted to do
> didn't have to be be compatible to anything else. I know that no
> worthwhile attacks have been made on Blowfish - although it is pretty
> old (for a cipher) and pretty widely deployed. The size limit you
> stated was not mentioned.

Failure to mention the size limit implies that the cryptographer you
met did not tell you the whole story, and he actually left out the most
important part.

The "mathematical structure" of AES has been used two ways. Attempts
at attacks on AES are called "algebraic attacks" and have failed to
break the system. On the other hand, the presence of a structure means
that some security properties can be thoroughly investigated.
Blowfish is just a "big mess of bits" and the security of Blowfish
relies on something which runs like "the whole thing is messy, we
cannot state anything about it". On the face of it, the much clearer
structure of the AES looks like a more sound basis for security.

(As usual, with encryption you want two things:
1. not to be attacked
2. to be able to sleep at night
Point number 2 requires some level of trust in the algorithm robustness,
and personally I find it difficult to trust an algorithm which is so
messy that it bored researchers away. I really prefer an algorithm where
smart mathematicians are able to say "this is robust and I know why",
rather than "this may be robust because for the life of me I cannot
fathom anything from it".)


Now if you really want to stick with Blowfish, you may consider its
brother Twofish, from the same authors but with an enlarged block size.
Twofish was a candidate for the AES and received much more scrutiny. It
is also quite a bit faster than Blowfish.

(At some point you might notice that the key schedule of both Blowfish and
Twofish is inherently very slow. This may or may not be a problem in
your application.)


A final argument for AES is that the newest Intel x86 CPUs have special
opcodes for computing AES in hardware (the "AES-NI" instruction set),
and performance basically rocks. With any other block cipher, you are on
your own. Then again, depending on the envisioned bandwidth, this may or
may not be a problem. A modern PC with modern disks has enough juice to
keep up with disk bandwidth using only one of its CPU cores, with a
software-implemented AES or Twofish. With faster disks (e.g. RAID6 or
the like) you may need more cores. This is rarely an issue; it takes a
quite specific situation to make encryption a bottleneck. But if that
happens and you used AES, then AES-NI may save your day, and you will
thank me.


> I'm not really sure how dangerous these coolant attacks really are. I
> have read articles about how the information in RAM deteriorates rather
> slowly when the power is cut off. Photos can still be made out quite well
> if about 50% of the information is still intact. But does this also apply
> to the keys we are talking about here?

As far as I have read, after one minute or so you can still get about
98% of the data. That's quite sufficient for an attack. Moreover, the PC
can be uncovered while still up and running, so the attacker could
extract the RAM modules and insert them into a custom reader in much less
than a minute.

(Note: with AES-NI, the key, after key schedule, is kept only in the XMM
registers, which are within the CPU; but registers are regularly flushed
to RAM, because this is an integral part of how multi-tasking operating
systems succeed in running several applications concurrently. So the key
will make it to the main RAM, whatever you do.)


> > On the other hand, having the key in RAM only means that a trusted
> > operator must be present at each reboot.
>
> This is the way it would be in my plans.

Note that beyond the implied cost (humans are not free), this also has
implications on the level of physical armoring around the machine. If
nobody has to come to the machine under normal conditions, then you can
put it in a cage in a bunker, guarded by grumpy pitbulls. Security
is much easier when attackers cannot access the machine physically.


--Thomas Pornin
From: Thomas Pornin on
According to Christian Baer <christian.baer(a)uni-dortmund.de>:
> With your reference to ECB, I'm guessing you are talking about watermarking.

I am not, unless you use the term "watermarking" with a meaning which I
am not aware of. As far as I know, "watermarking" designates some
techniques to embed a hidden mark into some data, such that the mark
resists alterations to the data, such as lossy compression, while not
impacting the qualitative aspects of the marked data. I fail to see how
this relates to ECB.


> Actually, I thought that 3DES had pretty much come to the end of its
> life because of the many other flaws there were (key length, weak keys,
> very slow performance etc.) with the longer block size really only being
> a bonus. But I can't say that I am really well informed in that field.

It is the other way round. The 64-bit block size was the main reason why
a new standard was needed. As for speed, the call for candidates to the
AES competition stated that candidates should not be slower than 3DES,
so the poor software performance of 3DES was a minor concern; for most
applications, especially on big software platforms (read: a PC),
performance is not a big issue and 3DES was fast enough. That most AES
candidates turned out to be much faster than 3DES was considered to be a
bonus.

Weak keys (and pairs of related keys) are not a concern as long as you use
3DES for encryption and not as, for instance, a building block for a hash
function. Note that the same applies to Rijndael, which was not evaluated
for use as compression function in a hash function (the Whirlpool hash
function uses a Rijndael-derivative as compression function, but with a
much altered key schedule precisely for that reason).

3DES uses a nominally 192-bit key. 24 of those bits are not used, so the
key is really a 168-bit key. Also, there is a rather generic attack
which can theoretically break it in 2^112 steps, assuming that you have
a few billions of gigabytes of very fast RAM, something which seems
technologically doable, but not cheaply. With a regulation-oriented
mind, you can then declare 3DES to offer "112 bits of security" and
that's exactly what NIST does. 112 bits are not bad at all. AES was
required to use a 128-bit key because 128 is more "round" in binary, and
cryptographers are no less superstitious than anybody, so they love
round numbers. This relates more to marketing than science, but half of
making a secure cryptosystem is to create trust, and trust requires
proper marketing.


> Which leads me back to my initial question:
> Does XTS actually improve the security compared to CBC?

XTS improves security over CBC in the following sense: XTS only requires
distinct IVs, whereas CBC requires distinct _and_ uniformly randomized
IVs. In some setups it is hard to have the kind of IV that CBC requires,
a prime example being harddisk encryption. "Hard" means "the implementer
will have to resort to additional trickery", making the system more
complex to analyze and extending the scope of bugs.

It does not mean that CBC is weak; only that it is easier to get XTS
"right".


> One of the things that made me doubt AES partly was the stuff I read
> while the finalists were being analysed. Twofish, Serpent and MARS
> were considered being highly secure, while RC6 and Rijndael were
> "only" sufficiently secure. I know I'm a little paranoid at times, but
> this looked a bit like people leaving an option for a back door to me.

This is a failure of marketing. When together, cryptographers do not
speak English but crypto-English, a scientific idiom in which some words
have a subtly different meaning. The point that cryptographers wanted to
make was that the 5 finalists, and actually most of the 10 other
systems, looked as robust as could be wished for. But they also wanted
to nitpick a bit, because that's what experts are paid for. The stuff
you read was some of that nitpicking, expressed in crypto-English.

In the end, NIST chose Rijndael mostly because it could be implemented
efficiently (and compactly) on all platforms, with very decent
performance, and some less measurable bonus, including the mathematical
structure (Rijndael is not a big mess of bits in which backdoors could
lurk, we kind of know why it is secure) and the political advantage of
choosing a candidate from Belgium (NIST knows that the all-American
history of the birth of DES, with the NSA as a godmother, proved to be a
potent fuel for all kinds of paranoia, and paranoia means bad marketing).


> Just out of curiosity, I'd like to know if you have any feelings
> about Camellia.

I have not looked at it closely. But it was investigated during the
NESSIE project, and apparently it remained unscathed. Anyway, I do not
trust myself to decide whether an algorithm looks good or not. I rather
trust the collective mind of hundreds of cryptographers, and this is why
I trust AES: many smart people studied it, and none of them found an
actual weakness (although some found ways to express security
quantitatively, which is good because it means that the problem could be
studied).


> Define "newest", please. Are you referring to the i5 and/or i7?

I refer to the subset of the i5/i7 which have the 32nm foundry. As
usual, names under which CPUs are sold do not capture the wide
variety of models. For details see:
http://en.wikipedia.org/wiki/AES-NI


> Currently, this doesn't change much for me because I don't have any
> Intel CPUs in use.

Unfortunately, computers sometimes die and must then be changed. Using
AES for the sake of AES-NI is a way to ease _future_ usage, when you
will buy a new computer. It may make the data transfer easier (simply
plug the disks in the new machine). Depending on your situation and why
you want to encrypt data, this may or may not be an important feature.


> I'd say it would be much better to keep the key in the RAM of the
> machine but in a protected area so it doesn't get moved about.

My point is that a plain computer _does not_ have such a protected area.
To some extent, the contents of the CPU registers could be deemed to be
a protected place, but they are flushed to RAM often enough that this
does not hold.

There are people who sell "protected areas" for computers. These are called
HSMs, short for "Hardware Security Module". For pointers, see:
http://en.wikipedia.org/wiki/Hardware_security_module
HSMs are expensive, but that is somewhat intrinsic to what an HSM is trying
to achieve. Physical resistance _and_ good performance, that's not cheap.


> Most encryption software has an option to wipe the key from the
> memory.

Your running system must know the key as long as it accesses the data,
so any kind of "wiping out" is not an option in the attack model where
the attacker can physically approach the running system.


> The machine won't be in a bunker or the like but it will be locked
> away. It could also help if finding the machine is a bit of a
> challenge. Meaning, a machine in a cage where everyone can see it
> might make it clear where the crown jewels are. :-)

Making the machine inconspicuous is a bit like code obfuscation: it adds
some security, but it is very difficult to know how much security it
adds. When I design a system, I prefer to think about how rich the
attacker is, rather than about how smart he is, because wealth is easily
measured (e.g. in dollars), whereas smartness is not. A big cage is
visible, but there is ample data which helps estimate the kind of
resource that the attacker will need to break into it (mankind has been
building cages for quite a long time). It is much harder to quantify how
much "hidden" a computer is.


--Thomas Pornin