From: Keith F. Lynch on
Francois Grieu <fgrieu(a)gmail.com> wrote:
> It is not a far-fetched scenario that you receive/have a few files,
> some interesting enough that you keep all of them without examining
> each in detail, and they become part of what you encrypt. One of
> these files could be all-zeroes, perhaps by accident, or for the
> purpose of attacking your cryptosystem (if you communicate with your
> adversaries, or if your adversaries manage to modify a message sent
> by a friend). Just a possibility, but one that must be considered
> in a general-purpose encryption tool.

I guess mine doesn't count as general-purpose. I never encrypt any
file without a good reason. For instance, if it consists of an email
message that the sender asks me to keep confidential.

> The restriction "up to the length of the file" does not apply if
> the file is a bit longer than the key (say 16 bytes); in that case,
> after recovering S up to the length of the file, the key itself can
> easily be recovered using a classic technique that Paul Rubin was
> the first to propose in this thread [solving a system of 100 linear
> equations of 100 unknown, e.g. by plain Gaussian elimination].

This isn't clear to me. Is the claim that if I give a hundred random
bitstrings of more than a hundred bits each, and also give a bitstring
produced by exclusive-oring some subset of them together, that there's
an efficient way of determining which subset I used?

I'll agree that there's probably a unique solution. But I can't
see any way of finding it without expending a tremendous amount of
computer power. For instance you could simply try all 2^100 possible
subsets. A dedicated circuit might be able to test a billion per
second, in which case testing all of them would take about 40 trillion
years.

A more practical approach would be a "meet in the middle" attack.
Calculate the 2^50 possible subsets of the first 50, and store them
in sorted order. That would take about two weeks (if you can write
to disk fast enough) and about a million dollars worth of disk space.
Then calculate each of the 2^50 possible subsets of the last 50,
exclusive-ored with the target string, and check each of them for
a match with one of the stored strings. Doable, but not trivial.
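For concreteness, here is a toy sketch of that meet-in-the-middle idea
(my own small-scale illustration: 16 strings of 24 bits instead of 100
longer ones, and Python's seeded PRNG standing in for the random
bitstrings):

```python
# Meet-in-the-middle on the subset-XOR problem: tabulate all XORs of
# subsets of the first half, then for each subset of the second half
# look up (target ^ partial) in the table.  Cost is ~2^(n/2) per side
# instead of 2^n overall.
from itertools import combinations
import random

random.seed(1)
n, nbits = 16, 24                       # toy sizes; the post uses n=100
strings = [random.getrandbits(nbits) for _ in range(n)]

# A secret subset, and the XOR the attacker gets to see.
secret = {1, 4, 7, 10, 13}
target = 0
for i in secret:
    target ^= strings[i]

half = n // 2

# Phase 1: store the XOR of every subset of the first half.
table = {}
for r in range(half + 1):
    for combo in combinations(range(half), r):
        v = 0
        for i in combo:
            v ^= strings[i]
        table.setdefault(v, combo)      # 2^(n/2) entries, not 2^n

# Phase 2: for each subset of the second half, check whether
# target ^ (its XOR) appears in the table.
found = None
for r in range(half + 1):
    if found is not None:
        break
    for combo in combinations(range(half, n), r):
        v = target
        for i in combo:
            v ^= strings[i]
        if v in table:
            found = set(table[v]) | set(combo)
            break
```

At these toy sizes the table lookup replaces sorted-disk storage; the
recovered subset XORs to the target, and with enough bits per string it
is almost surely the secret subset itself.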

(I'm not using random bitstrings, but the square roots of the first
hundred primes. But unless there's something unexpectedly special
about those numbers, that doesn't gain you anything.)

Suppose I had used addition instead of exclusive or. Then wouldn't
it have been the knapsack problem, which is known to be NP-complete?
Does using exclusive or really make it enormously easier?

> In the description of the complete cryptosystem that you gave, you
> are never stirring the plaintext, so using that particular plaintext
> does not simply lead to a break.

Oh, you're right. So there's nothing wrong with a message written in
a two-character alphabet. Not that such a message is likely.

> My attack with the all-zeroes file (a so-called chosen plaintext
> attack) happens to work around that obstacle, but (at least for now)
> I fail to find anything working with any other chosen plaintext,
> much less known plaintext.

All DELs? Not that such a file is likely either.

> Since the key is the same for all files, you indeed are vulnerable
> to one of the files encrypted consisting only of zeroes; and again
> it needs NOT be big (16 bytes is ample to recover the key).

Perhaps, but this doesn't generalize to a file that starts with lots
of zeroes but then contains other text after them.

> I bought a 16-faced dice with the intention of using it for Really
> Important keys entered in hexadecimal. That did not survive
> practice.

Why not? And how is a 16-faced die made, anyway? There's no regular
solid with exactly that number of faces. You could use an icosahedron
with faces numbered 0 through 19, and re-roll whenever any of 16
through 19 come up.
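That re-roll scheme is ordinary rejection sampling, and can be sketched
like this (my own illustration, with Python's random module standing in
for the physical icosahedron):

```python
# Re-roll sampling: a d20 gives a uniform value 0..19; discarding
# 16..19 leaves a uniform hex digit 0..15.  Each accepted roll yields
# 4 unbiased bits.
import random

def random_hex_digit(rng=random):
    """Roll a simulated d20 until the face is 0..15; return that face."""
    while True:
        roll = rng.randrange(20)        # faces numbered 0 through 19
        if roll <= 15:                  # keep; otherwise re-roll
            return roll

def random_hex_key(digits=32, rng=random):
    """Build a hex key of the given length (32 digits = 128 bits)."""
    return "".join("0123456789abcdef"[random_hex_digit(rng)]
                   for _ in range(digits))
```

On average 20/16 = 1.25 rolls are needed per digit, and the accepted
digits are exactly uniform, which simple truncation (roll mod 16) would
not be.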

Digitizing white noise would seem to be a good way to generate lots of
random bits in a hurry.
--
Keith F. Lynch - http://keithlynch.net/
Please see http://keithlynch.net/email.html before emailing me.
From: Francois Grieu on
On 08/06/2010 02:47, Keith F. Lynch wrote:
> Francois Grieu <fgrieu(a)gmail.com> wrote:
>> The restriction "up to the length of the file" does not apply if
>> the file is a bit longer than the key (say 16 bytes); in that case,
>> after recovering S up to the length of the file, the key itself can
>> easily be recovered using a classic technique that Paul Rubin was
>> the first to propose in this thread [solving a system of 100 linear
>> equations of 100 unknown, e.g. by plain Gaussian elimination].

> This isn't clear to me. Is the claim that if I give a hundred random
> bitstrings of more than a hundred bits each, and also give a bitstring
> produced by exclusive-oring some subset of them together, that there's
> an efficient way of determining which subset I used?

Precisely so. In the following I assume n key bits (n=100) and m bits
in each of the n bitstrings (m>n, say m=110). The unknown key bits are
k[p], the n known bitstrings are b[p][q], and the known exclusive-or of
the subset of bitstrings is s[q], with 0<=p<n and 0<=q<m.

We have that, for each q,
s[q] = ( k[0] & b[0][q] ) ^ ( k[1] & b[1][q] ) ^ ... ^ ( k[n-1] & b[n-1][q] )

This forms a system of m equations with n unknowns.

> I'll agree that there's probably a unique solution. But I can't
> see any way of finding it without expending a tremendous amount of
> computer power. For instance you could simply try all 2^100 possible
> subsets. A dedicated circuit might be able to test a billion per
> second, in which case testing all of them would take about 40 trillion
> years.

The critical thing is that the system of equations defined above is linear. We can make an analogy between the field of real numbers with the usual operations of addition and multiplication (R,+,*) and the field of binary values with the operations XOR and AND ({0,1},^,&), also known as GF(2).

The system can be solved very efficiently by Gaussian elimination; it is even easier than for real numbers, because there is no issue with numerical stability.
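As a sketch (my own toy encoding, not code from the thread: each
equation is packed into a Python integer, with bit p holding the
coefficient of k[p] and bit n the right-hand side s[q]), the system and
its solution by Gaussian elimination over GF(2) look like this:

```python
# Recovering the key bits k[p] from s[q] = XOR over p of (k[p] & b[p][q])
# by Gauss-Jordan elimination over GF(2).  XOR plays the role of both
# addition and subtraction, and every nonzero pivot is 1, so there is
# no numerical-stability issue.
import random

random.seed(0)
n, m = 100, 110                         # n unknown key bits, m equations
key = [random.getrandbits(1) for _ in range(n)]
b = [[random.getrandbits(1) for _ in range(m)] for _ in range(n)]

# What the attacker observes: s[q] = XOR over p of (key[p] & b[p][q]).
s = [0] * m
for q in range(m):
    for p in range(n):
        s[q] ^= key[p] & b[p][q]

# Augmented rows: low n bits are coefficients, bit n is the RHS.
rows = []
for q in range(m):
    r = s[q] << n
    for p in range(n):
        r |= b[p][q] << p
    rows.append(r)

# Gauss-Jordan elimination: row operations are just integer XORs.
piv = 0
pivot_of = [-1] * n
for col in range(n):
    sel = next((i for i in range(piv, m) if (rows[i] >> col) & 1), None)
    if sel is None:
        continue                        # free column (no pivot)
    rows[piv], rows[sel] = rows[sel], rows[piv]
    for i in range(m):
        if i != piv and (rows[i] >> col) & 1:
            rows[i] ^= rows[piv]
    pivot_of[col] = piv
    piv += 1

# Read off a solution (free variables, if any, set to 0).  With m > n
# random equations the system is almost surely full rank, so this is
# the unique solution, i.e. the key itself.
recovered = [(rows[pivot_of[p]] >> n) & 1 if pivot_of[p] >= 0 else 0
             for p in range(n)]
```

The whole attack is a few thousand integer XORs: polynomial work, versus
the exponential cost of subset search.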

> A more practical approach would be a "meet in the middle" attack.

The first reply in the thread, by Scott Fluhrer, made that remark, and
I tried to improve on it. Then Paul Rubin noticed that the system is
linear, and Scott Fluhrer and I recognized we should have seen it.

> Suppose I had used addition instead of exclusive or. Then wouldn't
> it have been the knapsack problem, which is known to be NP-complete?
> Does using exclusive or really make it enormously easier?

Yes.

(..)

>> I bought a 16-faced dice with the intention of using it for Really
>> Important keys entered in hexadecimal. That did not survive
>> practice.

> Why not?

One good reason is that I often need a lot of randomness (I am a
security professional), so throwing dice and keying in the result is
tedious. Smart cards include a random number generator (originally a
pseudo-RNG, nowadays a true RNG) that does the job in a snap. Nowadays
it is security certified, and that is acceptable to all my customers,
whereas some may (IMHO wrongly) object to the
dice+eye+finger+keyboard combination.

> And how is a 16-faced die made, anyway?

Two identical pyramids with regular octagonal bases, joined at the
octagons.


Francois Grieu