Foiling the known-plaintext attacks [Cryptography]

Prev: How cool is this?
Next: The Winds of Change.

From: Bryan on 7 May 2010 16:19

Mok-Kong Shen wrote:
> Bryan wrote:
> > Mok-Kong Shen wrote:
> >> The main point my mine is how could one deal with the "one equation but
> >> two unknowns" issue. Mr. Maaatin, if you could answer that in place of
> >> Mr. Olson, I am grateful to you just as well.
>
> > Both Maaartin and I already answered that "issue" in the Hill-cipher
> > thread. The attacker collects enough known plaintext that the
> > equations dominate the unknowns.
>
> But istn't what you wrote entirely "vague"? What is exactly a measure
> of "dominance"? How (i.e. through which varaibles involved) does there
> come about a dominance?

What just I explained: the number of equations becomes larger than the
number of unknowns as the attacker sees more ciphertext with known
plaintext. The system is likely to be solvable as soon as the attacker
has as many equations as unknowns, but, as I just quoted Maaartin
explaining, the attacker typically wants lots of ciphertext.

> It is necessary to start from plaintext and
> ciphertext, via the equation that link them together, to trace the
> exact path of the "influence" of plaintet and ciphertext on the
> parameters of the PRNGs involved, isn't it?

So why have you not tried that already? What stops you other than your
unwillingness to put forth a serious effort?

I'll give you an example: One of the PRNG's I suggested you might look
at was a lagged Fibonacci generator. Knuth suggests one which he calls
an "additive generator" in AoCP section 3.2.2, with lags of 24 and 55.
It uses arithmetic mod 2**word_size, so it is convenient for
generating the 32-bit words of the Shen-Hill matrix. The PRNG is "not
too poor" in the sense that it passes statistical tests, so it meats
your stated requirements.

The generator has 55 words in its state, call them X_0 through X_54,
thus as the attacker we starts with 55 unknowns, and no equations. To
set up the first pair of matrices requires, if I'm reading the
proposal correctly, 16 outputs from the PRNG, called g_0 ... g_15. The
PRNG from Knuth produces these as:

g_0 = X_0 + X_31 mod 2**32
g_1 = X_1 + X_32 mod 2**32
....
g_15 = X_15 + X_46 mod 2**32

That gives us 16 new unknowns but also 16 equations, so
(number_of_equations - number_of_unknowns) is still 55.

The 16 matrix entries are used to encrypt 8 words of 32 bits, and
since the attacker has known plaintext, that gives him 8 more
equations with no more unknowns. So after the first block of 8 words,
(equations - unknowns) drops to 47.

To encrypt the next 8 words takes the next 16 outputs from the PRNG to
form Shen-Hill matrix entires, which gives us 16 new unknowns and 16
new equations. The ciphertext and known plaintext again give us 8 more
equations and no more unknowns, so after the second block of 8 words,
(equations - unknowns) is 39.

As we might expect, (equations - unknowns) drops by 8 for each 8-word
block of ciphertext with known plaintext. After 7 blocks, (equations -
unknowns) is -1, and the system is likely to be uniquely solvable.
After 100 blocks the equations outnumber the unknowns by 745, so the
system will definitely be over-constrained and solvable.

The equations describing the PRNG outputs are linear in the ring mod
2**32, and the operations in the the Shen-Hill matrix multiplication
are also mod 2**32, so all our equations are linear in same ring. The
attacker has nothing more to do than solve a system of simultaneous
linear equations.

> Could you "formally"
> demonstrate the "dominance" you meant?
>
> > Does there exist a cipher such that the number of unknowns is always
> > greater than the number of equations available to an attacker? Yes;
> > it's called the one-time pad. Your scheme is not such. Any cipher with
> > a bounded key size has a unicity distance, which is the amount of
> > ciphertext the attacker needs for the solution to be unique.
>
> > Mr. Shen, answers don't go away if you refuse to study them, and
> > they're not going to change just because you keep asking the same
> > questions over and over.
>
> It would be entirely out of my fantasy to imagine my scheme could be
> something to be compared to OTP.

Yet here you are deluding yourself into thinking the attacker can't
get enough equations to determine the unknowns.

> But the word linearity "without" any
> "qulifications" doesn't unconditionally equal to "solvability" or
> "weakness in crypto", as I wrote previously.

Have you considered trying to learn about the subject before
attempting to teach about it? Mr. Shen, you've been proposing purely
linear cryptosystems for decades.

> Every pupil learning
> elementary math in schools knows that a linear equation having two
> unknowns can't be solved or more exactly has no unique solution but
> as a rule have a large number of feasible solutions.

That's why we explained to you, over and over, that the attacker uses
additional ciphertext / known plaintext. As still quoted above, "Any
cipher with a bounded key size has a unicity distance, which is the
amount of ciphertext the attacker needs for the solution to be
unique." Did you think you have some sort of exemption?

> In the present
> case with the equation C_i = g_0 + g_1*P_i mod 2^32 g_0 and g_1
> are the unknowns and, for a particular chosen value of g_0, there is
> further an issue when P_i happens to be even. This state of affairs
> is of course trivial and well-known. But the "triviality" doesn't
> (unconditionally) translate to what one needs in the present context,
> namely to obtain g_0 and g_1, or at the very least obtain frequency
> distributions of g_0 and g_1 so as to somehow with these to attempt
> to start predicting the parameters of the PRNGs. If there is not even
> a dimmest indication of such paths to a crack, it doesn't "help" at all
> simply to "claim" that the scheme is crakable. Certainly, anything
> other than an OTP doesn't have perfect security and I would be among
> the last persons to consider/imagine the scheme to be absolutely secure.
> But the issue here is not a pure theoretical one but a practical one,
> namely to look and find where there possibly may be "concrete" paths
> (mathematical relations) that the analyst can "effectively" exploit:
> As I said, I haven't yet seen any "concrete" indications at all in
> this thread. (See also what I said concerning "dominance" above.)

Obviously, Mr. Shen, you did not see the concrete indications because
you did put forth a serious effort to understand the technical issues
people explained to you. You were offered many clues, both in this
thread and in the "Dynamic Hill cipher" thread. I suggested you start
by looking at a generator that is linear in the same field or ring as
the Shen-Hill matrix operations. We told you, repeatedly, that your
not-enough-equations argument fails as the attacker collects more
ciphertext with known plaintext.

--
--Bryan

From: Maaartin on 7 May 2010 16:35

On May 7, 10:01 pm, Mok-Kong Shen <mok-kong.s...(a)t-online.de> wrote:
> Maaartin wrote:
> > I made some progress towards cracking your cipher: I solved 10
> > randomly generated instances each using 100 ints of text, one of them
> > took three minutes and all others took less than one second. I don't
> > understand why this happens and I need to check for errors in the
> > implementation and make much more tests before I post it, so consider
> > it as a very preliminary information.
>
> I sincerely hope that there were no errors on your part. But what I
> don't yet (clearly) understand from what you wrote above is what
> you meant by "solving" in this context. I surmise that you didn't
> solve for the "parameters" of 10 different PRNGs, or did you?
>
> M. K. Shen

On May 7, 10:01 pm, Mok-Kong Shen <mok-kong.s...(a)t-online.de> wrote:
> But what I
> don't yet (clearly) understand from what you wrote above is what
> you meant by "solving" in this context. I surmise that you didn't
> solve for the "parameters" of 10 different PRNGs, or did you?

I generated the 4 int parameters and 100 ints as plaintext. I
encrypted it and gave the plaintext and the ciphertext to the solver.
The solver computed the 4 int parameters. I repeated it all 10 times.

I recommend reading the last posting by Bryan carefully, his
explanation is very nice. He made two sort of typos, it should read
"(number_of_unknowns - number_of_equations) is still 55."
and
"(unknowns - equations) drops to 47."
but it's obvious.

From: Bryan on 7 May 2010 18:17

Maaartin wrote:
> I recommend reading the last posting by Bryan carefully, his
> explanation is very nice. He made two sort of typos, it should read
> "(number_of_unknowns - number_of_equations) is still 55."
> and "(unknowns - equations) drops to 47."
> but it's obvious.

Doh! -- as Homer Simpson puts it when he does something like that.

Thanks for the correction, Maaartin. I could have proof-read that a
hundred more times and missed the reversal. I had thought about both
ways of taking that difference, then I chose one and mistakenly wrote
the other. Doh!

--
--Bryan

From: Mok-Kong Shen on 7 May 2010 18:28

Maaartin wrote:

> I generated the 4 int parameters and 100 ints as plaintext. I
> encrypted it and gave the plaintext and the ciphertext to the solver.
> The solver computed the 4 int parameters. I repeated it all 10 times.
>
>
> I recommend reading the last posting by Bryan carefully, his
> explanation is very nice. He made two sort of typos, it should read
> "(number_of_unknowns - number_of_equations) is still 55."
> and
> "(unknowns - equations) drops to 47."
> but it's obvious.

O.k. That means you are ready to accept my original offer as given
in my post of 29.04.2010 19:02, right? We could start to negotiate
a concrete contract now, i.e. discussing whether there are tiny
details that need yet to be settled.

M. K. Shen

From: Bryan on 7 May 2010 18:40

I wrote:
> Maaartin wrote:
> > I recommend reading the last posting by Bryan carefully, his
> > explanation is very nice. He made two sort of typos, it should read
> > "(number_of_unknowns - number_of_equations) is still 55."
> > and "(unknowns - equations) drops to 47."
> > but it's obvious.
>
> Doh! -- as Homer Simpson puts it when he does something like that.
>
> Thanks for the correction, Maaartin. I could have proof-read that a
> hundred more times and missed the reversal.

Which is not to say that I never recognize my own errors. In light of
Maaartin's correction, I just re-read what I wrote and saw:

| The PRNG is "not too poor" in the sense that it passes
| statistical tests, so it meats your stated requirements."

"Meats" are edible cuts of animal muscle, as sold in butcher shops and
grocery stores. Y'all know what I meant.

Doh!

--
--Bryan

First | Prev | Next | Last
Pages: 1 2 3 4 5 6 7
Prev: How cool is this?
Next: The Winds of Change.