From: Lasse Reichstein Nielsen on
Johannes Baagoe <baagoe(a)baagoe.com> writes:

> Suppose the very worst case : you know absolutely nothing about the
> plaintext, except that it is supposed to be read and understood by humans.
>
> What simple property would still quite decisively set it apart from the
> decryption with a wrong key ? How could you test that property by program ?

If it's text - then that's actually quite a hint. The plaintext will
then be in the same encoding (likely ASCII, UTF-8 or UC16), which is
easily detected as diffent from random bits.

That's why you should always zip before you encrypt.
On the other hand, if the receiver knows that you are zipping, he can
just unzip too.

/L
--
Lasse Reichstein Holst Nielsen
'Javascript frameworks is a disruptive technology'

From: Garrett Smith on
Mike Duffy wrote:
> Johannes Baagoe <baagoe(a)baagoe.com> wrote in
> news:UKadnXUPDY5TpnHWnZ2dnUVZ8mdi4p2d(a)giganews.com:
>
>> Garrett Smith :
[...]
>
> Of course, you do not need javascript; just tell your friends to go to
>
> http://what.ever.com/secret_xyz.html and don't tell anyone else.

A resource that is merely hidden does not have restricted access by
javascript or otherwise.

Jonannes Baagoe and Stockton are discussing encrypting strings and using
javascript to run decrpytion algorithms. That is a bit different. The
technique should be applicable also to anything with data uri, such as
images.

For purpose of the FAQ entry, I have shifted the focus on javascript
being used to restrict access to a web resource.

Discussions of cryptography are interesting but seem a bit too much for
the FAQ. If anyone feels otherwise, and feels that he is qualified to
draft an entry about that (I am not), then state so and do so.
--
Garrett
comp.lang.javascript FAQ: http://jibbering.com/faq/
From: nick on
On May 13, 4:41 pm, Johannes Baagoe <baa...(a)baagoe.com> wrote:

> The question is now: is it sufficiently interesting to justify the
> rather boring task of turning into a proper application ?
> A very small one, admittedly, but it still requires work.

Quite interesting, if I understand it right. Seems to be a vast
improvement over this sort of technique...

http://www.evolt.org/ajax_login_system

From: Ry Nohryb on
On May 13, 10:41 pm, Johannes Baagoe <baa...(a)baagoe.com> wrote:
> Ry Nohryb :
>
> > I'd need to know a way to validate decrypt(crypt, key). Unless
> > encrypt(decrypt(crypt, wrongKey), wrongkey) !== crypt, and,
> > encrypt(decrypt(crypt, theRightKey), theRightKey) === crypt.
>
> There is an easier way: decryption with a wrong key will always
> return pseudo-random numbers, that is, with an approximately uniform
> distribution. Whereas humanly readable text will always have quite
> distinctive structure, often more or less according to Zipf's law,
> and always characteristic of the relevant encoding, be it ASCII, UTF-8,
> ISO 8859 or whatever. So even in the extraordinary case where we can't
> guess the encoding, it is simply a matter of using a goodness-of-fit
> test of the observed byte frequencies against the null hypothesis
> that the distribution is uniform, which will be the case unless we
> have found the key. Chi-square serves well and is quite fast.

How would you write that test ?

I've done one that checks that the percentage of "human chars" in a
given text is above a threshold %, and it seems to work quite well :

http://gist.github.com/400957

var humanityAcum= 0;
var humanityCount= 0;
var thisHumanity= 0;
var maxHumanity= 0;
var minHumanity= 1;

function isHumanTxt (txt) {
var humanChars= "abcdefghijklmnopqrstuvwxyz0123456789.,-><+-[]
¿?!'\"()= *";
var txt= txt.replace(/\r\n/g, "").toLowerCase();
var i= txt.length;
var ctr= 0;
while (i--) ctr+= (humanChars.indexOf(txt[i]) >= 0);
humanityAcum+= (thisHumanity= ctr/txt.length);
if (thisHumanity > maxHumanity) maxHumanity= thisHumanity;
if (thisHumanity < minHumanity) minHumanity= thisHumanity;
humanityCount++;
return (thisHumanity) > 0.95;
}

> (Of course, that won't work if the plain text has been compressed,
> e.g., by LZW, *before* being encrypted - always a sensible thing to
> do. But inspection of the code shows that it is not the case.)

Of course.

> > BTW, the rate I told you (180KHz) was per 30 seconds, not per second.
>
> OK, that means it would take more than thousand years to try all
> combinations - at present speed, which of course doesn't make sense.
>
> But anyway - brute force is out, at least with ordinary means.

Except for short passwords.

> And if AES is seriously flawed, it would be a major sensation, all the
> very competent people who have given it their approval, both academics
> and in the NSA and other agencies, would look very foolish indeed -
> I don't think so.
>
> I am slightly less happy with the mode of operation. CTR would not be
> my choice, but I was too lazy to rewrite the function in CBC or OCB,
> and anyway, it is probably good enough. I should check the details,
> though, therein lies the devil.
>
> By and large, I believe I have proved my point. Here is the key:
>
>   Riekei1e (the next to last character is a one, not an el)

In my test that would have been the vector [4,49,35,48,35,49,54,35] ,
or the test # 4 + (49*62) + (35*62^2) + (48*62^3) + (35*62^4) +
(49*62^5) + (54*62^6) + (35*62^7) = 126.369.143.196.670 if starting
from [0,0,0,0,0,0,0,0] and going up. IOW, 667 years @ 6000 keys per
second. LOL.

> The question is now: is it sufficiently interesting to justify the
> rather boring task of turning into a proper application ?
> A very small one, admittedly, but it still requires work.

No. The idea is clear enough. There's not much else to say or write or
prove about it, I think.
--
Jorge.
From: Johannes Baagoe on
Johannes Baagoe :

>> it is simply a matter of using a goodness-of-fit test of the
>> observed byte frequencies against the null hypothesis that the
>> distribution is uniform, which will be the case unless we have
>> found the key. Chi-square serves well and is quite fast.

Ry Nohryb :

> How would you write that test ?

In the specific case involved, `decrypt` returns a String, whose
charCodeAt method returns Numbers in [0, 65535] for any i between
0 and the length of the String minus 1. (In most other languages,
one would rather count bytes than unsigned 16-bit char codes.)

So we initialise an Array (let us call it `counts`) of 65536 zeros,
and loop over the decrypted String counting character codes :

for (var i = plain.length - 1; i >= 0; i--) {
counts[plain.charCodeAt(i)]++;
}

(In real applications, this would be done incrementally, as the
decryption progresses. One may also stop before it is ended, once
a sufficiently large substring for the test has been decrypted.)

If the count distribution is close to uniform, each count will be
approximately equal to the length divided by 65536. Let us call that
value `expected`, and calculate Pearson's chi-square statistic :

var expected = plain.length / 65536;
var chi2 = 0;
for (var i = 0; i < 65536; i++) {
var diff = counts[i] - expected;
chi2 += diff * diff / expected;
}

In the case of an unsuccessful decryption, the result will be
reasonably close to 65536. On successful decryption, it will be *much*
higher, say, above 70000 - the precise threshold that should ring
a bell depends on how much time you are prepared to spend in human
inspection of false positives vs. how much you want to be sure not to
miss the real solution. It also depends on the encoding, for ASCII,
you may be quite sure that all the counts above 127 will be 0, and
choose a corresponding very high test value - or a simpler test.
(A chi2 much *lower* that 65536, say, less than 60000, is either
a bug or a miracle.)

Scrupulous minds will note that Pearson's test may not be appropriate,
if for not nothing else, then because `expected` will be less than
5 for short plaintexts. In actual practice, it doesn't matter -
the test discriminates quite nicely, and it is faster than more
theoretically reputable ones.

--
Johannes