From: Andrew on
Hi!

I hope I can get close to the Sun and avoid the flames... ;)

First of all, I have read and fully understand numerous postings on
obfuscation. Also, I understand and mostly agree with this:
http://jibbering.com/faq/obfuscate.html
I have also personally successfully decoded JS that was encoded with
http://www.javascript-source.com/, http://www.stunnix.com/ and
http://www.semdesigns.com/Products/LanguageTools/ECMAScriptTools.html.
I haven't even checked the numerous others I have found because they
look even worse. So I know how futile the resistance is. ;)

BUT: I am looking for something that would NOT (only) do the basic
comment stripping / name renaming, but would also change the (apparent,
of course) flow of the program. Something that would convert:
-----
sqrt=function(x) {
return(x*x);
}
-----
to something like this:
-----
b=function(v) {
return(a(-v));
}
a=function(x) {
if (x<4)
{
a=x+3;
b=x-7;
c=(x%a)+4;
return((b-c)*(b-c))
}
else
return(b(v));
}
-----
(no, I haven't written any unit tests or indeed run it, so it might
not be correct - but you get the idea)

I don't care much about the size of the JS file or about the execution
time (though it must be O(n), of course). To unscramble such code you
would in general need an execution optimizer, which is far from
trivial. Or you could spend a LOT of time on that 30k (original size)
js file to decode it. :)

Are there any tools out there that are capable of doing this? Writing
such code by hand is... well, difficult. ;) But writing such tools
should be doable. Has anyone found something like this yet?

Please: refrain from "don't do it" posts. I just want to know what can
be done.

Thanks!
From: Jorge on
On Dec 22, 12:04 pm, Andrew <anzen...(a)volja.net> wrote:
> (...)
> BUT: I am looking for something that would NOT (only) do the basic
> comments stripping / names renaming, but also changed the (apparent,
> of course) flow of the program. Sth. that would convert:
> -----
> sqrt=function(x) {
>   return(x*x);}
>
> -----
> to sth. like this:
> -----
> b=function(v) {
>   return(a(-v));}
>
> a=function(x) {
>   if (x<4)
>   {
>     a=x+3;
>     b=x-7;
>     c=(x%a)+4;
>     return((b-c)*(b-c))
>   }
>   else
>     return(b(v));}
>
> -----
> (no, I haven't written any unit tests or indeed run it, so it might
> not be correct - but you get the idea)

Awesome idea (!). scrambler(code) ought to be !== scrambler(code) :
the output ought to be as random as possible.
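
A minimal sketch of that property, not a real scrambler: the hypothetical
helper scrambleSquare below only knows how to emit variants of x*x, using
the identity (x + k) * (x - k) + k * k === x * x with a randomly chosen
integer k. The helper name and the range of k are assumptions made for
illustration.

```javascript
// Hypothetical helper: returns source text for a function equivalent to x*x.
// It relies on the identity (x + k) * (x - k) + k * k === x * x.
function scrambleSquare(random) {
  var k = 1 + Math.floor(random() * 1000); // random integer in 1..1000
  return 'function (x) { return (x + ' + k + ') * (x - ' + k + ') + ' +
    (k * k) + '; }';
}

// Two runs over the "same code" will usually produce different text.
var first = scrambleSquare(Math.random);
var second = scrambleSquare(Math.random);

// Compile the generated text into callable functions.
var f = new Function('return ' + first)();
var g = new Function('return ' + second)();
```

For integer inputs in a modest range both f and g return exactly x*x; for
non-integer inputs the results can differ from x*x by floating-point
rounding.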
--
Jorge.
From: Scott Sauyet on
On Dec 22, 6:04 am, Andrew <anzen...(a)volja.net> wrote:
> BUT: I am looking for something that would NOT (only) do the basic
> comments stripping / names renaming, but also changed the (apparent,
> of course) flow of the program. Sth. that would convert:
> -----
> sqrt=function(x) {
>   return(x*x);}
>
> -----
> to sth. like this:
> -----
> b=function(v) {
>   return(a(-v));}
>
> a=function(x) {
>   if (x<4)
>   {
>     a=x+3;
>     b=x-7;
>     c=(x%a)+4;
>     return((b-c)*(b-c))
>   }
>   else
>     return(b(v));}
>
> -----
> (no, I haven't written any unit tests or indeed run it, so it might
> not be correct - but you get the idea)

Others can try to tell you how wrong-headed the idea seems (and I
agree with them). What I want to point out is just how difficult the
process would be. First of all, your source transformations would
presumably need to be one-way, or someone could simply reverse them to
get your code; even if the exact sequence of transformations applied
was randomized, I'm guessing it would be relatively easy to
iteratively test some reversals to see if they simplify the code,
repeating until the code seems simple enough.
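
A toy sketch of that loop, assuming text length is an acceptable stand-in
for "simpler": the reversals list and the simplify function are
hypothetical, and the regular-expression rewrites are only safe for this
tiny example. A real tool would presumably work on a syntax tree rather
than on raw text.

```javascript
// Hypothetical candidate reversals: each maps source text to simpler text
// when its pattern applies, and returns the text unchanged otherwise.
var reversals = [
  function (s) { return s.replace(/-\(-(\w+)\)/g, '$1'); },    // -(-v)   -> v
  function (s) { return s.replace(/\((\w+) \+ 0\)/g, '$1'); }, // (v + 0) -> v
  function (s) { return s.replace(/\((\w+) \* 1\)/g, '$1'); }  // (v * 1) -> v
];

// Keep applying any reversal that shortens the text until none does.
function simplify(source) {
  var current = source;
  var improved = true;
  while (improved) {
    improved = false;
    for (var i = 0; i < reversals.length; i++) {
      var candidate = reversals[i](current);
      if (candidate.length < current.length) {
        current = candidate;
        improved = true;
      }
    }
  }
  return current;
}

var simplified = simplify('-(-(x + 0)) * (x * 1)'); // 'x * x'
```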

And then testing that you haven't broken anything would likely be a
nightmare. The transformed code would almost certainly have its own
artifacts that need to be checked for edge cases.

But you can't even supply a simple working example here. When I try
b(10) in the above, I get 324. Imagine what would go into making
something that's general enough to be useful and could be tested well
enough that it's unlikely to break the scripts it transforms.
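
For reference, the 324 can be reproduced as follows. This is the quoted
example with explicit var declarations added (an assumption, so that it
also runs where implicit globals are not allowed); the variable result is
introduced only for illustration.

```javascript
// The quoted example, with the names declared up front.
var a, b, c;

b = function (v) {
  return a(-v);
};

a = function (x) {
  if (x < 4) {
    a = x + 3;        // overwrites the function a with a number
    b = x - 7;        // overwrites the function b with a number
    c = (x % a) + 4;  // -10 % -7 is -3 in JavaScript, so c is 1
    return (b - c) * (b - c);
  } else {
    return b(v);      // v is not defined in this scope
  }
};

var result = b(10);   // 324, whereas 10 * 10 is 100
```

Note that after the first call both a and b hold numbers, so a second call
would throw a TypeError.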

-- Scott
From: Thomas 'PointedEars' Lahn on
Andrew wrote:

> BUT: I am looking for something that would NOT (only) do the basic
> comments stripping / names renaming, but also changed the (apparent,
> of course) flow of the program. Sth. that would convert:
> -----
> sqrt=function(x) {
> return(x*x);
> }
> -----
> to sth. like this:
> -----
> b=function(v) {
> return(a(-v));
> }
> a=function(x) {
> if (x<4)
> {
> a=x+3;
> b=x-7;
> c=(x%a)+4;
> return((b-c)*(b-c))
> }
> else
> return(b(v));
> }
> -----

IIUC, Rice's theorem precludes that from being accomplished.


PointedEars
--
Danny Goodman's books are out of date and teach practices that are
positively harmful for cross-browser scripting.
-- Richard Cornford, cljs, <cife6q$253$1$8300dec7(a)news.demon.co.uk> (2004)
From: Andrew on
Wow, guys... Though I sense the underlying criticism (he, he... ;) I
must say... I'm impressed.

@Doug: thank you for your opinion. :)

@Lasse: that is exactly what I had in mind, nicely put!

I agree, of course one needs to assess if the extra cost is worth it
(of course, there is the "fun" factor in this equation too ;), but
first I wanted to know if there is a tool to do that automatically. It
seems to me it shouldn't be _too_ difficult to write one.

@Jorge: The idea is not mine - it comes from a contest where you had
to manually (as in paper & pencil) decipher what some program does and
simplify it as much as possible.

@Scott:
Yes, that would be another challenge - how to write a program that
deciphers such mangled code. :)
I would say that this is what code optimizers are doing, actually -
trying to optimize our "mangled" code.

As stated, I wrote that example off the top of my head in less than 2
minutes - and it shows. A working example would be this:
-----
function b(v) {
  return a(-v);
}

function a(x) {
  if (x > 7) {
    var a = x + 3;
    var d = x - 7;
    var c = (a % x) + 4;
    return (d + c) * (d + c);
  }
  // x >= 0 rather than x > 0, otherwise a(0) recurses forever
  return (x >= 0) ? (x * x) : b(x);
}

// unit test:
for (var x = -100; x < 100; x += 0.3) {
  if (Math.abs(a(x) - x * x) > 0.000001) {
    alert('Oops! ' + a(x) + '!=' + (x * x));
    break;
  }
}
if (x >= 100)
  alert('Test passed.');
-----
(and it still took me less than 10 minutes to make something that is
correct and very difficult to read).

But that is beside the point. Such a process should of course be
automated and very well tested if it is to be successful.
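
One way such testing might look, as a rough sketch: compare the original
function with its transformed version on a list of edge cases plus random
inputs. The names square, squareObfuscated and findMismatch, the choice of
edge cases and the tolerance are all assumptions made for illustration.

```javascript
// Reference implementation and a transformed variant (hypothetical names).
function square(x) { return x * x; }

function squareObfuscated(x) {
  if (x < 0) { return squareObfuscated(-x); }
  if (x > 7) {
    var d = x - 7;
    var c = ((x + 3) % x) + 4; // (x + 3) % x is 3 here, so c is 7
    return (d + c) * (d + c);  // d + c equals x, up to rounding
  }
  return x * x;
}

// Returns the first input where the two disagree, or null if none is found.
function findMismatch(reference, candidate, trials, tolerance) {
  var inputs = [0, -0, 1, -1, 7, 8, -7, -8, 0.5, 1e6, -1e6];
  for (var i = 0; i < trials; i++) {
    inputs.push((Math.random() - 0.5) * 2000); // random value in (-1000, 1000)
  }
  for (var j = 0; j < inputs.length; j++) {
    var x = inputs[j];
    var expected = reference(x);
    var actual = candidate(x);
    // Relative tolerance, falling back to absolute near zero.
    if (Math.abs(actual - expected) > tolerance * Math.max(1, Math.abs(expected))) {
      return { input: x, expected: expected, actual: actual };
    }
  }
  return null;
}

var mismatch = findMismatch(square, squareObfuscated, 1000, 1e-9);
```

Passing such a check only shows that no difference was found on the inputs
tried; it does not prove the two functions are equivalent.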


@PointedEars:
> IIUC, Rice's theorem precludes that from being accomplished.

Could you please elaborate on that? What exactly does it preclude and
why? Surely you don't mean that the code can't be made less readable?

Thank you all for sharing!