From: Erwin Moller on
anime wrote:
> well, poorly formulated :)

Hi,

> 1) This is from a PDF file. We know that PDF supports JavaScript.
> 2) That is the "header part"; what follows is completely scrambled data like this:
>

OK
<snipped scrambled data because it confuses my thunderbird>

>
> 3) I think it contains a lot of obfuscated JavaScript code, maybe in
> some unspecified way. Perhaps something else.
> OK, if it is not obfuscated JavaScript code, what is it?

Well, if you want to know how a PDF file is constructed you must dive
into the specification.
But the good news is that Adobe opened up the specs a few years ago.

If you want to learn more, try this:
http://en.wikipedia.org/wiki/Portable_Document_Format

Or more in-depth:
http://www.adobe.com/devnet/pdf/pdf_reference.html

Make sure you have a few days off to study that. ;-)

Good luck

Erwin Moller


>
> Thank you.
>
> -----------
>
>
> "Erwin Moller"
> <Since_humans_read_this_I_am_spammed_too_much(a)spamyourself.com> wrote in
> message news:4c20b018$0$22936$e4fe514c(a)news.xs4all.nl...
>> anime wrote:
>>> What encoding is used in this code:
>>>
>>> %PDF-1.5
>>> %�?<�
>>> 1 0 obj<</T#79p#65/#43at#61#6c#6f#67/O#75#74#6c#69n#65s 2 0
>>> R/Pa#67e#73 3 0
>>> R/Op#65n#41#63#74#69o#6e 5 0 R>>endobj
>>> 2 0 obj<</#54ype/O#75tl#69#6ee#73/C#6f#75n#74 0>>endobj
>>> 3 0 obj<</Type/#50ag#65#73/K#69ds[4 0 R]/Co#75#6e#74 1>>endobj
>>> 4 0 obj<</#54#79#70e/Pag#65/Pa#72e#6et 3 0 R/M#65#64i#61B#6fx[0 0 612
>>> 792]>>endobj
>>> 5 0 obj<</#54ype/#41cti#6fn/S/#4a#61#76a#53cr#69p#74/JS 6 0 R>>endobj
>>> 6 0 obj<</L#65#6egt#68
>>> 2709/#46#69l#74#65#72[/#46#6cat#65#44#65#63ode/#41SC#49IHe#78#44ecod#65]>>
>>>
>>> stream
>>>
>>> is this javascript encoded?
>>
>> Hi,
>>
>> Your question is poorly formulated.
>> The reader wonders:
>> 1) Where does that piece of code come from?
>> 2) What did you post exactly? Are those line numbers? Are they in the
>> code, or were they added by the program you use to open the file in
>> question?
>> 3) What do you mean by JavaScript encoded?
>> If one encodes some file with C, Java, Perl or JavaScript, how can you
>> tell by the end result (=output)?
>>
>> Please be more clear and provide us with some context if you want a
>> reasonable response.
>>
>> Erwin Moller
>>
>> --
>> "There are two ways of constructing a software design: One way is to
>> make it so simple that there are obviously no deficiencies, and the
>> other way is to make it so complicated that there are no obvious
>> deficiencies. The first method is far more difficult."
>> -- C.A.R. Hoare
>


From: Thomas 'PointedEars' Lahn on
anime wrote:

> 1) This is from a PDF file. We know that PDF supports JavaScript.

That depends on the viewer application, of course, but generally PDF
documents support a scripting language that is perhaps a conforming
implementation of ECMAScript (I haven't checked its grammar yet) and that
Adobe calls "JavaScript" (it isn't).

> 2) That is the "header part"; what follows is completely scrambled data like this:
>
> x?}ZЩмёKE!ШZн?Й?-~щ╪0"R">з?3S]WЦB?
> HЙu?RJ?~│'SКюTR-у?я/Х{юmeЮi\?э-g?&??%х????g? k?ф9v
> i?<ф│г?МqJ[KUFЙ?*CЫK?9giыкя2?UL9Wмo?-?Ш?ч4JкF1К3]нB_yШ_°fR<-?EtКФ>м?|
еsZФ8VfM?Цц
> _) RMъ'ХZчX;ж┼??2tvN2я%Od%*?KЦю?E0(7z?Сж.Uз'?Ц\Р_ж-Цч
> ?уэr?'s"?RF-УMЗпнЛТ$Ге
>
> 3) I think it contains a lot of obfuscated JavaScript code, maybe in
> some unspecified way.

I think you do not know what you are doing, or what you are talking about.

<http://en.wikipedia.org/wiki/Portable_Document_Format>

> Perhaps something else.

Most certainly.

> OK, if it is not obfuscated JavaScript code, what is it?

Binary data, interpreted as (in your case) KOI8-R encoded characters.
Any other file that would not be plain text would look similar.
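
To illustrate the point about binary data shown as text, here is a minimal Python sketch. It assumes the scrambled bytes are zlib-compressed data (suggested by the FlateDecode filter named in the object header, not verified against the actual file) and uses a made-up stand-in payload, since the real stream is not available here. A zlib stream normally begins with the byte 0x78, which is the letter "x" in ASCII and KOI8-R alike, and the quoted sample also starts with "x".

```python
import zlib

# Hypothetical stand-in payload; the real stream contents are unknown.
plain = b"var greeting = 'hello'; " * 4

# Compress it the way a FlateDecode stream would be stored (zlib format).
compressed = zlib.compress(plain)

# Interpret the raw bytes as KOI8-R text, as a newsreader might do.
as_text = compressed.decode("koi8_r", errors="replace")

# The result is a mix of Latin, Cyrillic and box-drawing characters.
print(repr(as_text))
```

Decoding the same bytes with a different single-byte character set would give a different but equally meaningless string.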


PointedEars
From: Ant on
"anime" wrote:

> What encoding is used in this code:

Simple hex encoding, e.g. #79 = y.

Decoded:

1 0 obj<</Type/Catalog/Outlines 2 0 R/Pages 3 0
R/OpenAction 5 0 R>>endobj
2 0 obj<</Type/Outlines/Count 0>>endobj
3 0 obj<</Type/Pages/Kids[4 0 R]/Count 1>>endobj
4 0 obj<</Type/Page/Parent 3 0 R/MediaBox[0 0 612
792]>>endobj
5 0 obj<</Type/Action/S/JavaScript/JS 6 0 R>>endobj
6 0 obj<</Length
2709/Filter[/FlateDecode/ASCIIHexDecode]>>
stream
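
The decoding above can be reproduced with a few lines of Python. This is only a sketch: the function name `decode_pdf_names` is made up for illustration, and it blindly replaces every `#xx` pair in the text, whereas in a real PDF the escape applies only inside name objects (the parts starting with `/`).

```python
import re

def decode_pdf_names(text: str) -> str:
    """Replace each #xx escape (two hex digits) with the character it encodes."""
    return re.sub(
        r"#([0-9A-Fa-f]{2})",
        lambda m: chr(int(m.group(1), 16)),
        text,
    )

# Object 5 from the original post.
obfuscated = "5 0 obj<</#54ype/#41cti#6fn/S/#4a#61#76a#53cr#69p#74/JS 6 0 R>>endobj"
print(decode_pdf_names(obfuscated))
# prints: 5 0 obj<</Type/Action/S/JavaScript/JS 6 0 R>>endobj
```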

> is this javascript encoded?

The document contains javascript in object 6, if I'm remembering the
PDF format correctly, and that stream needs to have FlateDecode (gz
deflate) and ASCIIHexDecode (hexadecimal text to ASCII) applied so as
to be readable.
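
As a rough sketch of what applying those two filters could look like in Python: filters in a /Filter array are applied in the order listed when decoding, so the stream is inflated first and the result is then read as hex text. The function name `decode_stream` is hypothetical, and the payload below is a harmless stand-in built for the example, not the contents of the posted file. The code only decodes; it never executes the script.

```python
import zlib

def decode_stream(raw: bytes) -> bytes:
    """Apply FlateDecode, then ASCIIHexDecode, in /Filter array order."""
    hex_text = zlib.decompress(raw)        # FlateDecode: zlib inflate
    hex_text = hex_text.split(b">")[0]     # '>' marks the end of hex data
    cleaned = b"".join(hex_text.split())   # whitespace between digits is ignored
    if len(cleaned) % 2:                   # odd digit count: a trailing 0 is assumed
        cleaned += b"0"
    return bytes.fromhex(cleaned.decode("ascii"))

# Stand-in stream: hex-encode a script, then compress it (the encoding direction).
script = b"app.alert('hello');"
raw = zlib.compress(script.hex().encode("ascii") + b">")

print(decode_stream(raw).decode("ascii"))
# prints: app.alert('hello');
```

On a real file, `raw` would be the 2709 bytes between the `stream` and `endstream` keywords of object 6.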

All this is typical of the kind of obfuscation you see in malicious
PDFs.


From: anime on
Thanks for the clear and exhaustive answer. Now it's clear.

> The document contains javascript in object 6

That is important. As we can see, these two filters, FlateDecode and
ASCIIHexDecode, are both deprecated lossy filters. What are better lossless
filters (3-4 of them) to replace this old deprecated stuff?

Regards,
~~~~~~~~





From: Ant on
"anime" wrote:

> "Ant" wrote:
>> The document contains javascript in object 6, if I'm remembering the
>> PDF format correctly, and that stream needs to have FlateDecode (gz
>> deflate) and ASCIIHexDecode (hexadecimal text to ASCII) applied so as
>> to be readable.

Correction: FlateDecode uses gz inflate to decompress.

> That is important. As we can see, these two filters, FlateDecode and
> ASCIIHexDecode, are both deprecated lossy filters.

Pardon? Neither is lossy. Deflate is used to compress data and is
available in many applications, for example PHP's gzcompress function.
The zlib library's deflate and inflate routines are widely used, so I
hardly think the method is deprecated.
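
A quick way to check the lossless claim is a round trip through zlib, shown here with Python's `zlib` module as one example of a zlib binding. The sample text is arbitrary.

```python
import zlib

# Arbitrary, repetitive sample data so that compression has something to gain.
original = b"The quick brown fox jumps over the lazy dog. " * 20

compressed = zlib.compress(original)     # deflate
restored = zlib.decompress(compressed)   # inflate

# Lossless means the restored bytes are identical to the input.
print(len(original), len(compressed), restored == original)
```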

Encoding data as hex strings increases the size and would only be
useful to keep any binary data in the document as plain text (or to
hide something, as appears to be the case in this example).
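
The size cost of hex encoding is easy to see in a short Python sketch: every byte becomes two ASCII hex digits, so the output is twice as long but contains only printable characters. The input is just all 256 byte values, chosen for illustration.

```python
# Every possible byte value once, as sample binary data.
binary = bytes(range(256))

# Hex-encode: each byte turns into two ASCII characters.
hex_text = binary.hex().encode("ascii")

print(len(binary), len(hex_text))
# prints: 256 512

# Decoding gives back exactly the original bytes.
assert bytes.fromhex(hex_text.decode("ascii")) == binary
```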

> What are better lossless filters (3-4 of them) to replace this
> old deprecated stuff?

Deflate is fine for lossless compression but since you're working with
PDFs, you'll have to use whatever the standard (or Adobe) provides.
I'm not sure what this has to do with javascript.