From: Paul M Foster on
On Sat, Aug 14, 2010 at 10:36:07PM +0200, Sebastian Ewert wrote:

> Hi,
>
> before I allow images to be uploaded I read them and check for several
> HTML tags. If any exist I don't allow the upload. Is there any need to
> check PDF files, too? At the moment I'm doing this, but the result is
> that many files are rejected because of disallowed HTML tags.

If I'm not mistaken, more recent versions of the PDF spec allow for
embedded javascript. If so, it might be worthwhile to check for
javascript in PDFs. (Whoever first thought of embedding *code* in
documents should be shot.)
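
For what it's worth, here is a rough sketch of what such a check could
look like. In a PDF, a JavaScript action is declared with the names
/JavaScript and /JS, so the sketch just searches the raw bytes for those
names. The function name pdfMentionsJavaScript is made up for
illustration. It is a heuristic only: PDF 1.5 and later can hide objects
inside compressed object streams, and names can be hex-escaped, so a
"false" result does not prove the file is free of scripts.

```php
<?php
// Hypothetical helper (name is an assumption, not from any library).
// Returns true if the raw PDF bytes contain a /JS or /JavaScript name.
function pdfMentionsJavaScript(string $path): bool
{
    $data = file_get_contents($path);
    if ($data === false) {
        throw new RuntimeException("Cannot read $path");
    }

    // Match the name only when it is not the prefix of a longer name
    // (so /JSON or /JavaScriptFoo would not count).
    return preg_match('~/(?:JavaScript|JS)(?![A-Za-z0-9])~', $data) === 1;
}
```

A hit is probably best treated as "flag for a closer look" rather than
as proof of anything malicious, since forms and other legitimate PDFs
also use scripts.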

Paul

--
Paul M. Foster
From: Peter Lind on
On 15 August 2010 06:14, Paul M Foster <paulf(a)quillandmouse.com> wrote:
> If I'm not mistaken, more recent versions of the PDF spec allow for
> embedded javascript. If so, it might be worthwhile to check for
> javascript in PDFs. (Whoever first thought of embedding *code* in
> documents should be shot.)
>

I personally wouldn't bother: it is the responsibility of Adobe Reader
or whichever pdf reader a user is using, to make sure that nothing
evil comes of viewing a pdf. There's very little chance you'll be able
to properly check pdfs serverside for the various security exploits
they may contain - the pdf reader would/should be much better equipped
to do this (the fact that Adobe has failed miserably at it so far is
another thing).

Sebastian, I personally think the best check for validity is, taking
images as an example, opening the image using Imagick or something
like it. After opening, verify that the image has valid dimensions and
type: a string of javascript or something like it simply won't
validate as an image. I've typically used
http://dk2.php.net/manual/en/function.getimagesize.php for this
myself, as there isn't a lot of overhead with that function - I don't
know if Imagick would be faster though, you'd have to check.
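
A minimal sketch of that getimagesize() check, in case it helps. The
function name looksLikeImage and the list of allowed types are my own
assumptions, so adjust them to whatever you actually accept.

```php
<?php
// Hypothetical helper: the name and the default type list are assumptions.
// Returns true only if PHP can parse the file as an image of an allowed
// type with non-zero dimensions.
function looksLikeImage(
    string $path,
    array $allowedTypes = [IMAGETYPE_GIF, IMAGETYPE_JPEG, IMAGETYPE_PNG]
): bool {
    // getimagesize() returns false when the file is not a recognised
    // image; @ suppresses the notice it may raise on unreadable data.
    $info = @getimagesize($path);
    if ($info === false) {
        return false;
    }

    // Index 0 is width, 1 is height, 2 is one of the IMAGETYPE_* constants.
    [$width, $height, $type] = $info;

    return $width > 0 && $height > 0 && in_array($type, $allowedTypes, true);
}
```

You would call it on the temporary upload, e.g.
looksLikeImage($_FILES['image']['tmp_name']), before moving the file
anywhere. Keep in mind that getimagesize() only reads the image header,
so it rejects plain text posing as an image but not a real image with
extra bytes appended; re-encoding the image is the stricter option.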

Regards
Peter

--
<hype>
WWW: http://plphp.dk / http://plind.dk
LinkedIn: http://www.linkedin.com/in/plind
BeWelcome/Couchsurfing: Fake51
Twitter: http://twitter.com/kafe15
</hype>
From: Ashley Sheridan on
On Sun, 2010-08-15 at 08:43 +0200, Peter Lind wrote:

> I personally wouldn't bother: it is the responsibility of Adobe Reader
> or whichever pdf reader a user is using, to make sure that nothing
> evil comes of viewing a pdf. There's very little chance you'll be able
> to properly check pdfs serverside for the various security exploits
> they may contain - the pdf reader would/should be much better equipped
> to do this (the fact that Adobe has failed miserably at it so far is
> another thing).


If you're that worried about PDFs, then maybe you could run them
through ClamAV via an exec() call. I believe a lot of the PDF holes have
been picked up by the antivirus groups out there, as Adobe does seem to
be a bit slow to plug them.
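
Something along these lines, as a rough sketch. It assumes the clamscan
command-line scanner is installed and on the PATH; the function name
scanWithClam and its return strings are made up for illustration.
clamscan documents exit code 0 for "no virus found", 1 for "virus
found" and 2 for errors, which is what the switch relies on.

```php
<?php
// Hypothetical helper: name and return values are assumptions.
// $clamscan is the scanner binary; override it if it lives elsewhere.
function scanWithClam(string $path, string $clamscan = 'clamscan'): string
{
    // escapeshellarg() keeps a hostile file name from injecting commands.
    $cmd = escapeshellcmd($clamscan) . ' --no-summary ' . escapeshellarg($path);

    $output = [];
    $exitCode = 0;
    exec($cmd, $output, $exitCode);

    switch ($exitCode) {
        case 0:
            return 'clean';
        case 1:
            return 'infected';
        default:
            // 2 is a clamscan error; anything else (e.g. 127) usually
            // means the binary could not be run at all.
            return 'error';
    }
}
```

Treat 'error' the same as 'infected' when deciding whether to accept an
upload, so that a missing or broken scanner doesn't silently let
everything through.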

Thanks,
Ash
http://www.ashleysheridan.co.uk


From: Sebastian on
OK, thanks to everyone. I will check the images with Imagick and leave
the PDFs as Adobe's responsibility. One worry less.
From: Ashley Sheridan on
On Sun, 2010-08-15 at 11:51 +0200, Sebastian wrote:

> OK, thanks to everyone. I will check the images with Imagick and leave
> the PDFs as Adobe's responsibility. One worry less.


Also, if you're really worried, try suggesting people use an alternative
PDF reader. There are quite a few to choose from, and they all do well at
displaying a standard PDF. Where they tend to fall short is with embedded
objects like scripts, video, etc., but those don't really (imho) belong
in a PDF anyway.

Thanks,
Ash
http://www.ashleysheridan.co.uk