From: Jorgen Grahn on
On Sat, 2010-06-26, Lawrence D'Oliveiro wrote:
> In message <slrni297ec.1m5.grahn+nntp(a)frailea.sa.invalid>, Jorgen Grahn
> wrote:
>
>> I thought it was well-known that the solution is *not* to try to
>> sanitize the input -- it's to switch to an interface which doesn't
>> involve generating an intermediate executable. In the Python example,
>> that would be something like os.popen2(['zcat', '-f', '--', untrusted]).
>
> That???s what I mean. Why do people consider input sanitization so hard?

I'm not sure you understood me correctly, because I advocate
*not* doing input sanitization. Hard or not -- I don't want to know,
because I don't want to do it.

/Jorgen

--
// Jorgen Grahn <grahn@ Oo o. . .
\X/ snipabacken.se> O o .
From: Jorgen Grahn on
On Fri, 2010-06-25, Nobody wrote:
> On Fri, 25 Jun 2010 12:15:08 +0000, Jorgen Grahn wrote:
>
>> I don't do SQL and I don't even understand the terminology properly
>> ... but the discussion around it bothers me.
>>
>> Do those people really do this?
>
> Yes. And then some.
>
> Among web developers, the median level of programming knowledge amounts to
> the first 3 chapters of "Learn PHP in 7 Days".
>
> It doesn't help the the guy who wrote PHP itself wasn't much better.
>
>> - accept untrusted user data
>> - try to sanitize the data (escaping certain characters etc)
>> - turn this data into executable code (SQL)
>> - executing it
>>
>> Like the example in the article
>>
>> SELECT * FROM hotels WHERE city = '<untrusted>';
>
> Yep. Search the BugTraq archives for "SQL injection". And most of those
> are for widely-deployed middleware; the zillions of bespoke site-specific
> scripts are likely to be worse.
>
> Also: http://xkcd.com/327/

Priceless!

As is often the case with xkcd, I learned something, too: there's a
widely used web application/portal/database thingy which silently
strips some characters from my input. I thought it had to do with
HTML, but it's in fact exactly the sequences "'", ')', ';' and '--'
from the comic, and a few more like '>' and undoubtedly some I haven't
noticed yet.

That is surely "input sanitization" gone horribly wrong: I enter "6--8
slices of bread", but the system stores "68 slices of bread".

>> I thought it was well-known that the solution is *not* to try to
>> sanitize the input
>
> Well known by anyone with a reasonable understanding of the principles of
> programming, but somewhat less well known by the other 98% of web
> developers.
>
>> Am I missing something?
>
> There's a world of difference between a skilled chef and the people
> flipping burgers for a minimum wage. And between a chartered civil
> engineer and the people laying the asphalt. And between what you
> probably consider a programmer and the people doing most web development.

I don't know them, so I wouldn't know ... What I would *expect* is
that safe tools are provided for them, not just workarounds so they
can keep using the unsafe tools. That's what Python did, with its
multitude of alternatives to os.system and os.popen.

Anyway, thanks. It's always nice to be able to map foreign terminology
like "SQL injection" to something you already know.

/Jorgen

--
// Jorgen Grahn <grahn@ Oo o. . .
\X/ snipabacken.se> O o .
From: Jorgen Grahn on
On Sun, 2010-06-27, Lawrence D'Oliveiro wrote:
> In message <roy-854954.20435125062010(a)news.panix.com>, Roy Smith wrote:
>
>> I recently fixed a bug in some production code. The programmer was
>> careful to use snprintf() to avoid buffer overflows. The only problem
>> is, he wrote something along the lines of:
>>
>> snprintf(buf, strlen(foo), foo);
>
> A long while ago I came up with this macro:
>
> #define Descr(v) &v, sizeof v
>
> making the correct version of the above become
>
> snprintf(Descr(buf), foo);

This is off-topic, but I believe snprintf() in C can *never* safely be
the only thing you do to the buffer: you also have to NUL-terminate it
manually in some corner cases. See the documentation.

/Jorgen

--
// Jorgen Grahn <grahn@ Oo o. . .
\X/ snipabacken.se> O o .
From: Owen Jacobson on
On 2010-06-26 22:33:57 -0400, Lawrence D'Oliveiro said:

> In message <2010062522560231540-angrybaldguy(a)gmailcom>, Owen Jacobson wrote:
>
>> It's not hard. It's just begging for a visit from the fuckup fairy.
>
> That's the same fallacious argument I pointed out earlier.

In the sense that "using correct manual escaping leads to SQL injection
vulnerabilities", yes, that's a fallacious argument on its own.
However, as sites like BUGTRAQ amply demonstrate, generating SQL
through string manipulation is a risky development practice[0]. You can
continue to justify your choice to do so however you want, and you may
even be the One True Developer capable of getting it absolutely right
under all circumstances, but I'd still reject patches that introduced a
SQLString-like function and ask that you resubmit them using the
database API's parameterization tools instead.

Assuming for the sake of discussion that your SQLString function
perfectly captures the transformation required to turn an arbitrary str
into a MySQL string literal. How do you address the following issues?

1. Other (possibly inexperienced) developers reading your source who
may not have the skills to correctly implement the same transform
correctly learn from your programs that writing your own query munger
is okay.
1a. Other (possibly inexperienced) developers decide to copy and paste
your function without fully understanding how it works, in tandem with
any of the other issues below. (If you think this is rare, I invite you
to visit stackoverflow or roseindia some time.)

2. MySQL changes the quoting and escaping rules to address a
bug/feature request/developer whim, introducing a new set of corner
cases into your function and forcing you to re-learn the escaping and
quoting rules. (For people using DB API parameters, this is a matter of
upgrading the DB adapter module to a version that supports the modified
rules.)

3. You decide to switch from MySQL to a more fully-featured RDBMS,
which may have different quoting and escaping rules around string
literals.
3a. *Someone else* decides to port your program to a different RDBMS,
and may not understand that SQLString implements MySQL's quoting and
escaping rules only.

4. MySQL AB finally get off their collective duffs and adds real
parameter separation to the MySQL wire protocol, and implements real
prepared statements to massive speed gains in scenarios that are
relevant to your interests; string-based query construction gets left
out in the cold.
4a. As with case 3, except that instead of the rules changing when you
move to a new RDBMS, it's the relative performance of submitting new
queries versus reusing a parameterized query that changes.

On top of the obvious issue of completely avoiding quoting bugs, using
query parameters rather than escaping and string manipulation neatly
saves you from having to address any of these problems (and a multitude
of others) -- the DB API implementation will handle things for you, and
you are propagating good practice in an easy-to-understand form.

I am honestly at a loss trying to understand your position. There is a
huge body of documentation out there about the weaknesses of
string-manipulation-based approaches to query construction, and the use
of query parameters is so compellingly the Right Thing that I have a
very hard time comprehending why anyone would opt not to use it except
out of pure ignorance of their existence. Generating executable code --
including SQL -- from untrusted user input introduces an large
vulnerability surface for very little benefit.

You don't handle function parameters by building up python-language
strs containing the values as literals and eval'ing them, do you?

-o

[0] If you want to be *really* pedantic, string-manipulation-based
query construction is strongly correlated with the occurrence of SQL
injection vulnerabilities and bugs, which is in turn not strongly
correlated with very many other practices. Happy?

From: Kushal Kumaran on
On Sun, Jun 27, 2010 at 5:16 PM, Lawrence D'Oliveiro
<ldo(a)geek-central.gen.new_zealand> wrote:
> In message <mailman.2184.1277626565.32709.python-list(a)python.org>, Kushal
> Kumaran wrote:
>
>> On Sun, Jun 27, 2010 at 9:47 AM, Lawrence D'Oliveiro
>> <ldo(a)geek-central.gen.new_zealand> wrote:
>>
>>> In message <roy-854954.20435125062010(a)news.panix.com>, Roy Smith wrote:
>>>
>>>> I recently fixed a bug in some production code.  The programmer was
>>>> careful to use snprintf() to avoid buffer overflows.  The only problem
>>>> is, he wrote something along the lines of:
>>>>
>>>> snprintf(buf, strlen(foo), foo);
>>>
>>> A long while ago I came up with this macro:
>>>
>>> #define Descr(v) &v, sizeof v
>>>
>>> making the correct version of the above become
>>>
>>> snprintf(Descr(buf), foo);
>>
>> Not quite right.  If buf is a char array, as suggested by the use of
>> sizeof, then you're not passing a char* to snprintf.
>
> What am I passing, then?

Here's what gcc tells me (I declared buf as char buf[512]):
sprintf.c:8: warning: passing argument 1 of ‘snprintf’ from
incompatible pointer type
/usr/include/stdio.h:363: note: expected ‘char * __restrict__’ but
argument is of type ‘char (*)[512]’

You just need to lose the & from the macro.

--
regards,
kushal