From: Lawrence D'Oliveiro on
Just been reading this article
<http://www.theregister.co.uk/2010/06/23/xxs_sql_injection_attacks_testing_remedy/>
which says that a lot of security holes are arising these days because
everybody is concentrating on unit testing of their own particular
components, with less attention being devoted to overall integration
testing.

Fair enough. But it's disconcerting to see some of the advice being offered
in the reader comments, like “force everyone to use stored procedures”, or
“force everyone to use prepared/parametrized statements”, “never construct
ad-hoc SQL queries” and the like.

I construct ad-hoc queries all the time. It really isn't that hard to do
safely. All you have to do is read the documentation—for example,
<http://dev.mysql.com/doc/refman/5.0/en/string-syntax.html>—and then write a
routine that takes arbitrary data and turns it into a valid string literal,
like this <http://www.codecodex.com/wiki/Useful_MySQL_Routines#Quoting>.

I've done this sort of thing for MySQL, for HTML and JavaScript (in both
Python and JavaScript itself), and for Bash. It's not hard to verify you've
done it correctly. It lets you easily create table-updating code like the
following, which makes it so easy to update the code to track changes in the
database structure:

sql.cursor.execute \
(
"update items set "
+
", ".join
(
tuple
(
"%(name)s = %(value)s"
%
{
"name" : field[0],
"value" : SQLString(Params.getvalue
(
"%s[%s]" % (field[1], urllib.quote(modify_id))
))
}
for field in
(
("class_name", "modify_class"),
("make", "modify_make"),
("model", "modify_model"),
("details", "modify_details"),
("serial_nr", "modify_serial"),
("inventory_nr", "modify_invent"),
("when_purchased", "modify_when_purchased"),
... you get the idea ...
("location_name", "modify_location"),
("comment", "modify_comment"),
)
)
+
(
"last_modified = %d" % int(time.time()),
)
)
+
" where inventory_nr = %s" % SQLString(modify_id)
)

From: Roy Smith on
In article <i00t2k$l07$1(a)lust.ihug.co.nz>,
Lawrence D'Oliveiro <ldo(a)geek-central.gen.new_zealand> wrote:

> I construct ad-hoc queries all the time. It really isn't that hard to do
> safely. All you have to do is read the documentation

I get worried when people talk about how easy it is to do something
safely. Let me suggest a couple of things you might not have considered:

1) Somebody is running your application (or the database server) with
the locale set to something unexpected. This might change how numbers,
dates, currency, etc, get formatted, which could change the meaning of
your constructed SQL statement.

2) Somebody runs your application with a different PYTHONPATH, which
causes a different (i.e. malicious) urllib module to get loaded, which
makes urllib.quote() do something you didn't expect.

> I've done this sort of thing for MySQL, for HTML and JavaScript (in both
> Python and JavaScript itself), and for Bash. It's not hard to verify you've
> done it correctly. It lets you easily create table-updating code like the
> following, which makes it so easy to update the code to track changes in the
> database structure:
>
> sql.cursor.execute \
> (
> "update items set "
> +
> ", ".join
> (
> tuple
> (
> "%(name)s = %(value)s"
> %
> {
> "name" : field[0],
> "value" : SQLString(Params.getvalue
> (
> "%s[%s]" % (field[1],
> urllib.quote(modify_id))
> ))
> }
> for field in
> (
> ("class_name", "modify_class"),
> ("make", "modify_make"),
> ("model", "modify_model"),
> ("details", "modify_details"),
> ("serial_nr", "modify_serial"),
> ("inventory_nr", "modify_invent"),
> ("when_purchased", "modify_when_purchased"),
> ... you get the idea ...
> ("location_name", "modify_location"),
> ("comment", "modify_comment"),
> )
> )
> +
> (
> "last_modified = %d" % int(time.time()),
> )
> )
> +
> " where inventory_nr = %s" % SQLString(modify_id)
> )
From: Owen Jacobson on
On 2010-06-24 21:02:48 -0400, Roy Smith said:

> In article <i00t2k$l07$1(a)lust.ihug.co.nz>,
> Lawrence D'Oliveiro <ldo(a)geek-central.gen.new_zealand> wrote:
>
>> I construct ad-hoc queries all the time. It really isn't that hard to do
>> safely. All you have to do is read the documentation
>
> I get worried when people talk about how easy it is to do something
> safely.

First: I agree with this. While it's definitely possible to correctly
escape a given SQL dialect under controlled conditions, it's not at all
easy to get it right, and the real world is even more unfriendly than
most people expect. Furthermore there's no reason to do it that way:
Python's DB API spec effectively requires that placeholder parameters
of *some* kind exist. Even if you feel the need to construct SQL, you
can construct it with parameters almost as easily as you can construct
it with the values baked in.

With that said...

> 2) Somebody runs your application with a different PYTHONPATH, which
> causes a different (i.e. malicious) urllib module to get loaded, which
> makes urllib.quote() do something you didn't expect.

Someone who can manipulate PYTHONPATH or otherwise add code to the
runtime environment is already in a position to hose your database,
independently of escaping-related issues. It's up to the sysadmin or
user to ensure that their environment is sane, and it's on their head
if they add broken code to a program's runtime environment.

Lawrence D'Oliveiro wrote:

> I'��ve done this sort of thing for MySQL, for HTML and JavaScript (in both
> Python and JavaScript itself), and for Bash. It's not hard to verify you've
> done it correctly. It lets you easily create table-updating code like the
> following, which makes it so easy to update the code to track changes in the
> database structure:
>
> sql.cursor.execute \
> (
> "update items set "
> +
> ", ".join
> (
> tuple
> (
> "%(name)s = %(value)s"
> %
> {
> "name" : field[0],
> "value" : SQLString(Params.getvalue
> (
> "%s[%s]" % (field[1],
> urllib.quote(modify_id))
> ))
> }
> for field in
> (
> ("class_name", "modify_class"),
> ("make", "modify_make"),
> ("model", "modify_model"),
> ("details", "modify_details"),
> ("serial_nr", "modify_serial"),
> ("inventory_nr", "modify_invent"),
> ("when_purchased", "modify_when_purchased"),
> ... you get the idea ...
> ("location_name", "modify_location"),
> ("comment", "modify_comment"),
> )
> )
> +
> (
> "last_modified = %d" % int(time.time()),
> )
> )
> +
> " where inventory_nr = %s" % SQLString(modify_id)
> )

Why would I write this when SQLAlchemy, even without using its ORM
features, can do it for me? It even uses the placeholder-generating
strategy I mentioned above, where possible.

Finally, it's worth noting that MySQL is (almost) the only mainstream
database that uses escaping for parameterization. PostgreSQL, SQL
Server, Oracle, DB2, and most other databases support parameters
natively in their communication protocols: parameters aren't injected
into the query string, but are sent separately and processed separately
within the DBMS. This neatly avoids encoding-related and
quoting-related problems entirely, and it means the type of the
parameter can be preserved if it's useful.

-o

From: Lawrence D'Oliveiro on
In message <roy-30B881.21024824062010(a)news.panix.com>, Roy Smith wrote:

> 1) Somebody is running your application (or the database server) with
> the locale set to something unexpected.

Locales are under program control, so that won't happen.

This is why I use UTF-8 encoding for everything.
From: Lawrence D'Oliveiro on
In message <2010062422432660794-angrybaldguy(a)gmailcom>, Owen Jacobson wrote:

> Why would I write this when SQLAlchemy, even without using its ORM
> features, can do it for me?

SQLAlchemy doesn't seem very flexible. Looking at the code examples
<http://www.sqlalchemy.org/docs/examples.html>, they're very procedural:
build object, then do a string of separate method calls to add data to it. I
prefer the functional approach, as in my table-update example.