From: Stephen Hansen on
On 7/8/10 6:20 AM, Victor Subervi wrote:
> However, I now have another error. Here is my current command:
>
> cursor.execute("insert into personalDataKeys (Store, User,
> useFirstName, useLastName, usePhone, useCell, useFax, useAddress,
> useShippingAddress, useDOB, useEmail, usePW) values (%s, %s, %s, %s, %s,
> %s, %s, %s, %s, %s, %s, %s)", ([store, user] + col_vals))

Quick point: why the parens around [store, user] + col_vars? They're
redundant.

>
> I get this error from MySQL which I am having a hard time understanding:
>
> LATEST FOREIGN KEY ERROR
> ------------------------
> 100708 6:15:01 Transaction:
> TRANSACTION 0 9382, ACTIVE 0 sec, process no 5326, OS thread id
> 1169992000 inserting, thread declared inside InnoDB 500
> mysql tables in use 1, locked 1
> 3 lock struct(s), heap size 368, undo log entries 1
> MySQL thread id 1502, query id 23700 localhost beno update
> insert into personalDataKeys (Store, User, useFirstName, useLastName,
> usePhone, useCell, useFax, useAddress, useShippingAddress, useDOB,
> useEmail, usePW) values ('specialty', 'patients', 1, 1, 1, 1, 1, 1, 0,
> 1, 1, 1)
> Foreign key constraint fails for table `test/personalDataKeys`:
> ,
> CONSTRAINT `personalDataKeys_ibfk_1` FOREIGN KEY (`Store`) REFERENCES
> `products` (`Store`)

A foreign key is a constraint, a restriction, which says that rows in
TableA ("personalDataKeys") depend on certain *matching* rows to already
exist and always be valid in TableB ("products"); the exact match is a
column they have in common ("Store").

The purpose of foreign keys is to keep data consistent. Here, it appears
as if you have established a key such that the 'store' column in your
personalDataKeys table must point to a certain row in the products table
which has a 'store' column of the exact same value.

This error message is indicating that when you do this INSERT, there is
no corresponding row in the products table.

--

Stephen Hansen
... Also: Ixokai
... Mail: me+list/python (AT) ixokai (DOT) io
... Blog: http://meh.ixokai.io/

From: John Nagle on
On 7/7/2010 11:52 AM, Stephen Hansen wrote:
> On 7/7/10 11:38 AM, Victor Subervi wrote:
>> Hi;
>> I have this code:
>>
>> sql = 'insert into personalDataKeys values (%s, %s, %s)' % (store,
>> user, ', %s'.join('%s' * len(col_vals))
>> cursor.execute(sql, col_vals)

Bad approach. Don't put actual data into an SQL statement using
string parameter substitution. Try this:

values = (store, user) + tuple(col_vals) # all values to be inserted
valuesql = ",".join(["%s"]*len(values)) # '%s,%s,%s,%s,%s,%s'
sql = "INSERT INTO personaldatakeys VALUES (" + valuesql + ")"
cursor.execute(sql, values) # execute INSERT

"valuefields" is always some number of repeats of comma-separated "%s"
Anything in "values" will be escaped properly. No SQL injection
vulnerability.

John Nagle



From: Stephen Hansen on
On 7/8/10 9:03 AM, Victor Subervi wrote:
> mysql> describe products Store;
> +-------+-------------+------+-----+---------+-------+
> | Field | Type | Null | Key | Default | Extra |
> +-------+-------------+------+-----+---------+-------+
> | Store | varchar(40) | NO | MUL | NULL | |
> +-------+-------------+------+-----+---------+-------+
> 1 row in set (0.00 sec)
>
> mysql> describe personalDataKeys Store;
> +-------+-------------+------+-----+---------+-------+
> | Field | Type | Null | Key | Default | Extra |
> +-------+-------------+------+-----+---------+-------+
> | Store | varchar(40) | NO | MUL | NULL | |
> +-------+-------------+------+-----+---------+-------+
> 1 row in set (0.00 sec)
>
> They both use innodb. They're both indexed. I was thinking after getting
> your email that maybe I'd set the varchars to different lengths, but no.

A foreign key isn't about the schema, per se; its not about the
varchar's being different lengths (as they discard trailing padding)--
its about *data*. True, if you had varchar(20) and varchar(40), then if
any string longer then 20 wouldn't ever pass -- but that's really
secondary. (That's not saying a database may refuse to accept a FK if
data types are mismatched)

If "personalDataKeys" has a foreign key connecting it to "products",
then you can't add something to personalDataKeys with store =
"specialty" unless something already exists in "products" with store =
"speciality";

> However...
>
> mysql> select * from products;
> Empty set (0.00 sec)
>
> Is it that I can't insert into personalDataKeys until I've first done so
> in products?

Yes, that's precisely what foreign keys do.

> That wasn't necessary, since personalDataKeys only
> needs to be associated with personalData, so I dropped and recreated the
> table, updating personalDataKeys foreign key to reference personalData;
> however, once again:

Are you certain this is what you want? It sounds like you may be using
foreign keys without fully understanding what they are.

Think of them like a big arrow.

If you define a foreign key in personalDataKeys, referencing
personalData, you should picture a large arrow pointing from
personalDataKeys to personalData.

It's pointing because the "constraint" created by the foreign key means,
"Every record in this table, personalDataKeys, has a column which *must*
exist in its referenced table, personalData, before that record is
allowed to be added."

A foreign key isn't just a description of a relationship: its a strict
rule, declaring that a certain field in one table *actually* refers
directly to a *specific* row in *another* table: therefore, this field
can't be allowed to be any value which doesn't exist already in that
other table.

A "primary key" lets you uniquely identify a certain row in one table. A
"foreign key" lets you identify a certain row in *another table*, that
this table ultimately depends on.

> In personalDataKeys I store which fields will be required for a given
> store as it relates to personal data. For example, if there is a
> pharmacy with users 'doctors' and 'patients', certain fields in
> personalData will be required for one but not the other, and this needs
> to be inserted into personalDataKeys.

My concern here is that you're making *columns* which are
"store-dependent", such as sometimes in one store, personalData will
have a column/field named "foo", but another store it won't use "foo"
but instead use "bar", depending on how the store itself is configured.

I'd refer you back to my previous email which described two schemes to
record "store-dependant" data: one using a separate table for each store
type, another using a generic key/value table. Having one big table with
a big mix of columns that various store configurations pick and choose
seems like a very inflexible design.

Additionally (if you do keep this design), these two tables you
described seem to make the columns that are used tied to *users*, not
*stores*. The key for personalDataKeys is (Store, User): but isn't it
the case that for a certain kind of store (i.e., a pharmacy), /all/
users in that store will have the same fields in personalData be
relevant? So shouldn't "personalDataKeys" really be "storeDataKeys"?
I.e., a configuration of the store itself of what data keys it considers
relevant.

> All of this, however, obviously
> happens before any data is actually entered into either personalData or
> products.

Yeah, the constraints and such happen before data is entered. But once
they are created, you have to actual enter data in the correct order.
The constraints enforce consistency so programmer-error can't introduce
data into the tables which is out of whack with the data layout.

--

Stephen Hansen
... Also: Ixokai
... Mail: me+list/python (AT) ixokai (DOT) io
... Blog: http://meh.ixokai.io/

From: Stephen Hansen on
On 7/7/10 11:52 AM, Stephen Hansen wrote:
> On 7/7/10 11:38 AM, Victor Subervi wrote:
>> Hi;
>> I have this code:
>>
>> sql = 'insert into personalDataKeys values (%s, %s, %s)' % (store,
>> user, ', %s'.join('%s' * len(col_vals))
>> cursor.execute(sql, col_vals)
>
> First, its always best to be explicit with insert statements. Meaning,
> don't rely on the underlining structure of a table, as in:
>
> INSERT INTO YourRandomTable VALUES ("my", "value", "here");
>
> Instead, do:
>
> INSERT INTO YourRandomTable (field1, field2, field3) VALUES ("my",
> "value", "here");

I suddenly feel a need to come back and explain *why* I make the claim
to 'best' above: the explicit naming of the fields in the INSERT,
specifically, since others have shown how to do the INSERT safely while
keeping the essentially variable number of items in the values clause.

I still would advise against that approach even if it is safe from a SQL
Injection standpoint: but for a different reason entirely, that of
long-term maintainability.

No design is perfect; no customer specification (no matter how vetted,
analyzed, and approved by stakeholders) survives implementation and
real-life usage.

If you always select specific columns in a specific order (i.e., always
SELECT this, that, other; and never SELECT *), and always insert with
your columns specified (i.e., always INSERT INTO blah (this, that,
other) and never INSERT INTO blah VALUES (..)), then it lets you adapt
your application in the future when something comes up.

Specifically, it lets you add new columns without breaking everything :)
Now, those new columns would need to either allow NULL's or have a
default value of course.

But some day down the road you can go and do an ALTER TABLE to add say,
"my_whatever" to the above, and you don't suddenly have to vet every
single piece of code which accesses that table. All the existing code
will still work: its getting the pieces of data it knows how to use. As
you need to, you can adjust that code to take into account this new
piece of data. But by making any new additions "optional" in your SQL,
and making all your other accesses explicit, it just eases migration and
maintenance in future updates.

Some may disagree, I dunno. I just find in my experience that following
that practice has saved a lot of time and effort down the road.
(Especially during migrations from old versions to new versions, and
running versions concurrently during some test-phase, etc, or rolling
back a new version if a critical bug is found: the changes made to the
database to support the new versions can safely persist without you
having to do a much more expensive / time-consuming restoration of the
database from a backup).

--

Stephen Hansen
... Also: Ixokai
... Mail: me+list/python (AT) ixokai (DOT) io
... Blog: http://meh.ixokai.io/