From: Andrea Giammarchi on


> Indeed you could do it via a regexp, but that uses up quite some memory..
> Every time you do a SELECT. You can simply add a table column with the
> stripped value and let the table update itself (with an ON UPDATE ON
> INSERT trigger, which takes the input value for the itemID and strips it
> once).
>
> When doing this on inputting the value into the database, you save
> yourself the pain (and performance) of doing it on every SELECT-query.

Excuse me? Somebody suggested a PHP loop to solve a query problem and you are saying that REGEXP should not be used?
MySQL caches queries and 100 SELECT with a REGEXP will cost zero after the first one if nothing changed inside the table.

At the same time an internal REGEXP is faster than everything else has to move out and be parsed after via, probably, the same REGEXP engine. Try some bench.

This problem, imho, is a non-problem, at least not a PHP problem.

How MySQL optimizes internally REGEXPs is not PHP problem as well.

It's like to create a loop to read byte after byte because file_get_contents could be memory greedy (if you do that with 1 Gb of file you are you doing wrong in any case, logs need to be split as example) or avoid MATCH AGAINST in query if we have too many rows because of performances problem (table could be slipt as well to optimize performances) ... and these practices to avoid native solutions are a bit hilarious, imho.

Regards

_________________________________________________________________
Drag n’ drop—Get easy photo sharing with Windows Live™ Photos.

http://www.microsoft.com/windows/windowslive/products/photos.aspx
From: Ben Dunlap on
> Excuse me? Somebody suggested a PHP loop to solve a query problem and you are saying that REGEXP should not be used?
> MySQL caches queries and 100 SELECT with a REGEXP will cost zero after the first one if nothing changed inside the table.

Even if the REGEXP has to change with every query?

Performance aside, I think REGEXP() could be used here, but not in the
way you've suggested. As the OP has described his table, your regex
("^[a-zA-Z0-9]+$") won't match any rows, because all of his product
IDs have non-alphanumeric characters in them.

Suppose this table:

pk | prod_id
1 | 07-ABCD-98
2 | 98-ZCXQ-21

And now suppose the OP's scenario, where a user tries to search on
product id, but enters "07ABCD98".

If the aim is to use REGEXP() to return row 1, I suppose you could
intersperse the search string with ".?" sequences and end up with this
query:

SELECT * FROM table WHERE prod_id REGEXP '^0.?7.?A.?B.?C.?D.?9.?8$'

I think just stripping the alphanumeric characters would end up being
more flexible, though.

-Ben
From: Ben Dunlap on
> What's wrong with using the wildcards that are built into most SQL
> variants?
>
> SELECT * FROM table WHERE item_id LIKE '%#abcdef'
>
> Will select all records where the item_id field ends in '#abcdef'

That works if you know the user is always going to enter the last 7
characters of the product id, but that's not how the OP characterized
the problem. The OP talked about search strings where multiple
characters had been omitted from different parts of the product id.

Ben
From: Andrea Giammarchi on


> Even if the REGEXP has to change with every query?

Ben, it does not matter, this is not a PHP problem but a DB structure/select/insert/update problem.
Whatever REGEXP you use, a REGEXP is what you need to solve this problem, certainly not a PHP loop over each row with operations for each rows.

These things are OK if you do not know REGEXP or REGEXP MySQL syntax, but in this case you should ask for the correct REGEXP rather than talk about performances, obviously slower outside MySQL and via a runtime interpreted language as PHP is, or other solutions which aim is to end up with something that just emulate a select with REGEXP.

Did you get my point? Finally, when I say "you" I mean generally speaking :)

Regards

_________________________________________________________________
With Windows Live, you can organize, edit, and share your photos.
http://www.microsoft.com/middleeast/windows/windowslive/products/photo-gallery-edit.aspx
From: sono-io on
Thanks to everyone who has responded. After reading everyone's
response, I think I have a very simple way to solve my "problem".

Using my original example, if someone wants to find item #
4D-2448-7PS, no matter what they type in, I'll take the input, strip
out all non-alphanumeric characters to make it 4D24487PS, add the
wildcard character between each of the remaining characters like so,
4*D*2*4*4*8*7*P*S, and then do the search.

Still being new at this, it seems to be the simplest approach, or is
my thinking flawed? This also keeps me from having to add another
field in the db to search on.

BTW, this solution needs to work with any db, even ASCII files, so it
has to happen in PHP.

Thanks again,
Frank