Why does adding a where clause make query 100 times slower? [Using MS SQL Server]

Prev: Group By Median: Custom Aggregate Function? Passing Array to SP?
Next: DBSync for MS SQL & MySQL for $39. Just one day.

From: Emin on 13 Apr 2010 09:08

Dear Experts,

I have a query of the form SELECT * FROM A WHERE X=1 which runs in
under a second and returns about 800 records. But when I add another
clause and do SELECT * FROM A WHERE X=1 AND Y=2, the query is more
than a 100 times slower and the server eventually times out.

What I find confusing is that simply taking the first query and
checking if Y=2 on every row should at most double the query time. I
guess SQL server is trying to do something clever and getting confused
in its optimizations.

I tried simple things like using a common table expression such as

WITH MYCTE AS (SELECT * FROM A WHERE X=1)
SELECT * FROM MYCTE WHERE Y=2

or nested queries like

SELECT * FROM (SELECT * FROM A WHERE X=1) STUFF
WHERE Y=2

but neither work.

Is there a way to simply tell SQL Server to just scan all the results
for the second where clause instead of trying to do something too
clever? Alternatively, do you have any suggestions on what to do in
this case?

I guess I could use a temporary table but that seems really ugly and
wasteful.

Thanks,
-Emin

From: Emin on 13 Apr 2010 10:05

One more strange effect:

If I do
SELECT * FROM A WHERE X=1 AND Y=2
OPTION (FORCE ORDER)
then things only take 5 times as long as SELECT * FROM A WHERE X=1
which is annoying but at least it runs.

But if I do
SELECT * FROM A WHERE X=1 AND Y=2
OPTION (FAST 5)
then it runs super fast.

Could someone explain what the FAST option does?

[Note that A is a table-valued function in this case]

Thanks again,
-Emin

On Apr 13, 9:08 am, Emin <emin.shop...(a)gmail.com> wrote:
> Dear Experts,
>
> I have a query of the form SELECT * FROM A WHERE X=1 which runs in
> under a second and returns about 800 records. But when I add another
> clause and do SELECT * FROM A WHERE X=1 AND Y=2, the query is more
> than a 100 times slower and the server eventually times out.
>
> What I find confusing is that simply taking the first query and
> checking if Y=2 on every row should at most double the query time. I
> guess SQL server is trying to do something clever and getting confused
> in its optimizations.
>
> I tried simple things like using a common table expression such as
>
> WITH MYCTE AS (SELECT * FROM A WHERE X=1)
> SELECT * FROM MYCTE WHERE Y=2
>
> or nested queries like
>
> SELECT * FROM (SELECT * FROM A WHERE X=1) STUFF
> WHERE Y=2
>
> but neither work.
>
> Is there a way to simply tell SQL Server to just scan all the results
> for the second where clause instead of trying to do something too
> clever? Alternatively, do you have any suggestions on what to do in
> this case?
>
> I guess I could use a temporary table but that seems really ugly and
> wasteful.
>
> Thanks,
> -Emin

From: Gert-Jan Strik on 13 Apr 2010 12:16

Emin,

The key is in the parts that you don't tell us, or only tell us in a
side note.

For example, there is a big difference between the statement

SELECT X FROM A WHERE X=1

and

SELECT * FROM A WHERE X=1 AND Y=2

The difference between these two is not just the extra predicate in the
WHERE clause, but also the column list of the resultset. The first can
be satisfied with just an index on X. The second requires all columns
and (probably) cannot be satisfied with a nonclustered index on X.

You mention that "A" is not a table, but the result of a table valued
function. That makes a big difference. TVFs do not have indexes. Only
tables do (and your occasional indexed view).

So without seeing the true query, and knowing more about the tables and
indexes that are in play, I don't think it is possible to answer your
question.

--
Gert-Jan

Emin wrote:
>
> One more strange effect:
>
> If I do
> SELECT * FROM A WHERE X=1 AND Y=2
> OPTION (FORCE ORDER)
> then things only take 5 times as long as SELECT * FROM A WHERE X=1
> which is annoying but at least it runs.
>
> But if I do
> SELECT * FROM A WHERE X=1 AND Y=2
> OPTION (FAST 5)
> then it runs super fast.
>
> Could someone explain what the FAST option does?
>
> [Note that A is a table-valued function in this case]
>
> Thanks again,
> -Emin
>
> On Apr 13, 9:08 am, Emin <emin.shop...(a)gmail.com> wrote:
> > Dear Experts,
> >
> > I have a query of the form SELECT * FROM A WHERE X=1 which runs in
> > under a second and returns about 800 records. But when I add another
> > clause and do SELECT * FROM A WHERE X=1 AND Y=2, the query is more
> > than a 100 times slower and the server eventually times out.
> >
> > What I find confusing is that simply taking the first query and
> > checking if Y=2 on every row should at most double the query time. I
> > guess SQL server is trying to do something clever and getting confused
> > in its optimizations.
> >
> > I tried simple things like using a common table expression such as
> >
> > WITH MYCTE AS (SELECT * FROM A WHERE X=1)
> > SELECT * FROM MYCTE WHERE Y=2
> >
> > or nested queries like
> >
> > SELECT * FROM (SELECT * FROM A WHERE X=1) STUFF
> > WHERE Y=2
> >
> > but neither work.
> >
> > Is there a way to simply tell SQL Server to just scan all the results
> > for the second where clause instead of trying to do something too
> > clever? Alternatively, do you have any suggestions on what to do in
> > this case?
> >
> > I guess I could use a temporary table but that seems really ugly and
> > wasteful.
> >
> > Thanks,
> > -Emin

From: Plamen Ratchev on 13 Apr 2010 12:19

Compare the execution plans of both queries and see what makes the difference. Maybe you are missing an index on the Y
column.

Using CTE or derived table makes no difference, they are expanded in the execution plan and SQL Server can push
predicates up or down the execution plan.

As you noted, one solution would be to run the first query with the X predicate and materialize the result set to a
temporary table, then use that as base to filter on the Y column.

--
Plamen Ratchev
http://www.SQLStudio.com

|
Pages: 1
Prev: Group By Median: Custom Aggregate Function? Passing Array to SP?
Next: DBSync for MS SQL & MySQL for $39. Just one day.