From: Antti Nivala on
We use some CLR UDFs to process BLOBs. In many cases, the optimizer seems to
choose a bad plan because it guesses that our UDF will limit the results much
more than it does. For example, when we have a WHERE condition such as
OurUDF( Data ) = 1, the optimizer apparently guesses that the number of rows
will drop to one third (33%). In reality, the number of rows remains almost
the same because the UDF will return 1 for 95% to 100% of rows.

Is it possible to tell the optimizer that this UDF has poor selectivity and
that it is thus better to evaluate almost any other restricting conditions
prior to evaluating the condition that involves the UDF?

Or, as a related matter, is it possible to tell the optimizer that the
execution of the UDF is somewhat slow and thus it would be wise to optimize
for executing the UDF the smallest possible number of times?

-Antti

From: Gert-Jan Strik on
Antti Nivala wrote:
>
> We use some CLR UDFs to process BLOBs. In many cases, the optimizer seems to
> choose a bad plan because it guesses that our UDF will limit the results much
> more than it does. For example, when we have a WHERE condition such as
> OurUDF( Data ) = 1, the optimizer apparently guesses that the number of rows
> will drop to one third (33%). In reality, the number of rows remains almost
> the same because the UDF will return 1 for 95% to 100% of rows.
>
> Is it possible to tell the optimizer that this UDF has poor selectivity and
> that it is thus better to evaluate almost any other restricting conditions
> prior to evaluating the condition that involves the UDF?
>
> Or, as a related matter, is it possible to tell the optimizer that the
> execution of the UDF is somewhat slow and thus it would be wise to optimize
> for executing the UDF the smallest possible number of times?
>
> -Antti

I am afraid not. At least not in any direct way.

Depending on your query, you might be able to rewrite it. For example
using derived tables with grouping or replacing Inner Join with Outer
Join, etc. But without seeing the query, there is not much to say about
that. Also, you need good SQL skills to handle this.

Maybe it is an option to create a computed column for OurUDF(Data) and
persist it (index it). It would change the performance dynamics
(performance hit when "Data" is changed, no performance hit when
selecting from the table). It would also add statistics to the column
that the optimizer can use.

--
Gert-Jan