Question about index usage in a query [Programming SQL]

Prev: select statement ,sum with calculation
Next: Return a specific string based on positions of Y/N data

From: Gert-Jan Strik on 5 Dec 2009 07:23

Ron,

In case it wasn't already clear from Jay's response: the clustered index
is seeked because the predicate ApplicationID=99 uses the leading column
of this index, and because it is the clustered index, the index entries
include the rest of the row data, so they include INDEX6.

What might not be clear when you read the query plan, is that the index
seek is only used to locate the first row (with ApplicationID=99). From
that point on, it will scan through the index to validate your other
predicate (INDEX6 LIKE '%W%'). It will do this until ApplicationID > 99
or until 5000 matching rows are found.

--
Gert-Jan
SQL Server MVP

Ron wrote:
>
> Hi, yes, here is the full query, exec plan (SHOWPLAN_TEXT), table and Index.
> (I tried to reduce it down to an example for simplicity before, and fell into
> the trap of providing less than useful info). I've tried to lay this out as
> best as possible....
> Just to repeat, my question is why is there a clustered index seek, when
> index6 isn't in the clustered index (and is not SARGable?). Any thoughts
> greatly appreciated.
>
> ----Execution Plan----
>
> |--Top(TOP EXPRESSION:((5000)))
> |--Clustered Index Seek(OBJECT:([_archive].[dbo].[Docs].[IX_DOCUMENTS]
> AS [D]), SEEK:([D].[ApplicationID]=(99)),
> WHERE:([dbtest].[dbo].[Docs].[INDEX6] as [D].[INDEX6] like '%W%') ORDERED
> FORWARD)
>
> ----Query
>
> SELECT TOP 5000 [DocumentID], [ImportDate] FROM Docs AS D
> WHERE APPLICATIONID=99
> AND D.INDEX6 LIKE '%W%'
>
> ----Index
>
> CREATE CLUSTERED INDEX [IX_DOCUMENTS] ON [dbo].[Docs]
> (
> [ApplicationID] ASC,
> [IsTest] ASC,
> [BatchID] ASC,
> [ImportDate] DESC
> )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB =
> OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS
> = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
> GO
>
> ----Table
>
> CREATE TABLE [dbo].[Docs](
> [ID] [int] IDENTITY(1,1) NOT FOR REPLICATION NOT NULL,
> [DocumentID] [uniqueidentifier] NOT NULL,
> [ApplicationID] [smallint] NOT NULL,
> [BatchID] [int] NOT NULL,
> [ImportDate] [datetime] NULL,
> [ValidFrom] [datetime] NOT NULL,
> [ValidTo] [datetime] NULL,
> [IsTest] [bit] NOT NULL,
> [DocumentName] [varchar](255) NOT NULL,
> [Extension] [varchar](50) NOT NULL,
> [IsCompressed] [bit] NOT NULL,
> [IsAttachmentOnly] [bit] NOT NULL,
> [Document] [image] NULL,
> [INDEX1] [varchar](255) NULL,
> [INDEX2] [varchar](255) NULL,
> [INDEX3] [varchar](255) NULL,
> [INDEX4] [varchar](255) NULL,
> [INDEX5] [varchar](255) NULL,
> [INDEX6] [varchar](255) NULL,
> [INDEX7] [varchar](255) NULL,
> [INDEX8] [varchar](255) NULL,
> [INDEX9] [varchar](255) NULL,
> [INDEX10] [varchar](255) NULL,
> [INDEX11] [varchar](255) NULL,
> [INDEX12] [varchar](255) NULL,
> [INDEX13] [varchar](255) NULL,
> [INDEX14] [varchar](255) NULL,
> [INDEX15] [varchar](255) NULL,
> [UsageCount] [int] NOT NULL,
> [PageJump] [int] NULL,
> [DocumentSize] [int] NULL,
> [IsApproved] [bit] NOT NULL,
> [CustomProperties] [xml] NOT NULL,
> [IsEnabled] [bit] NOT NULL,
> [msrepl_tran_version] [uniqueidentifier] NOT NULL,
> CONSTRAINT [PK_Docs] PRIMARY KEY NONCLUSTERED
> (
> [ID] ASC
> )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY =
> OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY],
> CONSTRAINT [DocumentID] UNIQUE NONCLUSTERED
> (
> [DocumentID] ASC
> )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY =
> OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
> ) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY]

From: Jay on 5 Dec 2009 10:24

I think you just hit the nail on the head Gert-Jan and are answering the
actual question. Let me elaborate.

In the original post Ron says: "As COL3 is not in the index", which implies
the OP thinks that the data pages for the table need to be read too - which
isn't the case for a clustered index.

Clustered indexes physically change the way data is stored, in that indexes
no longer point to data pages. The data is the leaf node of the index
itself.

Do a search in BOL for "Clustered Index Structures" and "Nonclustered Index
Structures", or use these links for 2008 BOL. There are diagrams that show
you what is happening very clearly.

ms-help://MS.SQLCC.v10/MS.SQLSVR.v10.en/s10de_0evalplan/html/26b28045-c3c2-465a-b564-bf2189e93fdc.htm
ms-help://MS.SQLCC.v10/MS.SQLSVR.v10.en/s10de_0evalplan/html/1efeba1f-f848-4861-9af3-594e5ab3b597.htm

Jay

"Gert-Jan Strik" <sorrytoomuchspamalready(a)xs4all.nl> wrote in message
news:4B1A50D6.9945F462(a)xs4all.nl...
> Ron,
>
> In case it wasn't already clear from Jay's response: the clustered index
> is seeked because the predicate ApplicationID=99 uses the leading column
> of this index, and because it is the clustered index, the index entries
> include the rest of the row data, so they include INDEX6.
>
> What might not be clear when you read the query plan, is that the index
> seek is only used to locate the first row (with ApplicationID=99). From
> that point on, it will scan through the index to validate your other
> predicate (INDEX6 LIKE '%W%'). It will do this until ApplicationID > 99
> or until 5000 matching rows are found.
>
> --
> Gert-Jan
> SQL Server MVP
>
>
> Ron wrote:
>>
>> Hi, yes, here is the full query, exec plan (SHOWPLAN_TEXT), table and
>> Index.
>> (I tried to reduce it down to an example for simplicity before, and fell
>> into
>> the trap of providing less than useful info). I've tried to lay this out
>> as
>> best as possible....
>> Just to repeat, my question is why is there a clustered index seek, when
>> index6 isn't in the clustered index (and is not SARGable?). Any thoughts
>> greatly appreciated.
>>
>> ----Execution Plan----
>>
>> |--Top(TOP EXPRESSION:((5000)))
>> |--Clustered Index
>> Seek(OBJECT:([_archive].[dbo].[Docs].[IX_DOCUMENTS]
>> AS [D]), SEEK:([D].[ApplicationID]=(99)),
>> WHERE:([dbtest].[dbo].[Docs].[INDEX6] as [D].[INDEX6] like '%W%') ORDERED
>> FORWARD)
>>
>> ----Query
>>
>> SELECT TOP 5000 [DocumentID], [ImportDate] FROM Docs AS D
>> WHERE APPLICATIONID=99
>> AND D.INDEX6 LIKE '%W%'
>>
>> ----Index
>>
>> CREATE CLUSTERED INDEX [IX_DOCUMENTS] ON [dbo].[Docs]
>> (
>> [ApplicationID] ASC,
>> [IsTest] ASC,
>> [BatchID] ASC,
>> [ImportDate] DESC
>> )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB =
>> OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF,
>> ALLOW_ROW_LOCKS
>> = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
>> GO
>>
>> ----Table
>>
>> CREATE TABLE [dbo].[Docs](
>> [ID] [int] IDENTITY(1,1) NOT FOR REPLICATION NOT NULL,
>> [DocumentID] [uniqueidentifier] NOT NULL,
>> [ApplicationID] [smallint] NOT NULL,
>> [BatchID] [int] NOT NULL,
>> [ImportDate] [datetime] NULL,
>> [ValidFrom] [datetime] NOT NULL,
>> [ValidTo] [datetime] NULL,
>> [IsTest] [bit] NOT NULL,
>> [DocumentName] [varchar](255) NOT NULL,
>> [Extension] [varchar](50) NOT NULL,
>> [IsCompressed] [bit] NOT NULL,
>> [IsAttachmentOnly] [bit] NOT NULL,
>> [Document] [image] NULL,
>> [INDEX1] [varchar](255) NULL,
>> [INDEX2] [varchar](255) NULL,
>> [INDEX3] [varchar](255) NULL,
>> [INDEX4] [varchar](255) NULL,
>> [INDEX5] [varchar](255) NULL,
>> [INDEX6] [varchar](255) NULL,
>> [INDEX7] [varchar](255) NULL,
>> [INDEX8] [varchar](255) NULL,
>> [INDEX9] [varchar](255) NULL,
>> [INDEX10] [varchar](255) NULL,
>> [INDEX11] [varchar](255) NULL,
>> [INDEX12] [varchar](255) NULL,
>> [INDEX13] [varchar](255) NULL,
>> [INDEX14] [varchar](255) NULL,
>> [INDEX15] [varchar](255) NULL,
>> [UsageCount] [int] NOT NULL,
>> [PageJump] [int] NULL,
>> [DocumentSize] [int] NULL,
>> [IsApproved] [bit] NOT NULL,
>> [CustomProperties] [xml] NOT NULL,
>> [IsEnabled] [bit] NOT NULL,
>> [msrepl_tran_version] [uniqueidentifier] NOT NULL,
>> CONSTRAINT [PK_Docs] PRIMARY KEY NONCLUSTERED
>> (
>> [ID] ASC
>> )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY =
>> OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY],
>> CONSTRAINT [DocumentID] UNIQUE NONCLUSTERED
>> (
>> [DocumentID] ASC
>> )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY =
>> OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
>> ) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY]

From: Ron on 5 Dec 2009 10:34

Thanks Gert and Jay for your answers. Yes I see what you both mean, and
understand. Your time appreciated.

I suppose that when I see a 'seek' here, it doesn't necessarily mean that
there isn't still a gain to be made somewhere. Ie the 'scan' you mention on
index6 after the 'seek' on application id might benefit from a nonclustered
index on index6?(specifically if the LIKE was SARGable ie (INDEX6 LIKE 'W%'))

Any thoughts?

"Gert-Jan Strik" wrote:

> Ron,
>
> In case it wasn't already clear from Jay's response: the clustered index
> is seeked because the predicate ApplicationID=99 uses the leading column
> of this index, and because it is the clustered index, the index entries
> include the rest of the row data, so they include INDEX6.
>
> What might not be clear when you read the query plan, is that the index
> seek is only used to locate the first row (with ApplicationID=99). From
> that point on, it will scan through the index to validate your other
> predicate (INDEX6 LIKE '%W%'). It will do this until ApplicationID > 99
> or until 5000 matching rows are found.
>
> --
> Gert-Jan
> SQL Server MVP
>
>
> Ron wrote:
> >
> > Hi, yes, here is the full query, exec plan (SHOWPLAN_TEXT), table and Index.
> > (I tried to reduce it down to an example for simplicity before, and fell into
> > the trap of providing less than useful info). I've tried to lay this out as
> > best as possible....
> > Just to repeat, my question is why is there a clustered index seek, when
> > index6 isn't in the clustered index (and is not SARGable?). Any thoughts
> > greatly appreciated.
> >
> > ----Execution Plan----
> >
> > |--Top(TOP EXPRESSION:((5000)))
> > |--Clustered Index Seek(OBJECT:([_archive].[dbo].[Docs].[IX_DOCUMENTS]
> > AS [D]), SEEK:([D].[ApplicationID]=(99)),
> > WHERE:([dbtest].[dbo].[Docs].[INDEX6] as [D].[INDEX6] like '%W%') ORDERED
> > FORWARD)
> >
> > ----Query
> >
> > SELECT TOP 5000 [DocumentID], [ImportDate] FROM Docs AS D
> > WHERE APPLICATIONID=99
> > AND D.INDEX6 LIKE '%W%'
> >
> > ----Index
> >
> > CREATE CLUSTERED INDEX [IX_DOCUMENTS] ON [dbo].[Docs]
> > (
> > [ApplicationID] ASC,
> > [IsTest] ASC,
> > [BatchID] ASC,
> > [ImportDate] DESC
> > )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB =
> > OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS
> > = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
> > GO
> >
> > ----Table
> >
> > CREATE TABLE [dbo].[Docs](
> > [ID] [int] IDENTITY(1,1) NOT FOR REPLICATION NOT NULL,
> > [DocumentID] [uniqueidentifier] NOT NULL,
> > [ApplicationID] [smallint] NOT NULL,
> > [BatchID] [int] NOT NULL,
> > [ImportDate] [datetime] NULL,
> > [ValidFrom] [datetime] NOT NULL,
> > [ValidTo] [datetime] NULL,
> > [IsTest] [bit] NOT NULL,
> > [DocumentName] [varchar](255) NOT NULL,
> > [Extension] [varchar](50) NOT NULL,
> > [IsCompressed] [bit] NOT NULL,
> > [IsAttachmentOnly] [bit] NOT NULL,
> > [Document] [image] NULL,
> > [INDEX1] [varchar](255) NULL,
> > [INDEX2] [varchar](255) NULL,
> > [INDEX3] [varchar](255) NULL,
> > [INDEX4] [varchar](255) NULL,
> > [INDEX5] [varchar](255) NULL,
> > [INDEX6] [varchar](255) NULL,
> > [INDEX7] [varchar](255) NULL,
> > [INDEX8] [varchar](255) NULL,
> > [INDEX9] [varchar](255) NULL,
> > [INDEX10] [varchar](255) NULL,
> > [INDEX11] [varchar](255) NULL,
> > [INDEX12] [varchar](255) NULL,
> > [INDEX13] [varchar](255) NULL,
> > [INDEX14] [varchar](255) NULL,
> > [INDEX15] [varchar](255) NULL,
> > [UsageCount] [int] NOT NULL,
> > [PageJump] [int] NULL,
> > [DocumentSize] [int] NULL,
> > [IsApproved] [bit] NOT NULL,
> > [CustomProperties] [xml] NOT NULL,
> > [IsEnabled] [bit] NOT NULL,
> > [msrepl_tran_version] [uniqueidentifier] NOT NULL,
> > CONSTRAINT [PK_Docs] PRIMARY KEY NONCLUSTERED
> > (
> > [ID] ASC
> > )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY =
> > OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY],
> > CONSTRAINT [DocumentID] UNIQUE NONCLUSTERED
> > (
> > [DocumentID] ASC
> > )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY =
> > OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
> > ) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY]
> .
>

From: Erland Sommarskog on 5 Dec 2009 10:49

Ron (Ron(a)discussions.microsoft.com) writes:
> I suppose that when I see a 'seek' here, it doesn't necessarily mean
> that there isn't still a gain to be made somewhere. Ie the 'scan' you
> mention on index6 after the 'seek' on application id might benefit from
> a nonclustered index on index6?(specifically if the LIKE was SARGable ie
> (INDEX6 LIKE 'W%'))

First of all, if you have an index on INDEX6 and do

SELECT ... FROM tbl WHERE INDEX6 LIKE '%W%'

you may actually see an Index Seek on this index. Of course, that is
no real index seek, the predicate includes a RangeSeek function, and in
practice it is a scan.

For the query:

SELECT TOP 5000 [DocumentID], [ImportDate] FROM Docs AS D
WHERE APPLICATIONID=99
AND D.INDEX6 LIKE 'W%'

If we have a clustered index on APPLICATION and add an NC-index on INDEX6,
will that help? Maybe. The optimizer will have a choice and look at
statistics. If it estimates that there are pitiful rows with ID = 99,
it will of course use the clustered index. On the other hand, if there are
few rows with INDEX6 starting with W, this is where the optimizer will
go.

If they are few of both, I would expect the optimizer to use the non-
clustered index, since it's faster to scan. (Recall that the indexes
in the clustered index is in the NC-index too.) But if there are
many of both, I would expect the clustered index to be used, since
then there is no over head to read the data rows.

--
Erland Sommarskog, SQL Server MVP, esquel(a)sommarskog.se

Links for SQL Server Books Online:
SQL 2008: http://msdn.microsoft.com/en-us/sqlserver/cc514207.aspx
SQL 2005: http://msdn.microsoft.com/en-us/sqlserver/bb895970.aspx
SQL 2000: http://www.microsoft.com/sql/prodinfo/previousversions/books.mspx

From: Gert-Jan Strik on 5 Dec 2009 12:31

Erland,

I am not sure if you made a typo, or have had different experiences, but
I would NEVER expect to see a clustered index SEEK for the query

> SELECT ... FROM tbl WHERE INDEX6 LIKE '%W%'

because the pattern starts with a '%'. As you know, this disqualifies an
index seek.

Ron,

If you are running SQL Server 2005 or later, and the index on INDEX6
supports String Summary Statistics (which you can verify with the DBCC
SHOW_STATISTICS if you look at the String Index value in the
STAT_HEADER), then an index on INDEX6 might be beneficial. But it will
only be beneficial if the number of qualifying rows is very low for
INDEX6 LIKE '%W%', the number of qualifying rows for ApplicationID=99 is
very high and the rows are very wide (many bytes of storage).

If the optimizer then decides to use the index on INDEX6, then it will
be scanned (full scan). So there is a very low probability that it will
help performance. But of course there is only one way to find out: test
it!

--
Gert-Jan
SQL Server MVP

Erland Sommarskog wrote:
>
> Ron (Ron(a)discussions.microsoft.com) writes:
> > I suppose that when I see a 'seek' here, it doesn't necessarily mean
> > that there isn't still a gain to be made somewhere. Ie the 'scan' you
> > mention on index6 after the 'seek' on application id might benefit from
> > a nonclustered index on index6?(specifically if the LIKE was SARGable ie
> > (INDEX6 LIKE 'W%'))
>
> First of all, if you have an index on INDEX6 and do
>
> SELECT ... FROM tbl WHERE INDEX6 LIKE '%W%'
>
> you may actually see an Index Seek on this index. Of course, that is
> no real index seek, the predicate includes a RangeSeek function, and in
> practice it is a scan.
>
> For the query:
>
> SELECT TOP 5000 [DocumentID], [ImportDate] FROM Docs AS D
> WHERE APPLICATIONID=99
> AND D.INDEX6 LIKE 'W%'
>
> If we have a clustered index on APPLICATION and add an NC-index on INDEX6,
> will that help? Maybe. The optimizer will have a choice and look at
> statistics. If it estimates that there are pitiful rows with ID = 99,
> it will of course use the clustered index. On the other hand, if there are
> few rows with INDEX6 starting with W, this is where the optimizer will
> go.
>
> If they are few of both, I would expect the optimizer to use the non-
> clustered index, since it's faster to scan. (Recall that the indexes
> in the clustered index is in the NC-index too.) But if there are
> many of both, I would expect the clustered index to be used, since
> then there is no over head to read the data rows.
>
> --
> Erland Sommarskog, SQL Server MVP, esquel(a)sommarskog.se
>
> Links for SQL Server Books Online:
> SQL 2008: http://msdn.microsoft.com/en-us/sqlserver/cc514207.aspx
> SQL 2005: http://msdn.microsoft.com/en-us/sqlserver/bb895970.aspx
> SQL 2000: http://www.microsoft.com/sql/prodinfo/previousversions/books.mspx

First | Prev | Next | Last
Pages: 1 2 3 4 5
Prev: select statement ,sum with calculation
Next: Return a specific string based on positions of Y/N data