From: Rich on
I need to select the top 3 rows per each group (xID -- order by xID). No
problem here. But I need to select the count of rows in each group before a
null value is encountered in the stat column -- ordering by xID

CREATE TABLE #tmp1(rowID int Identity(1,1), xID int, x1 int, stat varchar(1))

INSERT INTO #tmp1
SELECT 1, 1, 'f'
UNION ALL SELECT 1, 2, 'f'
UNION ALL SELECT 1, 3, 'f'
UNION ALL SELECT 1, 4, 'f'
UNION ALL SELECT 2, 1, 'f'
UNION ALL SELECT 2, 2, null
UNION ALL SELECT 2, 3, 'f'
UNION ALL SELECT 2, 4, 'f'
UNION ALL SELECT 3, 1, 'f'
UNION ALL SELECT 3, 2, 'f'
UNION ALL SELECT 3, 3, null

Here is what I have for getting the top 3 rows per group:

SELECT * FROM #tmp1 t1 WHERE EXISTS
(SELECT * FROM (SELECT TOP 3 * FROM #tmp1 t2 WHERE t2.xId = t1.xId ORDER BY
t2.xid) t3 WHERE t3.rowID = t1.rowID). This returns:


rowID xID x1 stat
1 1 1 f
2 1 2 f
3 1 3 f
5 2 1 f
6 2 2 NULL
7 2 3 f
9 3 1 f
10 3 2 f
11 3 3 NULL

From here I need to get this output

xID rowcount stat
1 3 f
2 1 f
3 2 f

xID = 1 had 3 rows (no null values encountered)
xID = 2 had 1 row before encountering a null
xID = 3 had 2 rows before encountering a null

Is it possible to get this output from the source data without looping?

Thanks,
Rich
From: Roy Harvey (SQL Server MVP) on
SELECT A.xID, MAX(A.rownum), A.stat
FROM (SELECT *,
row_number() over (PARTITION BY xID
ORDER BY xID, rowID) as rownum
FROM #tmp1) as A
JOIN (SELECT *,
row_number() over (PARTITION BY xID
ORDER BY xID, rowID) as rownum
FROM #tmp1
WHERE stat IS NOT NULL) as B
ON A.rowID = B.rowID
AND A.rownum = B.rownum
WHERE A.rownum <= 3
GROUP BY A.xid, A.stat

xID stat
----------- -------------------- ----
1 3 f
2 1 f
3 2 f

The idea is that we number the rows once including the NULLs, and once
without the NULLs, and only take the ones where the numbers match.

Roy Harvey
Beacon Falls, CT


On Tue, 22 Jul 2008 08:38:01 -0700, Rich
<Rich(a)discussions.microsoft.com> wrote:

>I need to select the top 3 rows per each group (xID -- order by xID). No
>problem here. But I need to select the count of rows in each group before a
>null value is encountered in the stat column -- ordering by xID
>
>CREATE TABLE #tmp1(rowID int Identity(1,1), xID int, x1 int, stat varchar(1))
>
>INSERT INTO #tmp1
>SELECT 1, 1, 'f'
>UNION ALL SELECT 1, 2, 'f'
>UNION ALL SELECT 1, 3, 'f'
>UNION ALL SELECT 1, 4, 'f'
>UNION ALL SELECT 2, 1, 'f'
>UNION ALL SELECT 2, 2, null
>UNION ALL SELECT 2, 3, 'f'
>UNION ALL SELECT 2, 4, 'f'
>UNION ALL SELECT 3, 1, 'f'
>UNION ALL SELECT 3, 2, 'f'
>UNION ALL SELECT 3, 3, null
>
>Here is what I have for getting the top 3 rows per group:
>
>SELECT * FROM #tmp1 t1 WHERE EXISTS
>(SELECT * FROM (SELECT TOP 3 * FROM #tmp1 t2 WHERE t2.xId = t1.xId ORDER BY
>t2.xid) t3 WHERE t3.rowID = t1.rowID). This returns:
>
>
>rowID xID x1 stat
>1 1 1 f
>2 1 2 f
>3 1 3 f
>5 2 1 f
>6 2 2 NULL
>7 2 3 f
>9 3 1 f
>10 3 2 f
>11 3 3 NULL
>
>From here I need to get this output
>
>xID rowcount stat
>1 3 f
>2 1 f
>3 2 f
>
>xID = 1 had 3 rows (no null values encountered)
>xID = 2 had 1 row before encountering a null
>xID = 3 had 2 rows before encountering a null
>
>Is it possible to get this output from the source data without looping?
>
>Thanks,
>Rich
From: Rich on
Thank you for your reply. I forgot to mention that we are still on sql 2000
over at my place (man, we need to upgrade!). I was just reading an article
about the Row_Number function for sql 2005. I will try to translate that
back to sql 2000 - which I understand will require the use of a #tmp table
(or 2). In the meantime, can I translate the query below in one continuous
statement? Or will I have to actually create a separate #tmp table first to
make it useable for sql 2000?


"Roy Harvey (SQL Server MVP)" wrote:

> SELECT A.xID, MAX(A.rownum), A.stat
> FROM (SELECT *,
> row_number() over (PARTITION BY xID
> ORDER BY xID, rowID) as rownum
> FROM #tmp1) as A
> JOIN (SELECT *,
> row_number() over (PARTITION BY xID
> ORDER BY xID, rowID) as rownum
> FROM #tmp1
> WHERE stat IS NOT NULL) as B
> ON A.rowID = B.rowID
> AND A.rownum = B.rownum
> WHERE A.rownum <= 3
> GROUP BY A.xid, A.stat
>
> xID stat
> ----------- -------------------- ----
> 1 3 f
> 2 1 f
> 3 2 f
>
> The idea is that we number the rows once including the NULLs, and once
> without the NULLs, and only take the ones where the numbers match.
>
> Roy Harvey
> Beacon Falls, CT
>
>
> On Tue, 22 Jul 2008 08:38:01 -0700, Rich
> <Rich(a)discussions.microsoft.com> wrote:
>
> >I need to select the top 3 rows per each group (xID -- order by xID). No
> >problem here. But I need to select the count of rows in each group before a
> >null value is encountered in the stat column -- ordering by xID
> >
> >CREATE TABLE #tmp1(rowID int Identity(1,1), xID int, x1 int, stat varchar(1))
> >
> >INSERT INTO #tmp1
> >SELECT 1, 1, 'f'
> >UNION ALL SELECT 1, 2, 'f'
> >UNION ALL SELECT 1, 3, 'f'
> >UNION ALL SELECT 1, 4, 'f'
> >UNION ALL SELECT 2, 1, 'f'
> >UNION ALL SELECT 2, 2, null
> >UNION ALL SELECT 2, 3, 'f'
> >UNION ALL SELECT 2, 4, 'f'
> >UNION ALL SELECT 3, 1, 'f'
> >UNION ALL SELECT 3, 2, 'f'
> >UNION ALL SELECT 3, 3, null
> >
> >Here is what I have for getting the top 3 rows per group:
> >
> >SELECT * FROM #tmp1 t1 WHERE EXISTS
> >(SELECT * FROM (SELECT TOP 3 * FROM #tmp1 t2 WHERE t2.xId = t1.xId ORDER BY
> >t2.xid) t3 WHERE t3.rowID = t1.rowID). This returns:
> >
> >
> >rowID xID x1 stat
> >1 1 1 f
> >2 1 2 f
> >3 1 3 f
> >5 2 1 f
> >6 2 2 NULL
> >7 2 3 f
> >9 3 1 f
> >10 3 2 f
> >11 3 3 NULL
> >
> >From here I need to get this output
> >
> >xID rowcount stat
> >1 3 f
> >2 1 f
> >3 2 f
> >
> >xID = 1 had 3 rows (no null values encountered)
> >xID = 2 had 1 row before encountering a null
> >xID = 3 had 2 rows before encountering a null
> >
> >Is it possible to get this output from the source data without looping?
> >
> >Thanks,
> >Rich
>
From: Tom Cooper on
One way to do it in SQL 2000,

SELECT t1.xId, Count(*) As 'rowcount', t1.stat FROM #tmp1 t1 WHERE EXISTS
(SELECT * FROM (SELECT TOP 3 * FROM #tmp1 t2
WHERE t2.xId = t1.xId ORDER BY
t2.xid) t3 WHERE t3.rowID = t1.rowID
AND NOT EXISTS (SELECT * FROM #tmp1 t4 WHERE t3.xID = t4.xID
And t3.RowId >= t4.RowId And t4.stat Is Null))
Group By t1.xId, t1.stat
Order By t1.xId, t1.stat

Note that if you have a large amount of data, a cursor might execute faster.

Tom

"Rich" <Rich(a)discussions.microsoft.com> wrote in message
news:EA150A22-7E50-4993-B212-DE4B7F677D1E(a)microsoft.com...
> Thank you for your reply. I forgot to mention that we are still on sql
> 2000
> over at my place (man, we need to upgrade!). I was just reading an
> article
> about the Row_Number function for sql 2005. I will try to translate that
> back to sql 2000 - which I understand will require the use of a #tmp table
> (or 2). In the meantime, can I translate the query below in one
> continuous
> statement? Or will I have to actually create a separate #tmp table first
> to
> make it useable for sql 2000?
>
>
> "Roy Harvey (SQL Server MVP)" wrote:
>
>> SELECT A.xID, MAX(A.rownum), A.stat
>> FROM (SELECT *,
>> row_number() over (PARTITION BY xID
>> ORDER BY xID, rowID) as rownum
>> FROM #tmp1) as A
>> JOIN (SELECT *,
>> row_number() over (PARTITION BY xID
>> ORDER BY xID, rowID) as rownum
>> FROM #tmp1
>> WHERE stat IS NOT NULL) as B
>> ON A.rowID = B.rowID
>> AND A.rownum = B.rownum
>> WHERE A.rownum <= 3
>> GROUP BY A.xid, A.stat
>>
>> xID stat
>> ----------- -------------------- ----
>> 1 3 f
>> 2 1 f
>> 3 2 f
>>
>> The idea is that we number the rows once including the NULLs, and once
>> without the NULLs, and only take the ones where the numbers match.
>>
>> Roy Harvey
>> Beacon Falls, CT
>>
>>
>> On Tue, 22 Jul 2008 08:38:01 -0700, Rich
>> <Rich(a)discussions.microsoft.com> wrote:
>>
>> >I need to select the top 3 rows per each group (xID -- order by xID).
>> >No
>> >problem here. But I need to select the count of rows in each group
>> >before a
>> >null value is encountered in the stat column -- ordering by xID
>> >
>> >CREATE TABLE #tmp1(rowID int Identity(1,1), xID int, x1 int, stat
>> >varchar(1))
>> >
>> >INSERT INTO #tmp1
>> >SELECT 1, 1, 'f'
>> >UNION ALL SELECT 1, 2, 'f'
>> >UNION ALL SELECT 1, 3, 'f'
>> >UNION ALL SELECT 1, 4, 'f'
>> >UNION ALL SELECT 2, 1, 'f'
>> >UNION ALL SELECT 2, 2, null
>> >UNION ALL SELECT 2, 3, 'f'
>> >UNION ALL SELECT 2, 4, 'f'
>> >UNION ALL SELECT 3, 1, 'f'
>> >UNION ALL SELECT 3, 2, 'f'
>> >UNION ALL SELECT 3, 3, null
>> >
>> >Here is what I have for getting the top 3 rows per group:
>> >
>> >SELECT * FROM #tmp1 t1 WHERE EXISTS
>> >(SELECT * FROM (SELECT TOP 3 * FROM #tmp1 t2 WHERE t2.xId = t1.xId ORDER
>> >BY
>> >t2.xid) t3 WHERE t3.rowID = t1.rowID). This returns:
>> >
>> >
>> >rowID xID x1 stat
>> >1 1 1 f
>> >2 1 2 f
>> >3 1 3 f
>> >5 2 1 f
>> >6 2 2 NULL
>> >7 2 3 f
>> >9 3 1 f
>> >10 3 2 f
>> >11 3 3 NULL
>> >
>> >From here I need to get this output
>> >
>> >xID rowcount stat
>> >1 3 f
>> >2 1 f
>> >3 2 f
>> >
>> >xID = 1 had 3 rows (no null values encountered)
>> >xID = 2 had 1 row before encountering a null
>> >xID = 3 had 2 rows before encountering a null
>> >
>> >Is it possible to get this output from the source data without looping?
>> >
>> >Thanks,
>> >Rich
>>


From: Rich on
Thank you for your reply. Question: I am sure a cursor would be faster for
large data than to query large data using Exists - But would the query using
the Row_Number function (sql 2005 query) be more efficient than the cursor
(and using Exists) ? Or for this case (using large data) would the cursor be
the most efficient solution?



"Tom Cooper" wrote:

> One way to do it in SQL 2000,
>
> SELECT t1.xId, Count(*) As 'rowcount', t1.stat FROM #tmp1 t1 WHERE EXISTS
> (SELECT * FROM (SELECT TOP 3 * FROM #tmp1 t2
> WHERE t2.xId = t1.xId ORDER BY
> t2.xid) t3 WHERE t3.rowID = t1.rowID
> AND NOT EXISTS (SELECT * FROM #tmp1 t4 WHERE t3.xID = t4.xID
> And t3.RowId >= t4.RowId And t4.stat Is Null))
> Group By t1.xId, t1.stat
> Order By t1.xId, t1.stat
>
> Note that if you have a large amount of data, a cursor might execute faster.
>
> Tom
>
> "Rich" <Rich(a)discussions.microsoft.com> wrote in message
> news:EA150A22-7E50-4993-B212-DE4B7F677D1E(a)microsoft.com...
> > Thank you for your reply. I forgot to mention that we are still on sql
> > 2000
> > over at my place (man, we need to upgrade!). I was just reading an
> > article
> > about the Row_Number function for sql 2005. I will try to translate that
> > back to sql 2000 - which I understand will require the use of a #tmp table
> > (or 2). In the meantime, can I translate the query below in one
> > continuous
> > statement? Or will I have to actually create a separate #tmp table first
> > to
> > make it useable for sql 2000?
> >
> >
> > "Roy Harvey (SQL Server MVP)" wrote:
> >
> >> SELECT A.xID, MAX(A.rownum), A.stat
> >> FROM (SELECT *,
> >> row_number() over (PARTITION BY xID
> >> ORDER BY xID, rowID) as rownum
> >> FROM #tmp1) as A
> >> JOIN (SELECT *,
> >> row_number() over (PARTITION BY xID
> >> ORDER BY xID, rowID) as rownum
> >> FROM #tmp1
> >> WHERE stat IS NOT NULL) as B
> >> ON A.rowID = B.rowID
> >> AND A.rownum = B.rownum
> >> WHERE A.rownum <= 3
> >> GROUP BY A.xid, A.stat
> >>
> >> xID stat
> >> ----------- -------------------- ----
> >> 1 3 f
> >> 2 1 f
> >> 3 2 f
> >>
> >> The idea is that we number the rows once including the NULLs, and once
> >> without the NULLs, and only take the ones where the numbers match.
> >>
> >> Roy Harvey
> >> Beacon Falls, CT
> >>
> >>
> >> On Tue, 22 Jul 2008 08:38:01 -0700, Rich
> >> <Rich(a)discussions.microsoft.com> wrote:
> >>
> >> >I need to select the top 3 rows per each group (xID -- order by xID).
> >> >No
> >> >problem here. But I need to select the count of rows in each group
> >> >before a
> >> >null value is encountered in the stat column -- ordering by xID
> >> >
> >> >CREATE TABLE #tmp1(rowID int Identity(1,1), xID int, x1 int, stat
> >> >varchar(1))
> >> >
> >> >INSERT INTO #tmp1
> >> >SELECT 1, 1, 'f'
> >> >UNION ALL SELECT 1, 2, 'f'
> >> >UNION ALL SELECT 1, 3, 'f'
> >> >UNION ALL SELECT 1, 4, 'f'
> >> >UNION ALL SELECT 2, 1, 'f'
> >> >UNION ALL SELECT 2, 2, null
> >> >UNION ALL SELECT 2, 3, 'f'
> >> >UNION ALL SELECT 2, 4, 'f'
> >> >UNION ALL SELECT 3, 1, 'f'
> >> >UNION ALL SELECT 3, 2, 'f'
> >> >UNION ALL SELECT 3, 3, null
> >> >
> >> >Here is what I have for getting the top 3 rows per group:
> >> >
> >> >SELECT * FROM #tmp1 t1 WHERE EXISTS
> >> >(SELECT * FROM (SELECT TOP 3 * FROM #tmp1 t2 WHERE t2.xId = t1.xId ORDER
> >> >BY
> >> >t2.xid) t3 WHERE t3.rowID = t1.rowID). This returns:
> >> >
> >> >
> >> >rowID xID x1 stat
> >> >1 1 1 f
> >> >2 1 2 f
> >> >3 1 3 f
> >> >5 2 1 f
> >> >6 2 2 NULL
> >> >7 2 3 f
> >> >9 3 1 f
> >> >10 3 2 f
> >> >11 3 3 NULL
> >> >
> >> >From here I need to get this output
> >> >
> >> >xID rowcount stat
> >> >1 3 f
> >> >2 1 f
> >> >3 2 f
> >> >
> >> >xID = 1 had 3 rows (no null values encountered)
> >> >xID = 2 had 1 row before encountering a null
> >> >xID = 3 had 2 rows before encountering a null
> >> >
> >> >Is it possible to get this output from the source data without looping?
> >> >
> >> >Thanks,
> >> >Rich
> >>
>
>
>