From: tshad on
I have a query that I used that was based on the following query that Hugo
gave me yesterday (I added CheckNumber in for the problem).

I had a problem with finding duplicate check numbers in the table
(voids,re-issued checks etc). It does show in my results, but out of
4300+checks it is hard to find all the duplicates for auditing purposes
while setting up my script.

How do I go about changing my query to show ONLY duplicates (or vice versa)?

Here is the data:

drop table EmployeePay
drop table PayDetail
CREATE TABLE [dbo].[EmployeePay] (
[EmployeePayID] [int] IDENTITY (1, 1),
[ClientID] [int] Not NULL ,
[CheckNumber] [int] Not Null,
[PayDate] [smalldatetime] Not NULL,
PRIMARY KEY (EmployeePayID)
) ON [PRIMARY]
CREATE TABLE [dbo].[PayDetail] (
[PayDetailID] [int] IDENTITY (1, 1),
[ClientID] [int] Not NULL ,
[PayDate] [smalldatetime] Not NULL ,
[Code] [varchar] (5) Not NULL ,
[Hours] [int] Not NULL ,
[Amount] [money] Not NULL,
PRIMARY KEY (PayDetailID)
) ON [PRIMARY]
insert EmployeePay values (1,1234,'01-01-05')
insert EmployeePay values (1,2255,'02-03-05')
insert EmployeePay values (2,1331,'01-01-05')
insert EmployeePay values (2,3011,'03-06-05')
insert EmployeePay values (2,1331,'05-06-05')
insert PayDetail values (1,'01-01-05','5',20,200.50)
insert PayDetail values (1,'01-01-05','6',25,10.50)
insert PayDetail values (1,'01-01-05','13',50,120.25)
insert PayDetail values (1,'02-03-05','5',5,110)
insert PayDetail values (1,'02-03-05','9',18,250.50)
insert PayDetail values (2,'01-01-05','5',50,120.25)
insert PayDetail values (2,'01-01-05','44',10,320.32)
insert PayDetail values (2,'01-01-05','32',50,120.25)
insert PayDetail values (2,'03-06-05','46',18,235.75)
insert PayDetail values (2,'05-06-05','5',-50,-120.25)
insert PayDetail values (2,'05-06-05','44',-10,-320.32)
insert PayDetail values (2,'05-06-05','32',-50,-120.25)


and the script I am using:

CREATE VIEW PayDetailWithRank
AS
SELECT pd1.ClientID, pd1.PayDate, pd1.Code, pd1.Hours, pd1.Amount,
(SELECT COUNT(*)
FROM PayDetail AS pd2
WHERE pd2.ClientID = pd1.ClientID
AND pd2.PayDate = pd1.PayDate
AND pd2.Code < pd1.Code) AS Rank
FROM PayDetail AS pd1
go
SELECT e.ClientID, e.PayDate, e.CheckNumber,
d1.Code, d1.Hours, d1.Amount,
d2.Code, d2.Hours, d2.Amount,
d3.Code, d3.Hours, d3.Amount
FROM EmployeePay AS e
INNER JOIN PayDetailWithRank AS d1
ON d1.ClientID = e.ClientID
AND d1.PayDate = e.PayDate
AND d1.Rank % 3 = 0
LEFT JOIN PayDetailWithRank AS d2
ON d2.ClientID = e.ClientID
AND d2.PayDate = e.PayDate
AND d2.Rank / 3 = d1.Rank / 3
AND d2.Rank % 3 = 1
LEFT JOIN PayDetailWithRank AS d3
ON d3.ClientID = e.ClientID
AND d3.PayDate = e.PayDate
AND d3.Rank / 3 = d1.Rank / 3
AND d3.Rank % 3 = 2
ORDER BY e.ClientID, e.PayDate, d1.Rank
go
DROP VIEW PayDetailWithRank
go


This will give you 1 duplicate check. Can I change this to show only
duplicate, somehow?

Thanks,

Tom


From: tshad on
Never mind.

I figure out how. I have to modify the statement slightly

I need to change the Order by to Group By and add a "Having Count(*) > 1"
clause.

Tom

"tshad" <tscheiderich(a)ftsolutions.com> wrote in message
news:%23tVBdqhuFHA.1472(a)TK2MSFTNGP15.phx.gbl...
>I have a query that I used that was based on the following query that Hugo
>gave me yesterday (I added CheckNumber in for the problem).
>
> I had a problem with finding duplicate check numbers in the table
> (voids,re-issued checks etc). It does show in my results, but out of
> 4300+checks it is hard to find all the duplicates for auditing purposes
> while setting up my script.
>
> How do I go about changing my query to show ONLY duplicates (or vice
> versa)?
>
> Here is the data:
>
> drop table EmployeePay
> drop table PayDetail
> CREATE TABLE [dbo].[EmployeePay] (
> [EmployeePayID] [int] IDENTITY (1, 1),
> [ClientID] [int] Not NULL ,
> [CheckNumber] [int] Not Null,
> [PayDate] [smalldatetime] Not NULL,
> PRIMARY KEY (EmployeePayID)
> ) ON [PRIMARY]
> CREATE TABLE [dbo].[PayDetail] (
> [PayDetailID] [int] IDENTITY (1, 1),
> [ClientID] [int] Not NULL ,
> [PayDate] [smalldatetime] Not NULL ,
> [Code] [varchar] (5) Not NULL ,
> [Hours] [int] Not NULL ,
> [Amount] [money] Not NULL,
> PRIMARY KEY (PayDetailID)
> ) ON [PRIMARY]
> insert EmployeePay values (1,1234,'01-01-05')
> insert EmployeePay values (1,2255,'02-03-05')
> insert EmployeePay values (2,1331,'01-01-05')
> insert EmployeePay values (2,3011,'03-06-05')
> insert EmployeePay values (2,1331,'05-06-05')
> insert PayDetail values (1,'01-01-05','5',20,200.50)
> insert PayDetail values (1,'01-01-05','6',25,10.50)
> insert PayDetail values (1,'01-01-05','13',50,120.25)
> insert PayDetail values (1,'02-03-05','5',5,110)
> insert PayDetail values (1,'02-03-05','9',18,250.50)
> insert PayDetail values (2,'01-01-05','5',50,120.25)
> insert PayDetail values (2,'01-01-05','44',10,320.32)
> insert PayDetail values (2,'01-01-05','32',50,120.25)
> insert PayDetail values (2,'03-06-05','46',18,235.75)
> insert PayDetail values (2,'05-06-05','5',-50,-120.25)
> insert PayDetail values (2,'05-06-05','44',-10,-320.32)
> insert PayDetail values (2,'05-06-05','32',-50,-120.25)
>
>
> and the script I am using:
>
> CREATE VIEW PayDetailWithRank
> AS
> SELECT pd1.ClientID, pd1.PayDate, pd1.Code, pd1.Hours, pd1.Amount,
> (SELECT COUNT(*)
> FROM PayDetail AS pd2
> WHERE pd2.ClientID = pd1.ClientID
> AND pd2.PayDate = pd1.PayDate
> AND pd2.Code < pd1.Code) AS Rank
> FROM PayDetail AS pd1
> go
> SELECT e.ClientID, e.PayDate, e.CheckNumber,
> d1.Code, d1.Hours, d1.Amount,
> d2.Code, d2.Hours, d2.Amount,
> d3.Code, d3.Hours, d3.Amount
> FROM EmployeePay AS e
> INNER JOIN PayDetailWithRank AS d1
> ON d1.ClientID = e.ClientID
> AND d1.PayDate = e.PayDate
> AND d1.Rank % 3 = 0
> LEFT JOIN PayDetailWithRank AS d2
> ON d2.ClientID = e.ClientID
> AND d2.PayDate = e.PayDate
> AND d2.Rank / 3 = d1.Rank / 3
> AND d2.Rank % 3 = 1
> LEFT JOIN PayDetailWithRank AS d3
> ON d3.ClientID = e.ClientID
> AND d3.PayDate = e.PayDate
> AND d3.Rank / 3 = d1.Rank / 3
> AND d3.Rank % 3 = 2
> ORDER BY e.ClientID, e.PayDate, d1.Rank
> go
> DROP VIEW PayDetailWithRank
> go
>
>
> This will give you 1 duplicate check. Can I change this to show only
> duplicate, somehow?
>
> Thanks,
>
> Tom
>


From: John Bell on
Hi

Changing the first left outer join to be an inner join will ensure at least
one duplicate.

SELECT e.ClientID, e.PayDate, e.CheckNumber,
d1.Code, d1.Hours, d1.Amount, d1.rank,
d2.Code, d2.Hours, d2.Amount, d2.rank,
d3.Code, d3.Hours, d3.Amount, d3.rank
FROM EmployeePay AS e
JOIN PayDetailWithRank AS d1
ON d1.ClientID = e.ClientID
AND d1.PayDate = e.PayDate
AND d1.Rank % 3 = 0
JOIN PayDetailWithRank AS d2
ON d2.ClientID = e.ClientID
AND d2.PayDate = e.PayDate
AND d2.Rank / 3 = d1.Rank / 3
AND d2.Rank % 3 = 1
LEFT JOIN PayDetailWithRank AS d3
ON d3.ClientID = e.ClientID
AND d3.PayDate = e.PayDate
AND d3.Rank / 3 = d1.Rank / 3
AND d3.Rank % 3 = 2
ORDER BY e.ClientID, e.PayDate, d1.Rank

What are you expecting if there are 4 duplicates?

John

"tshad" <tscheiderich(a)ftsolutions.com> wrote in message
news:%23tVBdqhuFHA.1472(a)TK2MSFTNGP15.phx.gbl...
>I have a query that I used that was based on the following query that Hugo
>gave me yesterday (I added CheckNumber in for the problem).
>
> I had a problem with finding duplicate check numbers in the table
> (voids,re-issued checks etc). It does show in my results, but out of
> 4300+checks it is hard to find all the duplicates for auditing purposes
> while setting up my script.
>
> How do I go about changing my query to show ONLY duplicates (or vice
> versa)?
>
> Here is the data:
>
> drop table EmployeePay
> drop table PayDetail
> CREATE TABLE [dbo].[EmployeePay] (
> [EmployeePayID] [int] IDENTITY (1, 1),
> [ClientID] [int] Not NULL ,
> [CheckNumber] [int] Not Null,
> [PayDate] [smalldatetime] Not NULL,
> PRIMARY KEY (EmployeePayID)
> ) ON [PRIMARY]
> CREATE TABLE [dbo].[PayDetail] (
> [PayDetailID] [int] IDENTITY (1, 1),
> [ClientID] [int] Not NULL ,
> [PayDate] [smalldatetime] Not NULL ,
> [Code] [varchar] (5) Not NULL ,
> [Hours] [int] Not NULL ,
> [Amount] [money] Not NULL,
> PRIMARY KEY (PayDetailID)
> ) ON [PRIMARY]
> insert EmployeePay values (1,1234,'01-01-05')
> insert EmployeePay values (1,2255,'02-03-05')
> insert EmployeePay values (2,1331,'01-01-05')
> insert EmployeePay values (2,3011,'03-06-05')
> insert EmployeePay values (2,1331,'05-06-05')
> insert PayDetail values (1,'01-01-05','5',20,200.50)
> insert PayDetail values (1,'01-01-05','6',25,10.50)
> insert PayDetail values (1,'01-01-05','13',50,120.25)
> insert PayDetail values (1,'02-03-05','5',5,110)
> insert PayDetail values (1,'02-03-05','9',18,250.50)
> insert PayDetail values (2,'01-01-05','5',50,120.25)
> insert PayDetail values (2,'01-01-05','44',10,320.32)
> insert PayDetail values (2,'01-01-05','32',50,120.25)
> insert PayDetail values (2,'03-06-05','46',18,235.75)
> insert PayDetail values (2,'05-06-05','5',-50,-120.25)
> insert PayDetail values (2,'05-06-05','44',-10,-320.32)
> insert PayDetail values (2,'05-06-05','32',-50,-120.25)
>
>
> and the script I am using:
>
> CREATE VIEW PayDetailWithRank
> AS
> SELECT pd1.ClientID, pd1.PayDate, pd1.Code, pd1.Hours, pd1.Amount,
> (SELECT COUNT(*)
> FROM PayDetail AS pd2
> WHERE pd2.ClientID = pd1.ClientID
> AND pd2.PayDate = pd1.PayDate
> AND pd2.Code < pd1.Code) AS Rank
> FROM PayDetail AS pd1
> go
> SELECT e.ClientID, e.PayDate, e.CheckNumber,
> d1.Code, d1.Hours, d1.Amount,
> d2.Code, d2.Hours, d2.Amount,
> d3.Code, d3.Hours, d3.Amount
> FROM EmployeePay AS e
> INNER JOIN PayDetailWithRank AS d1
> ON d1.ClientID = e.ClientID
> AND d1.PayDate = e.PayDate
> AND d1.Rank % 3 = 0
> LEFT JOIN PayDetailWithRank AS d2
> ON d2.ClientID = e.ClientID
> AND d2.PayDate = e.PayDate
> AND d2.Rank / 3 = d1.Rank / 3
> AND d2.Rank % 3 = 1
> LEFT JOIN PayDetailWithRank AS d3
> ON d3.ClientID = e.ClientID
> AND d3.PayDate = e.PayDate
> AND d3.Rank / 3 = d1.Rank / 3
> AND d3.Rank % 3 = 2
> ORDER BY e.ClientID, e.PayDate, d1.Rank
> go
> DROP VIEW PayDetailWithRank
> go
>
>
> This will give you 1 duplicate check. Can I change this to show only
> duplicate, somehow?
>
> Thanks,
>
> Tom
>


From: --CELKO-- on
>> had a problem with finding duplicate check numbers in the table (voids,re-issued checks etc). <<

I have a better question for you: why are you creating duplicate check
numbers? Do you think that perhaps not having a proper relational key
could be part of the problem?

If you will learn to write proper DDL, you will save a lot of complex
DML.

Also, look up the ISO-8601 format for dates and times.

From: tshad on
"--CELKO--" <jcelko212(a)earthlink.net> wrote in message
news:1126814005.385526.122020(a)z14g2000cwz.googlegroups.com...
>>> had a problem with finding duplicate check numbers in the table
>>> (voids,re-issued checks etc). <<
>
> I have a better question for you: why are you creating duplicate check
> numbers? Do you think that perhaps not having a proper relational key
> could be part of the problem?

As I had mentioned, this is not my database. I built a table to mimic that
tables I am extracting information from to create a CSV file that I can use
to import into our system. I am not creating Duplicate Check numbers - that
is what is in their system and I have to deal with it. I have over 5000
checks and as it turned out 5 duplicate check numbers. I used my group
by/having solutions to find them. As I mentioned, they were voids (hence
the duplicate check number) as well as a check that was re-issued (we don't
know why yet).

Are you suggesting I should spend my time creating a PROPER database schema,
even if it doesn't mirror what I am trying to accomplish?

And BTW, you have pounded incessantly about the evils of the IDENTITY.

I for one am grateful they used it, as you would have seen in my other post.
Hugos solution with the Views worked great, but hadn't taken into account
multiple codes on one check (which there would have been no way for him to
know this). I found that their tables used identities and that was the only
field I could replace the Code field with to make it work. It had to be
sequential and unique - Gaps would not have been an issue.

Tom
>
> If you will learn to write proper DDL, you will save a lot of complex
> DML.
>
> Also, look up the ISO-8601 format for dates and times.
>


 |  Next  |  Last
Pages: 1 2
Prev: Convert varchar to datetime
Next: Remote Query