From: bcap on
Hi,

I have a table that includes the following fields:
- ID
- MatchCode
- FirstName
- LastName
- City
- Zip

Data has come from multiple sources so there may be duplicate
matchcode's but the other fields may slightly vary.

I would like to get a list of all distinct matchcodes and if there are
multiple matchcodes that are the same, I'd like to take the top on
only.

Is this possible? Any advice would greatly be appreciated!

Thank you for any help in advance!



From: Uri Dimant on
bcap
What datatype is MatchCode column? VARCHAR,NVARCHAR.INT.....

It seems that you store as VARCHAR column, then I would advise to do
soemthing like that
Adding MatchCode_Hash colum in every table as

MatchCode_Hash AS HashBytes('SHA1', MatchCode)



"bcap" <rayh(a)patriots.com> wrote in message
news:825b4c91-d1fb-4fe3-8184-7a459cb122f2(a)x9g2000vbo.googlegroups.com...
> Hi,
>
> I have a table that includes the following fields:
> - ID
> - MatchCode
> - FirstName
> - LastName
> - City
> - Zip
>
> Data has come from multiple sources so there may be duplicate
> matchcode's but the other fields may slightly vary.
>
> I would like to get a list of all distinct matchcodes and if there are
> multiple matchcodes that are the same, I'd like to take the top on
> only.
>
> Is this possible? Any advice would greatly be appreciated!
>
> Thank you for any help in advance!
>
>
>


From: Plamen Ratchev on
If you only need the unique MatchCodes then you can do this:

SELECT DISTINCT MatchCode
FROM MyTable;

To get only one row per unique MatchCode you can use this (not it picks one in no particular order, use the ORDER BY
clause in OVER to place a column that will provide the desired order):

SELECT ID, MatchCode, FirstName, LastName, City, Zip
FROM (
SELECT ID, MatchCode, FirstName, LastName, City, Zip,
ROW_NUMBER() OVER(PARTITION BY MatchCode ORDER BY (SELECT NULL)) AS rk
FROM MyTable) AS T
WHERE rk = 1;

--
Plamen Ratchev
http://www.SQLStudio.com
From: --CELKO-- on
"A problem well stated is a problem half solved." -- Charles F.
Kettering

Please post DDL, so that people do not have to guess what the keys,
constraints, Declarative Referential Integrity, data types, etc. in
your schema are. If you know how, follow ISO-11179 data element naming
conventions and formatting rules. Temporal data should use ISO-8601
formats. Code should be in Standard SQL as much as possible and not
local dialect.

Sample data is also a good idea, along with clear specifications. It
is very hard to debug code when you do not let us see it. If you want
to learn how to ask a question on a Newsgroup, look at:
http://www.catb.org/~esr/faqs/smart-questions.html

If you don't know anything about RDBMS, then get a copy of the
simplest intro book I know --
http://www.amazon.com/Manga-Guide-Databases-Mana-Takahashi/dp/1593271905

This might be a good idea; you confused fields and columns, have vague
names (I guess you meant to use city_name and zip_code), no
constriants to prevent the dirty data you are conplaining about. But
the use of id by itself impies a non-relational design that mimics a
sequential file's physical record number.

>> I would like to get a list of all distinct matchcodes and if there are multiple matchcodes that are the same, I'd like to take the top on only. <<

Since tables (unlike files) have no ordering by definition, what does
"top" mean in relational terms?

From: Tony Rogerson on
> Since tables (unlike files) have no ordering by definition, what does
> "top" mean in relational terms?
>

Since TOP requires ORDER BY to make any sense then your argument is correct
that use of TOP in "relational theory" terms breaks relational theory.

However, SQL basterdises relational theory all over the place.

I don't here you complaining about duplicate rows returned from intermediate
results, I don't hear you complaining about unnamed columns, I don't see you
writing code using Tutorial D.

Relational theory is distinct from SQL; SQL is just a sub-language that
tries to implement (and often badly) relational theory.

So - your point as ever is out of context and just plain wrong, you've tried
to apply an argument against something you hate - distractions from ISO SQL.

I think what you should say is that "what does 'top' mean in SQL terms?" and
the answer to that is - nothing - its fine; it allows us to implement real
solutions for real business problems like TOP 10 products which isn't
possible otherwise without the use of an external program; bit silly get the
TOP 10 rows of a 1 billion row table when you have to pull all 1 billion
rows into an external program.

In relational theory terms the ORDER BY breaks the table and it is a table
no more.

--ROGGIE--


"--CELKO--" <jcelko212(a)earthlink.net> wrote in message
news:eb0d3bc7-fa7c-4e94-84bc-abbac0ffaf21(a)z41g2000yqz.googlegroups.com...
> "A problem well stated is a problem half solved." -- Charles F.
> Kettering
>
> Please post DDL, so that people do not have to guess what the keys,
> constraints, Declarative Referential Integrity, data types, etc. in
> your schema are. If you know how, follow ISO-11179 data element naming
> conventions and formatting rules. Temporal data should use ISO-8601
> formats. Code should be in Standard SQL as much as possible and not
> local dialect.
>
> Sample data is also a good idea, along with clear specifications. It
> is very hard to debug code when you do not let us see it. If you want
> to learn how to ask a question on a Newsgroup, look at:
> http://www.catb.org/~esr/faqs/smart-questions.html
>
> If you don't know anything about RDBMS, then get a copy of the
> simplest intro book I know --
> http://www.amazon.com/Manga-Guide-Databases-Mana-Takahashi/dp/1593271905
>
> This might be a good idea; you confused fields and columns, have vague
> names (I guess you meant to use city_name and zip_code), no
> constriants to prevent the dirty data you are conplaining about. But
> the use of id by itself impies a non-relational design that mimics a
> sequential file's physical record number.
>
>>> I would like to get a list of all distinct matchcodes and if there are
>>> multiple matchcodes that are the same, I'd like to take the top on only.
>>> <<
>
> Since tables (unlike files) have no ordering by definition, what does
> "top" mean in relational terms?
>