From: Henry on
I've just been learning SSIS to import Excel 2003 spreadsheets into SQL
Server 2005, and so far, so good.
I want to take my use of SSIS to the next level, so to speak, and could
use some help.

Here's the scenario:

I've got spreadsheet data representing names that must be consolidated.
As can happen with so many
businesses, at my workplace, there's a big effort underway to correct
and consolidate names that may
have been entered many different ways over the years, in several
different systems. Now that the systems
are being unified, we must have corrected, unique names. In some cases
they will serve as source tables
for dropdown lists, and in other cases they will serve to populate data
grids when looking at various kinds
of records.

The spreadsheet data looks something like this:


Last First MI Source Correct
Correct Correct middle
name name last name first
name initial

Albert Dave J. A
Albert David J. B Albert
David J.

Andrews Diana A
Andrews Diane B Andrews
Diane

Davies Richard A Davies
Richard
Davis Richard B

Chambers Emily F A
Chambers Emilie B Chambers
Emilie F.

Rivers Geoffrey X. A Rivers
Geoffrey E.
Rivers Jeffrey E. B


Now, I can import this data into SQL Server but if I simply use the
sheets as they are now, I'll have
lots of rows with lots of nulls. In SQL Server I could then delete the
garbage but I'd rather use SSIS
to implement a more polished approach. Here's my question (I finally
got around to it...)
How can I include a query to include only the "Correct last name",
"Correct first name" and
"Correct middle initial" from Excel rows in which "Correct last name"
and "Correct first name"
are populated.and are populated?

Yep, I know I'm being a little lazy here, asking for help in this way,
but I'm hoping that sending this
message at 9 PM on a Saturday - and certainly not during normal office
hours - will count for
something. If anyone could point me in the right direction, I'd be very
grateful.
My thanks in advance.
From: John Bell on
On Sat, 07 Aug 2010 21:08:58 -0400, Henry <greenclay(a)optonline.net>
wrote:

>I've just been learning SSIS to import Excel 2003 spreadsheets into SQL
>Server 2005, and so far, so good.
>I want to take my use of SSIS to the next level, so to speak, and could
>use some help.
>
>Here's the scenario:
>
>I've got spreadsheet data representing names that must be consolidated.
>As can happen with so many
>businesses, at my workplace, there's a big effort underway to correct
>and consolidate names that may
>have been entered many different ways over the years, in several
>different systems. Now that the systems
>are being unified, we must have corrected, unique names. In some cases
>they will serve as source tables
>for dropdown lists, and in other cases they will serve to populate data
>grids when looking at various kinds
>of records.
>
>The spreadsheet data looks something like this:
>
>
>Last First MI Source Correct
>Correct Correct middle
>name name last name first
>name initial
>
>Albert Dave J. A
>Albert David J. B Albert
>David J.
>
>Andrews Diana A
>Andrews Diane B Andrews
>Diane
>
>Davies Richard A Davies
>Richard
>Davis Richard B
>
>Chambers Emily F A
>Chambers Emilie B Chambers
>Emilie F.
>
>Rivers Geoffrey X. A Rivers
>Geoffrey E.
>Rivers Jeffrey E. B
>
>
>Now, I can import this data into SQL Server but if I simply use the
>sheets as they are now, I'll have
>lots of rows with lots of nulls. In SQL Server I could then delete the
>garbage but I'd rather use SSIS
>to implement a more polished approach. Here's my question (I finally
>got around to it...)
>How can I include a query to include only the "Correct last name",
>"Correct first name" and
>"Correct middle initial" from Excel rows in which "Correct last name"
>and "Correct first name"
>are populated.and are populated?
>
>Yep, I know I'm being a little lazy here, asking for help in this way,
>but I'm hoping that sending this
>message at 9 PM on a Saturday - and certainly not during normal office
>hours - will count for
>something. If anyone could point me in the right direction, I'd be very
>grateful.
>My thanks in advance.

You probably want a lookup transformation

Look at lesson 1 in
http://msdn.microsoft.com/en-us/library/ms169917%28SQL.90%29.aspx

If you need to do a fuzzy lookup try
http://msdn.microsoft.com/en-us/library/ms345128(SQL.90).aspx

John

From: Henry on
Thanks, John.
I'll have to look at this material, and maybe see if I can apply it at
work tomorrow, or at least in the very near future.
I'll get back to you, either way.

Thanks again.


John Bell wrote:
> On Sat, 07 Aug 2010 21:08:58 -0400, Henry <greenclay(a)optonline.net>
> wrote:
>
>
>> I've just been learning SSIS to import Excel 2003 spreadsheets into SQL
>> Server 2005, and so far, so good.
>> I want to take my use of SSIS to the next level, so to speak, and could
>> use some help.
>>
>> Here's the scenario:
>>
>> I've got spreadsheet data representing names that must be consolidated.
>> As can happen with so many
>> businesses, at my workplace, there's a big effort underway to correct
>> and consolidate names that may
>> have been entered many different ways over the years, in several
>> different systems. Now that the systems
>> are being unified, we must have corrected, unique names. In some cases
>> they will serve as source tables
>> for dropdown lists, and in other cases they will serve to populate data
>> grids when looking at various kinds
>> of records.
>>
>> The spreadsheet data looks something like this:
>>
>>
>> Last First MI Source Correct
>> Correct Correct middle
>> name name last name first
>> name initial
>>
>> Albert Dave J. A
>> Albert David J. B Albert
>> David J.
>>
>> Andrews Diana A
>> Andrews Diane B Andrews
>> Diane
>>
>> Davies Richard A Davies
>> Richard
>> Davis Richard B
>>
>> Chambers Emily F A
>> Chambers Emilie B Chambers
>> Emilie F.
>>
>> Rivers Geoffrey X. A Rivers
>> Geoffrey E.
>> Rivers Jeffrey E. B
>>
>>
>> Now, I can import this data into SQL Server but if I simply use the
>> sheets as they are now, I'll have
>> lots of rows with lots of nulls. In SQL Server I could then delete the
>> garbage but I'd rather use SSIS
>> to implement a more polished approach. Here's my question (I finally
>> got around to it...)
>> How can I include a query to include only the "Correct last name",
>> "Correct first name" and
>> "Correct middle initial" from Excel rows in which "Correct last name"
>> and "Correct first name"
>> are populated.and are populated?
>>
>> Yep, I know I'm being a little lazy here, asking for help in this way,
>> but I'm hoping that sending this
>> message at 9 PM on a Saturday - and certainly not during normal office
>> hours - will count for
>> something. If anyone could point me in the right direction, I'd be very
>> grateful.
>> My thanks in advance.
>>
>
> You probably want a lookup transformation
>
> Look at lesson 1 in
> http://msdn.microsoft.com/en-us/library/ms169917%28SQL.90%29.aspx
>
> If you need to do a fuzzy lookup try
> http://msdn.microsoft.com/en-us/library/ms345128(SQL.90).aspx
>
> John
>
>