From: Ralph on
Faraz Azhar wrote:
> Hello.
>
> I have a database with quite a lot of data in it. It is an MS Access
> database with an alphanumeric primary key.
>
> What I'm trying to accomplish: I have a lot of data that I want to add
> to that database, and most of it may already exist there. Obviously, if
> I add data that has the same key as data which already exists, I will
> get an error that duplicate values are being created.
>
> I need suggestions on the fastest way to add 2000 or 3000 records to an
> ADO database, adding each record only if it doesn't already exist.
>
> For this I have two techniques:
>
> 1) First, search the database for the unique key. If the key exists,
> don't add the new record; if it doesn't, add the record.
>
> 2) Or just keep adding records. ADO will raise an error if duplicate
> values are being created. If the error arises (i.e. the record already
> exists in the database), skip that item; if it doesn't, the record was
> unique.
>
> Which of the two would you recommend? Is there any other, faster way?


Is this a one-time thing? If so, you could have done it either way in the
time it took you to post your question.

If this is something that needs to be done on a regular basis - then test
both options. If I had to guess I would pick door #2 - BUT why guess?

-ralph


From: Nobody on
"Faraz Azhar" <itzfaraz(a)gmail.com> wrote in message
news:08476789-bbfb-4f5e-a8fb-353a526bbc3d(a)u15g2000prd.googlegroups.com...
> For this I have two techniques:
>
> 1) First, search the database for the unique key. If the key exists,
> don't add the new record; if it doesn't, add the record.
>
> 2) Or just keep adding records. ADO will raise an error if duplicate
> values are being created. If the error arises (i.e. the record already
> exists in the database), skip that item; if it doesn't, the record was
> unique.
>
> Which of the two would you recommend? Is there any other, faster way?

With option #2, the DB engine would do #1 for you, so use option #2 unless
someone else has a better idea.
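Sketched in Python with sqlite3 standing in for the Access/ADO pieces (the
table name and sample data are made up), option #2 amounts to letting the
engine enforce the primary key and trapping the duplicate-key error:

```python
import sqlite3

# Hypothetical stand-in for the Access table: an alphanumeric primary key.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE items (pk TEXT PRIMARY KEY, val TEXT)")
con.execute("INSERT INTO items VALUES ('A1', 'old')")

incoming = [("A1", "dup"), ("B2", "new"), ("C3", "new")]
added = 0
for pk, val in incoming:
    try:
        con.execute("INSERT INTO items VALUES (?, ?)", (pk, val))
        added += 1
    except sqlite3.IntegrityError:
        pass  # duplicate key: skip the record, exactly as option #2 describes
con.commit()
print(added)  # 2 of the 3 incoming rows were unique
```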


From: Ivar on
If it's speed you're looking for, my first thought would be to keep
communication with ADO to a minimum. Run a query to return a single-field
recordset of the unique keys that already exist in the database, and add
each record of that recordset to a VB collection. Then create a second VB
collection and try to add each new key to the first collection: if a
duplicate-key error occurs, the key already exists in the database, so skip
it; if the add succeeds, the key is new, so add it to the second collection.
The second collection will then contain only the keys of the records that
still need to be inserted. I'm sure this would be a lot faster than using
ADO error checking.
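In language-neutral terms (a Python sketch; the key values are made up), the
two-collection filter looks like this, with dict membership playing the role
of the VB collection's duplicate-key error:

```python
# Existing keys would come from one single-column query over the database;
# hard-coded here for illustration.
existing = ["A1", "B2"]               # result of the key-only SELECT
first = {k: None for k in existing}   # plays the role of the first collection

incoming = ["A1", "C3", "D4", "C3"]
to_insert = []                        # plays the role of the second collection
for k in incoming:
    if k in first:                    # VB raises a duplicate-key error instead
        continue
    first[k] = None                   # also catches duplicates within the batch
    to_insert.append(k)
print(to_insert)  # ['C3', 'D4'] - only keys not already in the database
```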

Ivar


From: Schmidt on

"Ivar" <ivar.ekstromer000(a)ntlworld.com> schrieb im Newsbeitrag
news:poy9n.213182$Jg5.42490(a)newsfe02.ams2...
> If it's speed you're looking for, my first thought would be to keep
> communication with ADO to a minimum. Run a query to return a
> single-field recordset of the unique keys that already exist in the
> database, and add each record of that recordset to a VB collection.
> Then create a second VB collection and try to add each new key to the
> first collection: if a duplicate-key error occurs, the key already
> exists in the database, so skip it; if the add succeeds, the key is
> new, so add it to the second collection. The second collection will
> then contain only the keys of the records that still need to be
> inserted. I'm sure this would be a lot faster than using ADO error
> checking.

Yep, that's perhaps the most efficient approach -
transferring only the data of the single-column key select
into a hash table for fast lookups of already existing keys
can barely be beaten.
The second collection is not really needed, IMO:
one can loop through the new (about-to-be-inserted) records,
check each incoming key value against the hash table
(Collection/Dictionary) for existence, and act appropriately
(insert or skip).

And additionally... "grouped inserts" (wrapped within a
transaction) work faster than "single-record-insert/update"
sequences. In ADO this can be achieved with something like
GroupRs.UpdateBatch - or by using ADO Command objects
within a transaction.
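The thread is about VB6/ADO against Access, but the shape of the idea
translates directly. A minimal Python/sqlite3 sketch (table and data are
hypothetical): one key-only select into a hash table, then one grouped
insert inside a single transaction:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE items (pk TEXT PRIMARY KEY, val TEXT)")
con.executemany("INSERT INTO items VALUES (?, ?)",
                [("A1", "old"), ("B2", "old")])
con.commit()

# One single-column key select, loaded into a hash table for O(1) lookups.
existing = {row[0] for row in con.execute("SELECT pk FROM items")}

incoming = [("A1", "dup"), ("C3", "new"), ("D4", "new")]
fresh = [(pk, val) for pk, val in incoming if pk not in existing]

# Grouped insert wrapped in one transaction, not per-row round trips.
with con:  # commits on success, rolls back on error
    con.executemany("INSERT INTO items VALUES (?, ?)", fresh)

print(len(fresh))  # 2
```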

Olaf


From: Ralph on
Schmidt wrote:
> And the second Collection is not really needed IMO,
> One can loop through the new (about to be inserted Records)
> and just check all the incoming KeyValues against the
> HashTable (Collection/Dictionary) for existence or not -
> and act appropriately (with Inserts or not).
>

Note that Olaf mentioned "Collection" or "Dictionary". You will find the
Scripting Library's Dictionary object to be the better performer of the two.
And while you can rig up your own existence check for a VB Collection, the
Dictionary already provides an .Exists method. (It also cleans up faster.)

Reference: "Microsoft Scripting Runtime"
In code:
Dim oDict As Scripting.Dictionary
Set oDict = New Scripting.Dictionary
' sKey stands for a candidate primary-key value:
If Not oDict.Exists(sKey) Then oDict.Add sKey, Empty

The only downside to a Dictionary is that in some environments
administrators may disable the Scripting Runtime.

-ralph