From: Kimbo Slice on
Hi Jay,

I ran the query. It returned the following results:

name                        minimum   maximum   config_value   run_value
max degree of parallelism   0         64        0              0

"Jay" <spam(a)nospam.org> wrote in message
news:u920RIOsKHA.6140(a)TK2MSFTNGP05.phx.gbl...
>I don't know that inserts can be run in parallel to a non-partitioned
>table, nor am I sure that you would want to. I'll leave that to the MVPs.
>
> Can you run (or have a DBA run) the following and post the result?
>
> -- SQL Server 2008, but I think 2005 as well.
>
> exec sp_configure 'show advanced options', 1;
>
> reconfigure;
>
> exec sp_configure 'max degree of parallelism';
>
> exec sp_configure 'show advanced options', 0;
>
> reconfigure;
>
>
>
>
>
> "Kimbo Slice" <Jules.Winfield(a)newsgroup.nospam> wrote in message
> news:Wb2dncR3FoAqMeDWnZ2dnUVZ_jidnZ2d(a)giganews.com...
>>
>> BULK INSERT seems to be a common suggestion. I don't know too much about
>> bulk inserts, but a quick glance at the docs indicates that it's used to
>> load data from a pre-existing text or CSV file. I'm not convinced that's
>> the appropriate approach, because then I'd have the additional overhead of
>> writing everything out to a file, copying the file to the SQL Server
>> machine, and telling SQL Server to read it back in. I'm thinking it would
>> be faster to simply upload the data from client to server.
>>
>> The data doesn't originate from a file. It's created by a C# application,
>> sits around in the memory space of the C# app for a few hours, and then
>> later is optionally uploaded to the database.
>>
>> What triggered my question was the fact that when uploading 8,000 records
>> from the C# app to the database, one of the database's eight CPUs gets
>> pegged at 100%. It occurred to me that splitting the upload into eight
>> batches of 1,000 records and executing the INSERT commands on eight
>> different threads might allow SQL Server to make more effective use of
>> all eight of its CPUs... or maybe not.
>>
>> Just a thought.
>>
>> Thanks again for the response.
>>
>> David
>>
>> "Kimbo Slice" <Jules.Winfield(a)newsgroup.nospam> wrote in message
>> news:7a-dnVmcZ6IkEuHWnZ2dnUVZ_gydnZ2d(a)giganews.com...
>>> Consider:
>>>
>>> - My SQL Server has 8 processors (cores actually).
>>> - I want to upload 8,000 records to a single table in my database.
>>>
>>> Which of these two would get the job done faster?
>>>
>>> (a) spawn 8 different threads in my client application and, using
>>> eight different SQL connections, upload 1,000 entries on each thread, or
>>>
>>> (b) upload all 8,000 entries from the same thread.
>>>
>>> Option (a) would seem to make more effective use of all eight processors
>>> on the SQL Server. But in the final analysis, won't the SQL Server
>>> bottleneck be the transaction log where it has to sequentially serialize
>>> all of this stuff to disk? Seems like (a) wouldn't provide much of an
>>> advantage, would it? In fact, wouldn't it actually be a DISadvantage
>>> because of the additional connections required?
>>>
>>> Thanks...
>>>
>>
>>
>
>


From: Jay on
> max degree of parallelism 0 64 0 0

Those are the default values, so the query optimizer (QO) is free to use all
of the processors - dry hole.

I keep thinking about what's involved in a large number of inserts to a
non-partitioned table and wondering whether multi-threading would even do any
good; I have my doubts.

Barring the MVPs coming up with something, and given your restrictions, all I
can think of is to take a table lock (see the TABLOCK hint on INSERT); the
other OPENROWSET options don't seem to apply. None of that deals with the
observed processor affinity - which may not be wrong.
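
For illustration, a minimal sketch of that hint - the table and column names
below are placeholders, not anything from your schema:

-- Placeholder tables; substitute your real staging source and target.
CREATE TABLE dbo.StagedRows  (Id int NOT NULL, Payload varchar(100) NULL);
CREATE TABLE dbo.TargetTable (Id int NOT NULL, Payload varchar(100) NULL);

-- WITH (TABLOCK) takes a table-level lock for the duration of the statement,
-- so the insert isn't negotiating row or page locks. On 2008, with a heap
-- target and the simple or bulk-logged recovery model, it can also let the
-- INSERT ... SELECT be minimally logged.
INSERT INTO dbo.TargetTable WITH (TABLOCK) (Id, Payload)
SELECT Id, Payload
FROM dbo.StagedRows;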

However, since your description involves the client machine holding onto the
data for a while, I would seriously consider writing the data to a file and
having a backend SSIS package do the load. That, or write the data more
frequently.
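
SSIS would own that load end to end; purely to illustrate the server-side half
of the idea in plain T-SQL, a bare BULK INSERT of a file the server can read
looks roughly like this (the path, table name, and delimiters are invented):

-- Load a client-produced file once it is visible to the server.
BULK INSERT dbo.TargetTable
FROM 'D:\loads\pending_rows.csv'    -- hypothetical path readable by the server
WITH (
    FIELDTERMINATOR = ',',
    ROWTERMINATOR   = '\n',
    TABLOCK                         -- same table-lock/bulk-load behavior as above
);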

Another thought would be to create a work table and insert your rows as they
come in. Then run a batch process (SSIS, or T-SQL) on the server that does a
merge into the final table.
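
Roughly sketched, with invented names (dbo.WorkTable holds the incoming rows,
dbo.FinalTable is the real target; MERGE needs SQL Server 2008):

-- Placeholder tables.
CREATE TABLE dbo.WorkTable  (Id int NOT NULL PRIMARY KEY, Payload varchar(100) NULL);
CREATE TABLE dbo.FinalTable (Id int NOT NULL PRIMARY KEY, Payload varchar(100) NULL);

-- The batch step that runs on the server (SSIS task or scheduled T-SQL job).
MERGE dbo.FinalTable AS tgt
USING dbo.WorkTable  AS src
    ON tgt.Id = src.Id
WHEN MATCHED THEN
    UPDATE SET tgt.Payload = src.Payload
WHEN NOT MATCHED BY TARGET THEN
    INSERT (Id, Payload) VALUES (src.Id, src.Payload);

-- Clear the work table for the next batch.
TRUNCATE TABLE dbo.WorkTable;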

Just remember: bulk data loads were never intended to be done from a client;
they were intended to be done from the server.

"Kimbo Slice" <Jules.Winfield(a)newsgroup.nospam> wrote in message
news:f8qdnf2oaoJZbeDWnZ2dnUVZ_vudnZ2d(a)giganews.com...
> Hi Jay,
>
> I ran the query. It returned the following results:
>
> max degree of parallelism 0 64 0 0
>
> "Jay" <spam(a)nospam.org> wrote in message
> news:u920RIOsKHA.6140(a)TK2MSFTNGP05.phx.gbl...
>>I don't know that inserts can be run in parallel to a non-partitioned
>>table, nor am I sure that you would want to. I'll leave that to the MVP's.
>>
>> Can you run (or have a DBA run) the following and post the result?
>>
>> -- SQL Server 2008, but I think 2005 as well.
>>
>> exec sp_configure 'show advanced options', 1;
>>
>> reconfigure;
>>
>> exec sp_configure 'max degree of parallelism';
>>
>> exec sp_configure 'show advanced options', 0;
>>
>> reconfigure;
>>
>>
>>
>>
>>
>> "Kimbo Slice" <Jules.Winfield(a)newsgroup.nospam> wrote in message
>> news:Wb2dncR3FoAqMeDWnZ2dnUVZ_jidnZ2d(a)giganews.com...
>>>
>>> BULK INSERT seems to be a common suggestion. I don't know too much about
>>> bulk inserts but a quick glance at the docs seems to indicate that this
>>> is used to upload data from a pre-existing text or CSV file. I'm not
>>> convinced that this is the appropriate approach because then I'd have
>>> the additional overhead of trying to write everything out to a file,
>>> copying the file to SQL Server, telling SQL Server to read it back in,
>>> and then upload the data. I'm thinking it would be faster to simply
>>> upload the data from client to server.
>>>
>>> The data doesn't originate from a file. It's created by a C#
>>> application, sits around in the memory space of the C# app for a few
>>> hours, and then later is optionally uploaded to the database.
>>>
>>> What triggered my question was the fact that when uploading 8,000
>>> records from the C# app to the database, one of the database's eight
>>> CPUs gets pegged at 100%. It occured to me that by splitting the upload
>>> up into eight batches of 1,000 records, and executing the INSERT
>>> commands on eight different threads, it might allow SQL Server to make
>>> more effectively use all eight of it's CPUs... or maybe not.
>>>
>>> Just a thought.
>>>
>>> Thanks again for the response.
>>>
>>> David
>>>
>>> "Kimbo Slice" <Jules.Winfield(a)newsgroup.nospam> wrote in message
>>> news:7a-dnVmcZ6IkEuHWnZ2dnUVZ_gydnZ2d(a)giganews.com...
>>>> Consider:
>>>>
>>>> - My SQL Server has 8 processors (cores actually).
>>>> - I want to upload 8,000 records to a single table in my database.
>>>>
>>>> Which of these two would get the job done faster?
>>>>
>>>> (a) spawn 8 different threads in my client application and, using
>>>> eight different sql connections, upload 1,000 entries on each thread,
>>>> or
>>>>
>>>> (b) upload all 8,000 entries from the same thread.
>>>>
>>>> Option (a) would seem to make more effective use of all eight
>>>> processors on the SQL Server. But in the final analysis, won't the SQL
>>>> Server bottleneck be the transaction log where it has to sequentially
>>>> serialize all of this stuff to disk? Seems like (a) wouldn't provide
>>>> much of an advantage, would it? In fact, wouldn't it actually be a
>>>> DISadvantage because of the additional connections required?
>>>>
>>>> Thanks...
>>>>
>>>
>>>
>>
>>
>
>


From: Kimbo Slice on
Hi Jay,

Thanks for the suggestions. Excellent food for thought.

David

"Jay" <spam(a)nospam.org> wrote in message
news:OImRysRsKHA.1796(a)TK2MSFTNGP02.phx.gbl...
>> max degree of parallelism 0 64 0 0
>
> Those are the default values and the QO would try to use all of the
> processors - dry hole.
>
> I keep thinking about what's involved in a large number of inserts to a
> non-partitioned table and wondering if multi-threading would even do any
> good, I have my doubts.
>
> Baring the MVP's coming up with something and your restrictions all I can
> think of is to issue a table lock (see TABLOCK hint to INSERT), the other
> OPENROWSET options don't seem to apply. None of that deals with the
> observed processor affinity - which may not be wrong.
>
> However, since your description involves the client machine holding onto
> the data for a while, I would seriously consider writing the data to a
> file and having a backend SSIS package do the load. That, or write the
> data more frequently.
>
> Another thought would be to create a work table and insert your rows as
> they come in. Then run a batch process (SSIS, or T-SQL) on the server that
> does a merge into the final table.
>
> Just remember, bulk data loads were never intended to be done from a
> client, they were intended to be done from the server.
>
> "Kimbo Slice" <Jules.Winfield(a)newsgroup.nospam> wrote in message
> news:f8qdnf2oaoJZbeDWnZ2dnUVZ_vudnZ2d(a)giganews.com...
>> Hi Jay,
>>
>> I ran the query. It returned the following results:
>>
>> max degree of parallelism 0 64 0 0
>>
>> "Jay" <spam(a)nospam.org> wrote in message
>> news:u920RIOsKHA.6140(a)TK2MSFTNGP05.phx.gbl...
>>>I don't know that inserts can be run in parallel to a non-partitioned
>>>table, nor am I sure that you would want to. I'll leave that to the
>>>MVP's.
>>>
>>> Can you run (or have a DBA run) the following and post the result?
>>>
>>> -- SQL Server 2008, but I think 2005 as well.
>>>
>>> exec sp_configure 'show advanced options', 1;
>>>
>>> reconfigure;
>>>
>>> exec sp_configure 'max degree of parallelism';
>>>
>>> exec sp_configure 'show advanced options', 0;
>>>
>>> reconfigure;
>>>
>>>
>>>
>>>
>>>
>>> "Kimbo Slice" <Jules.Winfield(a)newsgroup.nospam> wrote in message
>>> news:Wb2dncR3FoAqMeDWnZ2dnUVZ_jidnZ2d(a)giganews.com...
>>>>
>>>> BULK INSERT seems to be a common suggestion. I don't know too much
>>>> about bulk inserts but a quick glance at the docs seems to indicate
>>>> that this is used to upload data from a pre-existing text or CSV file.
>>>> I'm not convinced that this is the appropriate approach because then
>>>> I'd have the additional overhead of trying to write everything out to a
>>>> file, copying the file to SQL Server, telling SQL Server to read it
>>>> back in, and then upload the data. I'm thinking it would be faster to
>>>> simply upload the data from client to server.
>>>>
>>>> The data doesn't originate from a file. It's created by a C#
>>>> application, sits around in the memory space of the C# app for a few
>>>> hours, and then later is optionally uploaded to the database.
>>>>
>>>> What triggered my question was the fact that when uploading 8,000
>>>> records from the C# app to the database, one of the database's eight
>>>> CPUs gets pegged at 100%. It occured to me that by splitting the upload
>>>> up into eight batches of 1,000 records, and executing the INSERT
>>>> commands on eight different threads, it might allow SQL Server to make
>>>> more effectively use all eight of it's CPUs... or maybe not.
>>>>
>>>> Just a thought.
>>>>
>>>> Thanks again for the response.
>>>>
>>>> David
>>>>
>>>> "Kimbo Slice" <Jules.Winfield(a)newsgroup.nospam> wrote in message
>>>> news:7a-dnVmcZ6IkEuHWnZ2dnUVZ_gydnZ2d(a)giganews.com...
>>>>> Consider:
>>>>>
>>>>> - My SQL Server has 8 processors (cores actually).
>>>>> - I want to upload 8,000 records to a single table in my database.
>>>>>
>>>>> Which of these two would get the job done faster?
>>>>>
>>>>> (a) spawn 8 different threads in my client application and, using
>>>>> eight different sql connections, upload 1,000 entries on each thread,
>>>>> or
>>>>>
>>>>> (b) upload all 8,000 entries from the same thread.
>>>>>
>>>>> Option (a) would seem to make more effective use of all eight
>>>>> processors on the SQL Server. But in the final analysis, won't the SQL
>>>>> Server bottleneck be the transaction log where it has to sequentially
>>>>> serialize all of this stuff to disk? Seems like (a) wouldn't provide
>>>>> much of an advantage, would it? In fact, wouldn't it actually be a
>>>>> DISadvantage because of the additional connections required?
>>>>>
>>>>> Thanks...
>>>>>
>>>>
>>>>
>>>
>>>
>>
>>
>
>


From: Andy O'Neill on

"Kimbo Slice" <Jules.Winfield(a)newsgroup.nospam> wrote in message
news:A7qdnYv3c4AtCuPWnZ2dnUVZ_qGdnZ2d(a)giganews.com...
> Hi Jay,
>
> Thanks for the suggestions. Excellent food for thought.
>
> David
>
> "Jay" <spam(a)nospam.org> wrote in message
> news:OImRysRsKHA.1796(a)TK2MSFTNGP02.phx.gbl...
>>> max degree of parallelism 0 64 0 0
>>
>> Those are the default values and the QO would try to use all of the
>> processors - dry hole.
>>
>> I keep thinking about what's involved in a large number of inserts to a
>> non-partitioned table and wondering if multi-threading would even do any
>> good, I have my doubts.
>>
>> Baring the MVP's coming up with something and your restrictions all I can
>> think of is to issue a table lock (see TABLOCK hint to INSERT), the other
>> OPENROWSET options don't seem to apply. None of that deals with the
>> observed processor affinity - which may not be wrong.
>>
>> However, since your description involves the client machine holding onto
>> the data for a while, I would seriously consider writing the data to a
>> file and having a backend SSIS package do the load. That, or write the
>> data more frequently.
>>
>> Another thought would be to create a work table and insert your rows as
>> they come in. Then run a batch process (SSIS, or T-SQL) on the server
>> that does a merge into the final table.
>>
>> Just remember, bulk data loads were never intended to be done from a
>> client, they were intended to be done from the server.

Or - to add to Jay's list - write to a local SQL Server Compact database.
Or write changes to a message queue and use a service to lazily write them to
the holding table on the server.

It depends on WHY there are 8,000 changed records sitting around in memory on
the client machine. That's a busy individual if he's sitting there typing them
in.

I wonder if we're still missing a relevant piece of the puzzle.
What creates those 8,000 records?

From: Jay on
Now these are excellent suggestions and questions!


