From: DB on
Every few days, our SQL 2008 server comes to a crawl and we have to reboot
it. The CPU becomes pegged at 100%. I ran a query to roll-up my wait type
counts. The top 3 wait types are SOS_SCHEDULER_YIELD, CXPACKET, and
ASYNC_NETWORK_IO. Their wait time in seconds are 828426, 183026, 156414
respectively. Does anybody have any idea as to what is going on? This is
killing me!

Thanks,

--
DB
From: Erland Sommarskog on
DB (DB(a)discussions.microsoft.com) writes:
> Every few days, our SQL 2008 server comes to a crawl and we have to
> reboot it. The CPU becomes pegged at 100%. I ran a query to roll-up my
> wait type counts. The top 3 wait types are SOS_SCHEDULER_YIELD,
> CXPACKET, and ASYNC_NETWORK_IO. Their wait time in seconds are 828426,
> 183026, 156414 respectively. Does anybody have any idea as to what is
> going on? This is killing me!

Of course we have no idea what is going since you have not performed
very much diagnosis. But presumably, there is a query from hell driving
your system over the edge.

You mention three wait types. SOS_SCHEDULER_YIELD is when a task yields
to let another task execute. This may indicate that processes are starved
for CPU, but I believe it is common for this type to be at the top.

CXPACKET pertains to parallelism; a CXPACKET wait is when a thread
waits for other threads to complete. It is not unusual to see high
CXPACKET waits, but it may indicate that parallelism is not working
well for you. Maybe you have parallel queries where one thread gets
all the work.

ASYNC_NETWORK_IO has no relation to CPU load at all. Rather high
ASYNC_NETWORK_IO waits may reduce the CPU load, because this wait
state occurs when a client does not pick up data as fast as SQL Server
feeds it.

What you need to do is to run more detailed diagnostics when the server
gets the fever. The most important is to find out which queries that
are running on the server. On
http://www.sommarskog.se/sqlutil/beta_lockinfo.html you find a stored
procedure that gives the currently running statements, their query
plans, and locks held. (The latter is likely to be of less interest
right now.) This can help you to find the culprit. Beware that the
query plans are only the estimated plans, so the plans may look good,
when they in fact are a disaster.

For further discussion, what type of server is this: OLTP? OLAP? A
consolidated server with lots of applications? One a one-app server?
Test? Development? Production?

--
Erland Sommarskog, SQL Server MVP, esquel(a)sommarskog.se

Links for SQL Server Books Online:
SQL 2008: http://msdn.microsoft.com/en-us/sqlserver/cc514207.aspx
SQL 2005: http://msdn.microsoft.com/en-us/sqlserver/bb895970.aspx
SQL 2000: http://www.microsoft.com/sql/prodinfo/previousversions/books.mspx

From: Jeffrey Williams on
Are you by chance running SQL Server x64? If so, have you set the max
memory setting for SQL Server?

"Erland Sommarskog" <esquel(a)sommarskog.se> wrote in message
news:Xns9D3CEDE6B1610Yazorman(a)127.0.0.1...
> DB (DB(a)discussions.microsoft.com) writes:
>> Every few days, our SQL 2008 server comes to a crawl and we have to
>> reboot it. The CPU becomes pegged at 100%. I ran a query to roll-up my
>> wait type counts. The top 3 wait types are SOS_SCHEDULER_YIELD,
>> CXPACKET, and ASYNC_NETWORK_IO. Their wait time in seconds are 828426,
>> 183026, 156414 respectively. Does anybody have any idea as to what is
>> going on? This is killing me!
>
> Of course we have no idea what is going since you have not performed
> very much diagnosis. But presumably, there is a query from hell driving
> your system over the edge.
>
> You mention three wait types. SOS_SCHEDULER_YIELD is when a task yields
> to let another task execute. This may indicate that processes are starved
> for CPU, but I believe it is common for this type to be at the top.
>
> CXPACKET pertains to parallelism; a CXPACKET wait is when a thread
> waits for other threads to complete. It is not unusual to see high
> CXPACKET waits, but it may indicate that parallelism is not working
> well for you. Maybe you have parallel queries where one thread gets
> all the work.
>
> ASYNC_NETWORK_IO has no relation to CPU load at all. Rather high
> ASYNC_NETWORK_IO waits may reduce the CPU load, because this wait
> state occurs when a client does not pick up data as fast as SQL Server
> feeds it.
>
> What you need to do is to run more detailed diagnostics when the server
> gets the fever. The most important is to find out which queries that
> are running on the server. On
> http://www.sommarskog.se/sqlutil/beta_lockinfo.html you find a stored
> procedure that gives the currently running statements, their query
> plans, and locks held. (The latter is likely to be of less interest
> right now.) This can help you to find the culprit. Beware that the
> query plans are only the estimated plans, so the plans may look good,
> when they in fact are a disaster.
>
> For further discussion, what type of server is this: OLTP? OLAP? A
> consolidated server with lots of applications? One a one-app server?
> Test? Development? Production?
>
> --
> Erland Sommarskog, SQL Server MVP, esquel(a)sommarskog.se
>
> Links for SQL Server Books Online:
> SQL 2008: http://msdn.microsoft.com/en-us/sqlserver/cc514207.aspx
> SQL 2005: http://msdn.microsoft.com/en-us/sqlserver/bb895970.aspx
> SQL 2000:
> http://www.microsoft.com/sql/prodinfo/previousversions/books.mspx
>
From: Uri Dimant on
db
---This first thing to check if CPU is at 100% is to look for parallel
queries:

-- Tasks running in parallel (filtering out MARS requests below):

select * from sys.dm_os_tasks as t

where t.session_id in (

select t1.session_id

from sys.dm_os_tasks as t1

group by t1.session_id

having count(*) > 1

and min(t1.request_id) = max(t1.request_id));

-- Requests running in parallel:

select *

from sys.dm_exec_requests as r

join (

select t1.session_id, min(t1.request_id)

from sys.dm_os_tasks as t1

group by t1.session_id

having count(*) > 1

and min(t1.request_id) = max(t1.request_id)

) as t(session_id, request_id)

on r.session_id = t.session_id

and r.request_id = t.request_id;





"DB" <DB(a)discussions.microsoft.com> wrote in message
news:885879B7-BB05-42D4-8B69-E3BE76E7CFD7(a)microsoft.com...
> Every few days, our SQL 2008 server comes to a crawl and we have to reboot
> it. The CPU becomes pegged at 100%. I ran a query to roll-up my wait
> type
> counts. The top 3 wait types are SOS_SCHEDULER_YIELD, CXPACKET, and
> ASYNC_NETWORK_IO. Their wait time in seconds are 828426, 183026, 156414
> respectively. Does anybody have any idea as to what is going on? This is
> killing me!
>
> Thanks,
>
> --
> DB