From: mgdev on
I need some help troubleshooting IIS randomly hanging on my server.

Late last year we upgraded our ASP.NET (2.0) application with a lot of new
features. Now that we've rolled this new version out to many of our
customers, we're getting complaints about IIS hanging and needing to be
restarted. Unfortunately, the hangs are random. After IIS is restarted,
the application works as expected, and we cannot reproduce the hang on
demand. But anywhere from an hour to a few days later, w3wp.exe hangs
again and the system cannot be used. I have observed this first hand, but
as I said, after restarting IIS I can't reproduce the problem.

We have seen this problem in both IIS6 (Server 2003) and IIS7 (Server 2008).

If you have any suggestions on how to troubleshoot or resolve this issue,
I'd like to hear them. This is urgent, so I'd also like to explore any
options that would reduce the problem in the short term.
From: Chris M on
On 09/04/2010 15:54, mgdev wrote:
> I need some help on how to troubleshoot IIS Randomly Hanging on my server.

As a starting point, install DebugDiag and use the Hang rule to generate
a dump file when the site stops responding. Use the Crash/Hang analysis
to see if that gives you any useful information about what's happening
when the worker process hangs.
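
For reference, the condition a hang rule watches for (a request that gets
no response within a timeout) can be sketched in a few lines of Python.
This only illustrates the idea. It is not how DebugDiag works internally
and it does not replace it: the `probe` function, its parameters and the
30-second default are all made up for the example.

```python
import socket
import time
import urllib.error
import urllib.request


def probe(url, timeout_s=30.0, opener=urllib.request.urlopen):
    """Request url once and return (status, elapsed_seconds).

    status is "ok" (got a response), "error" (the server answered with an
    HTTP error, or the connection failed outright) or "hang" (no answer
    within timeout_s). `opener` is injectable so the function can be
    exercised without a real server; that parameter is an assumption of
    this sketch.
    """
    start = time.monotonic()
    try:
        with opener(url, timeout=timeout_s) as resp:
            resp.read()
        status = "ok"
    except urllib.error.HTTPError:
        # A 4xx/5xx reply still means the worker process answered.
        status = "error"
    except (socket.timeout, TimeoutError):
        # The read timed out: the request was accepted but never answered.
        status = "hang"
    except urllib.error.URLError as exc:
        # urlopen wraps connect-phase timeouts in URLError.
        timed_out = isinstance(exc.reason, (socket.timeout, TimeoutError))
        status = "hang" if timed_out else "error"
    return status, time.monotonic() - start
```

Calling `probe` against one of the application's pages on a schedule and
logging every "hang" result with a timestamp would at least show when the
hangs occur and how they line up with the dump files.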


--
Chris M.
From: mgdev on
Unfortunately, this is a difficult problem to reproduce. But after leaving
it running for a while, it looks like the "Hang Rule" did catch a couple of
instances.

My problem is that I don't know how to properly interpret the results. I
guess the analysis page is too large to post here, so I have put the info
on the thread that consumed the most CPU time below. Please review it and
let me know if there is something specific I should be looking for.

Thread 24 - System ID 5008
Entry point mscorwks!Thread::intermediateThreadProc
Create time 4/9/2010 9:29:40 AM
Time spent in user mode 0 Days 00:05:47.031
Time spent in kernel mode 0 Days 00:00:00.359

This thread is not fully resolved and may or may not be a problem. Further
analysis of these threads may be required.

Function Source
mscorwks!StoreEventToEventStore+21
mscorwks!EECodeManager::GetGSCookieAddr+2c
mscorwks!Thread::StackWalkFramesEx+34e
mscorwks!Thread::ReadyForAsyncException+2d3
mscorwks!Thread::HandleThreadAbort+80
mscorwks!COMNlsInfo::IndexOfString+1fa
mscorlib_ni+20ebcd
mscorlib_ni+1f03fb
Microsoft_VisualBasic_ni+ea29c
Microsoft_VisualBasic_ni+ea475
mscorwks!JIT_MonEnterWorker_Portable+1c
0x1f9fc0be
mscorwks!JIT_MonEnterWorker_Portable+1c
0x1f9fc7a5
0x1f9fc0be
mscorwks!JIT_MonEnterWorker_Portable+1c
0x1f9fc7a5
0x1f9fc0be
mscorwks!JIT_MonEnterWorker_Portable+1c
0x1f9fc7a5
0x1f9fc0be
mscorwks!JIT_MonEnterWorker_Portable+1c
0x1f9fc7a5
0x1f9fc0be
mscorwks!JIT_MonEnterWorker_Portable+1c
0x1f9fc7a5
0x1f9fc0be
mscorwks!JIT_MonEnterWorker_Portable+1c
0x1f9fc7a5
0x1f9fc0be
mscorwks!JIT_MonEnterWorker_Portable+1c
0x1f9fc7a5
0x1f9fc0be
mscorwks!JIT_MonEnterWorker_Portable+1c
0x1f9fc7a5
0x1f9fc0be
mscorwks!JIT_MonEnterWorker_Portable+1c
0x1f9fc7a5
0x1f9fc0be
mscorwks!JIT_MonEnterWorker_Portable+1c
0x1f9fc7a5
0x1f9fc0be
mscorwks!JIT_MonEnterWorker_Portable+1c
0x1f9fc7a5
0x1f9fc0be
mscorwks!JIT_MonEnterWorker_Portable+1c
0x1f9fc7a5
0x1f9fc0be
mscorwks!JIT_MonEnterWorker_Portable+1c
0x1f9fc7a5
0x1f9fc0be
mscorwks!JIT_MonEnterWorker_Portable+1c
0x1f9fc7a5
0x1f9fc0be
mscorwks!JIT_MonEnterWorker_Portable+1c
0x1f9fc7a5
0x1f9fc0be
mscorwks!JIT_MonEnterWorker_Portable+1c
0x1f9fc7a5
0x1f9fc0be
mscorwks!JIT_MonEnterWorker_Portable+1c
0x1f9fc7a5
0x1f9fc0be
mscorwks!JIT_MonEnterWorker_Portable+1c
0x1f9fc7a5
0x1f9fc0be
0x1c7963a3
mscorwks!COMDelegate::GetInvokeMethod+20
0x02b3af64
0x0eb80778
0x02b6c93c
0x0e7628c8
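
A trace this long is easier to read once the repeated frames are folded
together. Below is a minimal Python sketch that does that. The
`collapse_repeats` helper is hypothetical, written for this post and not
part of DebugDiag, and it assumes the frames have been pasted in as a list
of strings, one per line.

```python
def collapse_repeats(frames):
    """Fold consecutive repeats of a group of frames.

    Returns a list of (group, count) pairs, where group is a list of
    frame strings and count is how many times it occurs back to back.
    """
    out = []
    i = 0
    n = len(frames)
    while i < n:
        best_len, best_count = 1, 1
        # Try every group length that could repeat at least twice from i.
        for length in range(1, (n - i) // 2 + 1):
            group = frames[i:i + length]
            count = 1
            while frames[i + count * length:i + (count + 1) * length] == group:
                count += 1
            # Prefer the repeat that covers the most frames; on a tie the
            # shorter group wins because it is tried first.
            if count > 1 and count * length > best_len * best_count:
                best_len, best_count = length, count
        out.append((frames[i:i + best_len], best_count))
        i += best_len * best_count
    return out
```

Applied to the trace above, this folds the long run of
mscorwks!JIT_MonEnterWorker_Portable+1c, 0x1f9fc7a5 and 0x1f9fc0be into a
single three-frame group with a repeat count. A group that repeats like
that may point to deep recursion in the application's own code (the
unresolved addresses are probably JIT-compiled managed methods), which
would be a reasonable place to start looking.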