From: Jason Barnett on
I've developed a Windows Service that is designed to perform some work every
five minutes. Within OnStart, I'm creating a worker thread that loops until
the thread is aborted (usually when the service is stopped). Within the
loop, the thread calls a function to perform the work, logs success/failure
of the work performed, then sleeps for the remaining 5 minutes.

I'm experiencing a problem where the service seems to hang; nothing is
logged, but the service still appears as "Started". This problem occurs
intermittently. The only resolution I've found thus far has been to cycle the
service. At one client site, the problem occurs about once every few months,
so it's not a big deal to cycle the service. At another client site, the
problem occurs about every week or two, so cycling the service there is very
inconvenient.

Does anyone have any idea how I might troubleshoot this issue?
Alternatively, do you know of any good programming pattern examples that
I could model my service after?
From: Jeroen Mostert on
On 2010-05-14 20:39, Jason Barnett wrote:
> I've developed a Windows Service that is designed to perform some work every
> five minutes. Within OnStart, I'm creating a worker thread that loops until
> the thread is aborted (usually when the service is stopped). Within the
> loop, the thread calls a function to perform the work, logs success/failure
> of the work performed, then sleeps for the remaining 5 minutes.
>
> I'm experiencing a problem where the service seems to hang; nothing is
> logged, but the service still appears as "Started". This problem occurs
> intermittently. The only resolution I've found thus far has been to cycle the
> service. At one client site, the problem occurs about once every few months,
> so it's not a big deal to cycle the service. At another client site, the
> problem occurs about every week or two, so cycling the service there is very
> inconvenient.
>
> Does anyone have any idea how I might troubleshoot this issue?
> Alternatively, do you know of any good programming pattern examples that
> I could model my service after?

It sounds like an exception that should have gone unhandled is being
handled, causing your thread to exit and the service to effectively stop
processing. It is of course also possible that your code simply contains a
logic error that trips up the loop.

A service does not stop until it's explicitly stopped or until it crashes.
In particular, it will not stop if the last thread you created exits,
because there is always at least one framework thread running still (that
listens to service events). This is why it's very important to make sure
that if your worker thread dies, it does so noisily.

General advice for making sure you're not silently failing:

- Attach an event handler to AppDomain.UnhandledException and log anything
it receives;

- Do not catch Exception, catch specific Exception subtypes instead;

- Do not use System.Timers.Timer; use System.Threading.Timer instead
(Timers.Timer silently swallows exceptions);

- If you're still using .NET 1.x, upgrade to 2.0. If you cannot, as a
compromise, add an outer-level exception handler to your thread that catches
Exception, logs the exception and calls Environment.Exit() with a nonzero
exit code.
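
To illustrate the first two points, here is a minimal sketch, not a drop-in implementation. WorkerHost, Log and TryDoWork are hypothetical names, and IOException only stands in for whatever specific failure your work function is expected to produce:

```csharp
using System;
using System.IO;

public static class WorkerHost
{
    // Hypothetical logging helper; substitute the service's real logger.
    public static void Log(string message)
    {
        Console.Error.WriteLine(DateTime.UtcNow.ToString("o") + " " + message);
    }

    // Call once at startup, before any worker thread is created.
    public static void InstallLastChanceLogging()
    {
        AppDomain.CurrentDomain.UnhandledException += (sender, e) =>
            Log("Unhandled exception (terminating=" + e.IsTerminating + "): "
                + e.ExceptionObject);
    }

    // Runs one unit of work. Only the expected failure type is caught;
    // anything else propagates and takes the process down noisily.
    public static bool TryDoWork(Action work)
    {
        try
        {
            work();
            return true;
        }
        catch (IOException ex) // assumption: the one failure we expect
        {
            Log("Work failed, will retry next cycle: " + ex.Message);
            return false;
        }
    }
}
```

The worker loop would call TryDoWork once per cycle. On .NET 2.0 and later, an exception that escapes the worker thread raises UnhandledException and terminates the process, so the failure shows up in the log and in the service state instead of leaving a "Started" service that does nothing.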

In rare circumstances it's possible for a managed thread to exit abnormally
without getting a managed exception, but this requires that the thread is
calling unmanaged code that experiences severe corruption (or that calls
ExitThread() out of order). If necessary you can set up a watchdog thread
whose only job it is to periodically check up on the worker thread with
Thread.IsAlive, but this in itself will not allow you to diagnose what
causes the thread to exit. A problem like this should be made reproducible
so you can run the service under a debugger and catch the thread exit (tools
like Process Monitor can also help with this).
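
A watchdog along those lines might look like the sketch below. It uses a System.Threading.Timer rather than a dedicated thread, which is a substitution on my part, and ThreadWatchdog and onDead are hypothetical names:

```csharp
using System;
using System.Threading;

public sealed class ThreadWatchdog : IDisposable
{
    private readonly Thread _worker;
    private readonly Action _onDead;
    private readonly Timer _timer;
    private int _reported; // 0 = not yet reported, 1 = reported

    // The worker should already be started: IsAlive is also false
    // for a thread that has not been started yet.
    public ThreadWatchdog(Thread worker, TimeSpan interval, Action onDead)
    {
        _worker = worker;
        _onDead = onDead;
        _timer = new Timer(Check, null, interval, interval);
    }

    private void Check(object state)
    {
        // IsAlive turns false once the thread has terminated, however
        // that happened. Interlocked ensures onDead runs only once even
        // if timer callbacks overlap.
        if (!_worker.IsAlive && Interlocked.Exchange(ref _reported, 1) == 0)
        {
            _onDead();
        }
    }

    public void Dispose()
    {
        _timer.Dispose();
    }
}
```

In the service you would create it right after starting the worker thread and have onDead log the event and, if you want the failure to be visible, exit the process. As said, this only tells you that the thread died, not why.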

That covers exceptions; it's also possible you've made a coding mistake that
causes the loop to stall or the thread to exit when you're not expecting it.
To effectively debug a service, make it so it can run as a console
application when started interactively. To do this, modify .Main() to look
like this:

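public static void Main(string[] args) {
    if (Environment.UserInteractive) {
        // Not running as a service: drive the service by hand from a console.
        // OnStart/OnStop are protected, so Main must be a member of MyService.
        MyService s = new MyService();
        s.OnStart(args);
        Console.ReadLine();   // press Enter to stop
        s.OnStop();
    } else {
        ServiceBase.Run(new MyService());
    }
}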

If you can't reproduce the stalling on your own machine, try pinning the
problem down with logging. Tools like the aforementioned Process Monitor
will allow you to analyze the problem on the client sites (with a little
help from your clients).

--
J.
From: Jason Barnett on
Thanks! All of this information is very valuable to me, though I had tried
some of your advice prior to posting my dilemma.

Previously, I did find a logic error: Thread.Sleep was called with a
calculated value that could have been -1 (Timeout.Infinite), which would put
the thread to sleep indefinitely. I've adjusted the code and added as much
logging as possible (including at the AppDomain.UnhandledException level).
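
For anyone hitting the same thing, the guard can be as small as the sketch below. Schedule and RemainingDelay are illustrative names only, not the actual code from my service:

```csharp
using System;

public static class Schedule
{
    // Returns how long to sleep so that cycles start `period` apart.
    // Never returns a negative value: Thread.Sleep(-1) means "sleep
    // forever", and anything below -1 throws ArgumentOutOfRangeException.
    public static TimeSpan RemainingDelay(TimeSpan period, TimeSpan elapsed)
    {
        TimeSpan remaining = period - elapsed;
        return remaining > TimeSpan.Zero ? remaining : TimeSpan.Zero;
    }
}
```

The loop then calls Thread.Sleep(Schedule.RemainingDelay(TimeSpan.FromMinutes(5), elapsed)), where elapsed is how long the work took, for example measured with a Stopwatch. If the work overruns the five minutes, the next cycle simply starts immediately.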

I'm going to take some time to review my code based on some of your other
suggestions and see if I can pinpoint and/or correct the problem.