From: Andrew Morton on
On Mon, 25 Jan 2010 22:26:39 -0600 Jason Wessel <jason.wessel(a)windriver.com> wrote:

> This is a regression fix against: 0f8e8ef7c204988246da5a42d576b7fa5277a8e4

It's conventional to quote the patch title as well as the hash, i.e.:

0f8e8ef7c204988246da5a42d576b7fa5277a8e4 ("clocksource: Simplify
clocksource watchdog resume logic")

> Spin locks were added to the clocksource_resume_watchdog() which cause
> the kernel debugger to deadlock on an SMP system frequently.

Please fully describe the deadlock. Without that analysis, the only
way we can work it out is by guessing. This makes it hard for others to
suggest alternative fixes.

> The kernel debugger can try for the lock, but if it fails it should
> continue to touch the clocksource watchdog anyway, else it will trip
> if the general kernel execution has been paused for too long.
>
> This introduces a possible race condition where the kernel debugger
> might not process the list correctly if a clocksource is being added
> or removed at the time of this call. This race is sufficiently rare vs
> having the kernel debugger hang the kernel

A trylock is a pretty ugly "solution" to a locking bug.

> CC: Thomas Gleixner <tglx(a)linutronix.de>
> CC: Martin Schwidefsky <schwidefsky(a)de.ibm.com>
> CC: John Stultz <johnstul(a)us.ibm.com>
> CC: Andrew Morton <akpm(a)linux-foundation.org>
> CC: Magnus Damm <damm(a)igel.co.jp>
> Signed-off-by: Jason Wessel <jason.wessel(a)windriver.com>
> ---
> kernel/time/clocksource.c | 7 ++++++-
> 1 files changed, 6 insertions(+), 1 deletions(-)
>
> diff --git a/kernel/time/clocksource.c b/kernel/time/clocksource.c
> index e85c234..74f9ba6 100644
> --- a/kernel/time/clocksource.c
> +++ b/kernel/time/clocksource.c
> @@ -463,7 +463,12 @@ void clocksource_resume(void)
>   */
>  void clocksource_touch_watchdog(void)
>  {
> -	clocksource_resume_watchdog();
> +	unsigned long flags;
> +
> +	int got_lock = spin_trylock_irqsave(&watchdog_lock, flags);
> +	clocksource_reset_watchdog();
> +	if (got_lock)
> +		spin_unlock_irqrestore(&watchdog_lock, flags);
>  }

If we're going to do this then clocksource_reset_watchdog() should be
uninlined. It shouldn't have been inlined in the first place.

This trylock should be accompanied by a comment which fully
describes the reasons for its presence. Without that, how can the
code reader work this out?

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Martin Schwidefsky on
On Mon, 25 Jan 2010 22:26:39 -0600
Jason Wessel <jason.wessel(a)windriver.com> wrote:

> This is a regression fix against: 0f8e8ef7c204988246da5a42d576b7fa5277a8e4
>
> Spin locks were added to the clocksource_resume_watchdog() which cause
> the kernel debugger to deadlock on an SMP system frequently.
>
> The kernel debugger can try for the lock, but if it fails it should
> continue to touch the clocksource watchdog anyway, else it will trip
> if the general kernel execution has been paused for too long.
>
> This introduces a possible race condition where the kernel debugger
> might not process the list correctly if a clocksource is being added
> or removed at the time of this call. This race is sufficiently rare vs
> having the kernel debugger hang the kernel
>
> CC: Thomas Gleixner <tglx(a)linutronix.de>
> CC: Martin Schwidefsky <schwidefsky(a)de.ibm.com>
> CC: John Stultz <johnstul(a)us.ibm.com>
> CC: Andrew Morton <akpm(a)linux-foundation.org>
> CC: Magnus Damm <damm(a)igel.co.jp>
> Signed-off-by: Jason Wessel <jason.wessel(a)windriver.com>

The first question I would ask is why does the kernel deadlock? Can we
have a backchain of a deadlock please?

Hmm, there are all kinds of races if the watchdog code gets interrupted
by the kernel debugger. Wouldn't it be better to just disable the
watchdog while the kernel debugger is active?

--
blue skies,
Martin.

"Reality continues to ruin my life." - Calvin.

From: Thomas Gleixner on
On Mon, 25 Jan 2010, Jason Wessel wrote:
> This is a regression fix against: 0f8e8ef7c204988246da5a42d576b7fa5277a8e4
>
> Spin locks were added to the clocksource_resume_watchdog() which cause
> the kernel debugger to deadlock on an SMP system frequently.
>
> The kernel debugger can try for the lock, but if it fails it should
> continue to touch the clocksource watchdog anyway, else it will trip
> if the general kernel execution has been paused for too long.
>
> This introduces a possible race condition where the kernel debugger
> might not process the list correctly if a clocksource is being added
> or removed at the time of this call. This race is sufficiently rare vs
> having the kernel debugger hang the kernel

I'm not really happy about adding a race condition :)

If you stop the kernel in the middle of the watchdog code
(i.e. watchdog_lock is held) then clocksource_reset_watchdog() is not
really a guarantee to keep the TSC alive.

>  void clocksource_touch_watchdog(void)
>  {
> -	clocksource_resume_watchdog();
> +	unsigned long flags;
> +
> +	int got_lock = spin_trylock_irqsave(&watchdog_lock, flags);

So I prefer

	if (!spin_trylock_irqsave(&watchdog_lock, flags))
		return;

If that results in TSC being marked unstable then that is way better
than having a race which might even crash or lock the machine when the
stop happened in the middle of a list_add().
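
The early-return variant can be modelled in userspace. What follows is a minimal sketch, not the kernel code: the toy_* names, the test-and-set lock and the one-flag "clocksource" are all hypothetical stand-ins for watchdog_lock, the watchdog list and clocksource_reset_watchdog(). It only shows the control flow: if the lock is busy, the list is left alone.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for watchdog_lock: a minimal test-and-set lock. */
atomic_flag toy_watchdog_lock = ATOMIC_FLAG_INIT;

/* Hypothetical stand-in for an entry on the watchdog list. */
struct toy_clocksource {
	bool watchdog_armed;		/* cleared by a reset */
	struct toy_clocksource *next;
};

struct toy_clocksource *toy_watchdog_list;

/* Returns true if the lock was free and is now held by the caller. */
bool toy_trylock(void)
{
	/* test_and_set returns the previous value; false means we took it. */
	return !atomic_flag_test_and_set(&toy_watchdog_lock);
}

void toy_unlock(void)
{
	atomic_flag_clear(&toy_watchdog_lock);
}

/* Walks the list and resets every entry; only safe with the lock held. */
void toy_reset_watchdog(void)
{
	for (struct toy_clocksource *cs = toy_watchdog_list; cs; cs = cs->next)
		cs->watchdog_armed = false;
}

/*
 * Early-return variant: if the lock is busy, do not touch the list.
 * Returns true if the reset ran, false if it was skipped.
 */
bool toy_touch_watchdog(void)
{
	if (!toy_trylock())
		return false;
	toy_reset_watchdog();
	toy_unlock();
	return true;
}
```

In this model a skipped reset corresponds to accepting that the watchdog may fire, while the list itself is never walked without the lock.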

Thanks,

tglx
From: Thomas Gleixner on
On Tue, 26 Jan 2010, Martin Schwidefsky wrote:
> On Mon, 25 Jan 2010 22:26:39 -0600
> Jason Wessel <jason.wessel(a)windriver.com> wrote:
>
> > This is a regression fix against: 0f8e8ef7c204988246da5a42d576b7fa5277a8e4
> >
> > Spin locks were added to the clocksource_resume_watchdog() which cause
> > the kernel debugger to deadlock on an SMP system frequently.
> >
> > The kernel debugger can try for the lock, but if it fails it should
> > continue to touch the clocksource watchdog anyway, else it will trip
> > if the general kernel execution has been paused for too long.
> >
> > This introduces a possible race condition where the kernel debugger
> > might not process the list correctly if a clocksource is being added
> > or removed at the time of this call. This race is sufficiently rare vs
> > having the kernel debugger hang the kernel
> >
> > CC: Thomas Gleixner <tglx(a)linutronix.de>
> > CC: Martin Schwidefsky <schwidefsky(a)de.ibm.com>
> > CC: John Stultz <johnstul(a)us.ibm.com>
> > CC: Andrew Morton <akpm(a)linux-foundation.org>
> > CC: Magnus Damm <damm(a)igel.co.jp>
> > Signed-off-by: Jason Wessel <jason.wessel(a)windriver.com>
>
> The first question I would ask is why does the kernel deadlock? Can we
> have a backchain of a deadlock please?

The problem arises when the kernel is stopped inside the watchdog code
with watchdog_lock held. When kgdb restarts execution it touches the
watchdog to keep TSC from being marked unstable, and that touch spins
on the watchdog_lock which is still held.
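
The hang can be illustrated with a small userspace model. This is a sketch under assumptions: the demo_* names are hypothetical, the lock is a plain C11 atomic_flag rather than a kernel spinlock, and the spin is bounded only so that the example terminates. A real spin_lock() has no such bound, which is where the deadlock comes from.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for watchdog_lock. */
atomic_flag demo_lock = ATOMIC_FLAG_INIT;

/*
 * Spin for at most max_spins attempts. Returns true if the lock was
 * acquired, false if it was still held when the attempts ran out.
 */
bool demo_lock_bounded(unsigned long max_spins)
{
	for (unsigned long i = 0; i < max_spins; i++) {
		if (!atomic_flag_test_and_set(&demo_lock))
			return true;
	}
	return false;
}

void demo_unlock(void)
{
	atomic_flag_clear(&demo_lock);
}

/*
 * Models the debugger's resume path wanting the lock. The holder is
 * frozen by the debugger, so nobody can release the lock while we spin.
 * Returns true if the resume path got (and released) the lock.
 */
bool demo_resume_path(unsigned long max_spins)
{
	if (!demo_lock_bounded(max_spins))
		return false;
	demo_unlock();
	return true;
}
```

If the lock is taken first (the "stopped inside the watchdog code" case) and demo_resume_path() is then called, it fails however large max_spins is, because the only context that could release the lock is the one waiting for the resume.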

> Hmm, there are all kinds of races if the watchdog code gets interrupted
> by the kernel debugger. Wouldn't it be better to just disable the
> watchdog while the kernel debugger is active?

No, we can keep it and in most cases clocksource_touch_watchdog()
helps to keep TSC alive. A simple "if (!trylock) return;" should solve
the deadlock problem for kgdb without opening a can of worms.

Thanks,

tglx
From: Thomas Gleixner on
On Tue, 26 Jan 2010, Dongdong Deng wrote:
> On Tue, Jan 26, 2010 at 4:50 PM, Thomas Gleixner <tglx(a)linutronix.de> wrote:
> > On Tue, 26 Jan 2010, Martin Schwidefsky wrote:
> >> On Mon, 25 Jan 2010 22:26:39 -0600
> >> Jason Wessel <jason.wessel(a)windriver.com> wrote:
> >>
> >> > This is a regression fix against: 0f8e8ef7c204988246da5a42d576b7fa5277a8e4
> >> >
> >> > Spin locks were added to the clocksource_resume_watchdog() which cause
> >> > the kernel debugger to deadlock on an SMP system frequently.
> >> >
> >> > The kernel debugger can try for the lock, but if it fails it should
> >> > continue to touch the clocksource watchdog anyway, else it will trip
> >> > if the general kernel execution has been paused for too long.
> >> >
> >> > This introduces a possible race condition where the kernel debugger
> >> > might not process the list correctly if a clocksource is being added
> >> > or removed at the time of this call.  This race is sufficiently rare vs
> >> > having the kernel debugger hang the kernel
> >> >
> >> > CC: Thomas Gleixner <tglx(a)linutronix.de>
> >> > CC: Martin Schwidefsky <schwidefsky(a)de.ibm.com>
> >> > CC: John Stultz <johnstul(a)us.ibm.com>
> >> > CC: Andrew Morton <akpm(a)linux-foundation.org>
> >> > CC: Magnus Damm <damm(a)igel.co.jp>
> >> > Signed-off-by: Jason Wessel <jason.wessel(a)windriver.com>
> >>
> >> The first question I would ask is why does the kernel deadlock? Can we
> >> have a backchain of a deadlock please?
> >
> > The problem arises when the kernel is stopped inside the watchdog code
> > with watchdog_lock held. When kgdb restarts execution then it touches
> > the watchdog to avoid that TSC gets marked unstable.
> >
> >> Hmm, there are all kinds of races if the watchdog code gets interrupted
> >> by the kernel debugger. Wouldn't it be better to just disable the
> >> watchdog while the kernel debugger is active?
> >
> > No, we can keep it and in most cases clocksource_touch_watchdog()
> > helps to keep TSC alive. A simple "if (!trylock) return;" should solve
> > the deadlock problem for kgdb without opening a can of worms.
>
> Is it possible that we reset the clocksource watchdog inside
> clocksource_watchdog()?
>
> From the code, resetting the clocksource watchdog just sets the
> CLOCK_SOURCE_WATCHDOG flag, so if we reset it before using it, I
> think the logic will be right.

No, it's not. It just brings back the old flag based logic which we
removed.

The correct way to solve this is a documented

	if (!trylock())
		return;

in clocksource_touch_watchdog(). And that's what I'm going to push
linuswards.

There is no sane way to reliably prevent TSC from becoming unstable
when kgdb stops the kernel inside the watchdog code. And I do not care
about that at all.

I'm not going to clutter code with crazy workarounds just because some
people believe that using a kernel debugger is a good idea. If people
insist on using kgdb then the possible "TSC becomes unstable" side
effect is the least of their problems.

Thanks,

tglx