From: Thomas Gleixner on
On Mon, 25 Jan 2010, Jason Wessel wrote:
> This is a regression fix against: 0f8e8ef7c204988246da5a42d576b7fa5277a8e4
>
> Spin locks were added to clocksource_resume_watchdog(), which frequently
> causes the kernel debugger to deadlock on an SMP system.
>
> The kernel debugger can try for the lock, but if it fails it should
> continue to touch the clocksource watchdog anyway, else it will trip
> if the general kernel execution has been paused for too long.
>
> This introduces a possible race condition where the kernel debugger
> might not process the list correctly if a clocksource is being added
> or removed at the time of this call. This race is sufficiently rare vs
> having the kernel debugger hang the kernel.
>
> CC: Thomas Gleixner <tglx(a)linutronix.de>
> CC: Martin Schwidefsky <schwidefsky(a)de.ibm.com>
> CC: John Stultz <johnstul(a)us.ibm.com>
> CC: Andrew Morton <akpm(a)linux-foundation.org>
> CC: Magnus Damm <damm(a)igel.co.jp>
> Signed-off-by: Jason Wessel <jason.wessel(a)windriver.com>
> ---
> kernel/time/clocksource.c | 7 ++++++-
> 1 files changed, 6 insertions(+), 1 deletions(-)
>
> diff --git a/kernel/time/clocksource.c b/kernel/time/clocksource.c
> index e85c234..74f9ba6 100644
> --- a/kernel/time/clocksource.c
> +++ b/kernel/time/clocksource.c
> @@ -463,7 +463,12 @@ void clocksource_resume(void)
> */
> void clocksource_touch_watchdog(void)
> {
> - clocksource_resume_watchdog();
> + unsigned long flags;
> +
> + int got_lock = spin_trylock_irqsave(&watchdog_lock, flags);
> + clocksource_reset_watchdog();
> + if (got_lock)
> + spin_unlock_irqrestore(&watchdog_lock, flags);

Just for the record: this patch would not compile on any platform
which has CONFIG_CLOCKSOURCE_WATCHDOG=n.
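
To illustrate the build problem, here is a minimal user-space sketch. It
assumes that the lock and the reset helper are only defined when the
option is enabled, while the touch function is built unconditionally.
All demo_* names and the DEMO_CLOCKSOURCE_WATCHDOG macro are hypothetical
stand-ins, not kernel symbols, and this is not a proposed fix. It only
shows that the unguarded caller may reference nothing but names that
exist in both branches:

```c
#include <stdbool.h>

/* Hypothetical stand-in for CONFIG_CLOCKSOURCE_WATCHDOG; 0 models "=n". */
#ifndef DEMO_CLOCKSOURCE_WATCHDOG
#define DEMO_CLOCKSOURCE_WATCHDOG 0
#endif

#if DEMO_CLOCKSOURCE_WATCHDOG
/* These symbols exist only when the option is enabled. */
static bool demo_watchdog_lock;	/* toy lock: true means held */
static int demo_reset_count;	/* counts watchdog resets */

static bool demo_trylock(bool *lock)
{
	if (*lock)
		return false;
	*lock = true;
	return true;
}

static void demo_unlock(bool *lock)
{
	*lock = false;
}

/* Mirrors the quoted patch: reset even if the lock could not be taken. */
static int demo_resume_watchdog(void)
{
	bool got_lock = demo_trylock(&demo_watchdog_lock);

	demo_reset_count++;
	if (got_lock)
		demo_unlock(&demo_watchdog_lock);
	return 1;
}
#else
/* Stub for the disabled case: no lock, no reset, nothing to touch. */
static inline int demo_resume_watchdog(void)
{
	return 0;
}
#endif

/*
 * Unguarded caller. Referencing demo_watchdog_lock directly from here
 * would fail to compile in the "=n" case, because the symbol is not
 * defined there. Returns 1 if a reset happened, 0 if stubbed out.
 */
int demo_touch_watchdog(void)
{
	return demo_resume_watchdog();
}
```

Building with -DDEMO_CLOCKSOURCE_WATCHDOG=1 selects the locked variant;
the default build selects the stub, and the caller compiles either way.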

Thanks,

tglx
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Thomas Gleixner on
On Tue, 26 Jan 2010, Thomas Gleixner wrote:
> There is no sane way to reliably prevent TSC from becoming unstable
> when kgdb stops the kernel inside the watchdog code. And I do not care
> about that at all.
>
> I'm not going to clutter code with crazy workarounds just because some
> people believe that using a kernel debugger is a good idea. If people
> insist on using kgdb then the possible "TSC becomes unstable" side
> effect is the least of their problems.

Btw, if the kernel uses tick-based timekeeping or a clock source which
wraps in rather short intervals (e.g. pm-timer wraps after ~4.6
seconds), stopping the kernel with kgdb will inevitably screw up
timekeeping anyway.
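
A rough worked example of the wrap problem, assuming the usual ACPI PM
timer parameters (a 24-bit counter running at 3.579545 MHz; neither
number is stated above, and the demo_* names are made up for this
sketch). With those values the counter wraps after roughly 4.7 seconds,
and any stop longer than that is indistinguishable from a shorter one:

```c
#include <stdint.h>

/* Assumed ACPI PM timer parameters, see the note above. */
#define DEMO_PMTMR_HZ	3579545u
#define DEMO_PMTMR_BITS	24u

/* Seconds until a free-running counter of the given width wraps. */
double demo_wrap_seconds(unsigned int bits, double hz)
{
	return (double)(1ULL << bits) / hz;
}

/*
 * Ticks elapsed between two raw readings, computed modulo the counter
 * width. This is all the timekeeping code can see: full wraps that
 * happened between the two readings are lost.
 */
uint32_t demo_delta_ticks(uint32_t before, uint32_t after, unsigned int bits)
{
	uint32_t mask = (bits >= 32) ? UINT32_MAX : ((1u << bits) - 1u);

	return (after - before) & mask;
}
```

For instance, a 5 second stop is 17897725 ticks at that frequency, but
the masked delta comes out as 1120509 ticks, i.e. about 0.31 seconds, so
the remaining time silently disappears.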

So there is really no reason to worry about TSC becoming unstable.

There is only one really sensible solution for this:

Do _not_ use kgdb - which is the modus operandi of every sane kernel
developer on the planet.

Thanks,

tglx