From: Frederic Weisbecker on
On Wed, Aug 04, 2010 at 09:53:03AM +0800, Li Zefan wrote:
> CONFIG_DETECT_SOFTLOCKUP has been removed, so switch the
> default value to LOCKUP_DETECTOR.
>
> Also fix the help text of BOOT_PRINTK_DELAY.
>
> Signed-off-by: Li Zefan <lizf(a)cn.fujitsu.com>
> ---


Acked-by: Frederic Weisbecker <fweisbec(a)gmail.com>

Thanks.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Frederic Weisbecker on
(More Cc)

On Sat, Aug 07, 2010 at 09:01:35AM +0200, Ingo Molnar wrote:
>
> (Linus Cc:-ed)
>
> * Frederic Weisbecker <fweisbec(a)gmail.com> wrote:
>
> > On Wed, Aug 04, 2010 at 09:53:03AM +0800, Li Zefan wrote:
> > > CONFIG_DETECT_SOFTLOCKUP has been removed, so switch the
> > > default value to LOCKUP_DETECTOR.
> > >
> > > Also fix the help text of BOOT_PRINTK_DELAY.
> > >
> > > Signed-off-by: Li Zefan <lizf(a)cn.fujitsu.com>
> > > ---
> >
> >
> > Acked-by: Frederic Weisbecker <fweisbec(a)gmail.com>
> >
> > Thanks.
>
> The thing is, CONFIG_DETECT_SOFTLOCKUP was default-y before, so many people
> had it enabled [and had it forced-enabled if DEBUG_KERNEL was off], even if
> they didnt really want or need it.



Hmm. It was:

config DETECT_SOFTLOCKUP
	bool "Detect Soft Lockups"
	depends on DEBUG_KERNEL && !S390
	default y


It means it's default enabled only if DEBUG_KERNEL, right?
Then if you don't select CONFIG_DEBUG_KERNEL, it's fine as it won't
be selected.
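
To spell that out, here is a minimal sketch of how the old entry behaves.
The DEBUG_KERNEL stub is simplified for illustration, and the outcome
comments assume a non-s390 build:

```kconfig
# Simplified stub, for illustration only.
config DEBUG_KERNEL
	bool "Kernel debugging"

config DETECT_SOFTLOCKUP
	bool "Detect Soft Lockups"
	depends on DEBUG_KERNEL && !S390
	default y

# "default y" only applies once the dependencies are met:
#   DEBUG_KERNEL=n -> the prompt is hidden and DETECT_SOFTLOCKUP stays unset
#   DEBUG_KERNEL=y -> DETECT_SOFTLOCKUP defaults to y, but can be turned off
```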

But I agree with you. There is a bunch of config options for which
selection is a duty when you are a kernel developer:
PROVE_LOCKING, DETECT_HUNG_TASK, DEBUG_PREEMPT, PROVE_RCU, etc...
Because they all show (or prove we can have) bugs that one might miss
without these options. Softlockups are rarely part of them because even
without the lockup detector enabled, you'll observe something is wrong.



> So i turned off the new generic watchdog code's default intentionally - as it
> clearly does not cure cancer ;-)


:-)



> I think distros will enable it, and most testers will as well. Those who dont
> enable it and run into a lockup have an easy option to enable.



Why would distros want to enable it? The lockup detector introduces overhead.



> Maybe a better change would be to make it more generally available - right now
> it's:
>
> config LOCKUP_DETECTOR
> 	bool "Detect Hard and Soft Lockups"
> 	depends on DEBUG_KERNEL && !S390
>
> which means that it cannot be enabled when DEBUG_KERNEL is off.
>
> So i think we should:
>
> - Remove the s390 hack and add an ARCH_HAS_LOCKUP_DETECTOR flag



If we do this, we'll need to add this config on every arch but s390.
We'd better have ARCH_WANT_NO_LOCKUP_DETECTOR. I know that
"negative" meaning configs suck, but otherwise we would lose this
support on many archs.

Why doesn't s390 want the softlockup detector to begin with?
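
A rough sketch of the two options, to make the tradeoff concrete. Only the
symbol names come from this thread; where the select lines would live and
the exact shape of the entries are assumptions, not an actual patch.

Positive flag, every supporting arch has to opt in:

```kconfig
config ARCH_HAS_LOCKUP_DETECTOR
	bool

config LOCKUP_DETECTOR
	bool "Detect Hard and Soft Lockups"
	depends on DEBUG_KERNEL && ARCH_HAS_LOCKUP_DETECTOR

# Assumed: each arch except s390 would then need, in its own Kconfig:
#	select ARCH_HAS_LOCKUP_DETECTOR
```

Negative flag, only s390 has to opt out:

```kconfig
config ARCH_WANT_NO_LOCKUP_DETECTOR
	bool

config LOCKUP_DETECTOR
	bool "Detect Hard and Soft Lockups"
	depends on DEBUG_KERNEL && !ARCH_WANT_NO_LOCKUP_DETECTOR

# Assumed: only the s390 Kconfig would need:
#	select ARCH_WANT_NO_LOCKUP_DETECTOR
```

With the first variant, any arch that forgets the select silently loses the
detector. With the second, nothing changes for anyone but s390.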



> - Remove the DEBUG_KERNEL dependency


Yeah.

From: Frederic Weisbecker on
On Sun, Aug 08, 2010 at 11:23:11PM +0200, Ingo Molnar wrote:
>
> * Frederic Weisbecker <fweisbec(a)gmail.com> wrote:
>
> > > The thing is, CONFIG_DETECT_SOFTLOCKUP was default-y before, so many
> > > people had it enabled [and had it forced-enabled if DEBUG_KERNEL was off],
> > > even if they didnt really want or need it.
> >
> > Hmm. It was:
> >
> > config DETECT_SOFTLOCKUP
> > 	bool "Detect Soft Lockups"
> > 	depends on DEBUG_KERNEL && !S390
> > 	default y
> >
> > It means it's default enabled only if DEBUG_KERNEL, right? Then if you don't
> > select CONFIG_DEBUG_KERNEL, it's fine as it won't be selected.
>
> Indeed, you are right.
>
> Anyway, i think the general point remains: i'm not sure we should
> default-enable this feature.



Yeah, right.



> > But I agree with you. There is a bunch of config options for which selection
> > is a duty when you are a kernel developer: PROVE_LOCKING, DETECT_HUNG_TASK,
> > DEBUG_PREEMPT, PROVE_RCU, etc... Because they all show (or prove we can
> > have) bugs that one might miss without these options. Softlockups are rarely
> > part of them because even without the lockup detector enabled, you'll
> > observe something is wrong.
>
> Note that it's now detecting all kinds of lockups: softlockups, hard lockups
> and even unkillable hung tasks.
>
> Ingo



The unkillable hung task detector remains separate. Maybe from the config point
of view it could be joined, but from an implementation point of view it has too
little to share with the lockup detector: it doesn't need a real-time task, nor
a timer, etc...

From: Heiko Carstens on
On Sun, Aug 08, 2010 at 09:58:42PM +0200, Frederic Weisbecker wrote:
> > Maybe a better change would be to make it more generally available - right now
> > it's:
> >
> > config LOCKUP_DETECTOR
> > 	bool "Detect Hard and Soft Lockups"
> > 	depends on DEBUG_KERNEL && !S390
> >
> > which means that it cannot be enabled when DEBUG_KERNEL is off.
> >
> > So i think we should:
> >
> > - Remove the s390 hack and add an ARCH_HAS_LOCKUP_DETECTOR flag
>
>
>
> If we do this, we'll need to add this config on every arch but s390.
> We'd better have ARCH_WANT_NO_LOCKUP_DETECTOR. I know that
> "negative" meaning configs suck, but otherwise we would lose this
> support on many archs.
>
> Why doesn't s390 want the softlockup detector to begin with?

If I remember correctly, we disabled it back then because we got
false positives. The reason for those was that the softlockup detector
did not take steal time into account.
E.g. if a guest cpu runs for 10 seconds, but the hypervisor steals
9 of those seconds in order to run other guest cpus, this specific cpu would
still think it ran for 10 seconds and would therefore generate invalid warnings.