From: Steven King on
Attached is the .config; it works on v2.6.32 but fails to boot on .33-rc1; but
if I deselect hrtimers && tickless then it works.

--
Steven King -- sfking at fdwdc dot com
From: john stultz on
On Fri, Dec 18, 2009 at 6:13 PM, Steven King <sfking(a)fdwdc.com> wrote:
> Attach is the .config; it works on v2.6.32 but fails to boot on .33-rc1; but
> if I deselect hrtimers && tickless then it works.

Sorry for the dup, forgot to cc lkml on my reply.

Does it fail to boot altogether? Or does it hang at some point in the dmesg
that you can point out?

Could you run the following so we can narrow down which clocksource you're using?
cat /sys/devices/system/clocksource/clocksource0/current_clocksource
cat /sys/devices/system/clocksource/clocksource0/available_clocksource

Then with the kernel that doesn't boot, go through the clocksources
listed in available_clocksource and try booting w/
"clocksource=<clock name>" and see if the behavior changes.

thanks
-john
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Steven King on
On Friday 18 December 2009 06:44:44 john stultz wrote:
> On Fri, Dec 18, 2009 at 6:13 PM, Steven King <sfking(a)fdwdc.com> wrote:
> > Attach is the .config; it works on v2.6.32 but fails to boot on .33-rc1;
> > but if I deselect hrtimers && tickless then it works.
>
> Sorry for the dup, forgot to cc lkml on my reply.
>
> Fails to boot all together? Or does it hang at some point in the dmesg
> that you can point out?

It fails to boot altogether; nothing on the serial console.
>
> Could you run the following so we can narrow down which clocksource your
> using? cat /sys/devices/system/clocksource/clocksource0/current_clocksource
> cat /sys/devices/system/clocksource/clocksource0/available_clocksource
>
> Then with the kernel that doesn't boot, go through the clocksources
> listed in available_clocksources and try booting w/
> "clocksource=<clock name>" and see if the behavior changes.

on the working .32 kernel:

# cat /sys/devices/system/clocksource/clocksource0/current_clocksource
pit
# cat /sys/devices/system/clocksource/clocksource0/available_clocksource
pit

Just to be sure, I tried clocksource=pit on the .33-rc1 kernel. It didn't make
any difference.

--
Steven King -- sfking at fdwdc dot com
From: john stultz on
On Fri, 2009-12-18 at 19:20 -0800, Steven King wrote:
> On Friday 18 December 2009 06:44:44 john stultz wrote:
> > On Fri, Dec 18, 2009 at 6:13 PM, Steven King <sfking(a)fdwdc.com> wrote:
> > > Attach is the .config; it works on v2.6.32 but fails to boot on .33-rc1;
> > > but if I deselect hrtimers && tickless then it works.
> >
> > Sorry for the dup, forgot to cc lkml on my reply.
> >
> > Fails to boot all together? Or does it hang at some point in the dmesg
> > that you can point out?
>
> fails to boot all together; nothing on the serial console.
> >
> > Could you run the following so we can narrow down which clocksource your
> > using? cat /sys/devices/system/clocksource/clocksource0/current_clocksource
> > cat /sys/devices/system/clocksource/clocksource0/available_clocksource
> >
> > Then with the kernel that doesn't boot, go through the clocksources
> > listed in available_clocksources and try booting w/
> > "clocksource=<clock name>" and see if the behavior changes.
>
> on the working .32 kernel:
>
> # cat /sys/devices/system/clocksource/clocksource0/current_clocksource
> pit
> # cat /sys/devices/system/clocksource/clocksource0/available_clocksource
> pit
>
> just to be sure, I tried clocksource=pit on the .33-rc1 kernel. It didnt make
> any difference.


Hrmm.. So looking at the code in arch/m68knommu/platform/coldfire/pit.c,
I'm a little confused on how this got marked as a continuous clocksource
(CLOCK_SOURCE_IS_CONTINUOUS), especially as it seems it couldn't handle
skipping an interrupt.

That said, I'm not sure how it worked in 2.6.32, as it's been that way
for a while, it seems. Maybe my assumptions about how the PIT works are wrong
(or just biased by how it works on x86)?

Greg, could you clarify how the PIT can be used as a clocksource if it's
also being used in oneshot mode?

Steven, I assume the patch below avoids the issue (by disabling highres
timers and nohz)?

thanks
-john



The m68knommu coldfire pit clocksource looks like it was incorrectly
marked as a continuous clocksource. From the looks of it, running with
it marked as a continuous clocksource could cause hangs when the system
switches to highres mode or enables nohz. I have no idea why it worked
in prior kernels, and I'm not 100% sure the following fix is really the
right solution.

This patch removes the CLOCK_SOURCE_IS_CONTINUOUS flag on the coldfire
pit clocksource. This will disallow systems using this clocksource from
entering oneshot mode (disabling highres timers and nohz).

Signed-off-by: John Stultz <johnstul(a)us.ibm.com>

---

diff --git a/arch/m68knommu/platform/coldfire/pit.c b/arch/m68knommu/platform/coldfire/pit.c
index d8720ee..aebea19 100644
--- a/arch/m68knommu/platform/coldfire/pit.c
+++ b/arch/m68knommu/platform/coldfire/pit.c
@@ -146,7 +146,6 @@ static struct clocksource pit_clk = {
 	.read = pit_read_clk,
 	.shift = 20,
 	.mask = CLOCKSOURCE_MASK(32),
-	.flags = CLOCK_SOURCE_IS_CONTINUOUS,
 };

/***************************************************************************/


From: Steven King on
On Friday 18 December 2009 08:04:23 john stultz wrote:
> On Fri, 2009-12-18 at 19:20 -0800, Steven King wrote:
> > On Friday 18 December 2009 06:44:44 john stultz wrote:
> > > On Fri, Dec 18, 2009 at 6:13 PM, Steven King <sfking(a)fdwdc.com> wrote:
> > > > Attach is the .config; it works on v2.6.32 but fails to boot on
> > > > .33-rc1; but if I deselect hrtimers && tickless then it works.
> > >
> > > Sorry for the dup, forgot to cc lkml on my reply.
> > >
> > > Fails to boot all together? Or does it hang at some point in the dmesg
> > > that you can point out?
> >
> > fails to boot all together; nothing on the serial console.
> >
> > > Could you run the following so we can narrow down which clocksource
> > > your using? cat
> > > /sys/devices/system/clocksource/clocksource0/current_clocksource cat
> > > /sys/devices/system/clocksource/clocksource0/available_clocksource
> > >
> > > Then with the kernel that doesn't boot, go through the clocksources
> > > listed in available_clocksources and try booting w/
> > > "clocksource=<clock name>" and see if the behavior changes.
> >
> > on the working .32 kernel:
> >
> > # cat /sys/devices/system/clocksource/clocksource0/current_clocksource
> > pit
> > # cat /sys/devices/system/clocksource/clocksource0/available_clocksource
> > pit
> >
> > just to be sure, I tried clocksource=pit on the .33-rc1 kernel. It didnt
> > make any difference.
>
> Hrmm.. So looking at the code in arch/m68knommu/platform/coldfire/pit.c,
> I'm a little confused on how this got marked as a continuous clocksource
> (CLOCK_SOURCE_IS_CONTINUOUS), especially as it seems it couldn't handle
> skipping an interrupt.
>
> That said, I'm not sure how it worked in 2.6.32, as its been that way
> for awhile it seems. Maybe my assumptions on how the PIT works is wrong
> (or just biased in how it works on x86)?
>
> Greg, could you clarify how the PIT can be used as a clocksource if its
> also being used in oneshot mode?
>
> Steven, I assume the patch below avoids the issue (by disabling highres
> timers and nohz)?

Yes.

I suspect it wasn't working correctly on earlier kernels; we just got away with
it. I had recently added ntpclient to this target, but the time reported by
date was always off by some odd amount. I had assumed that it was a busybox or
ntpclient issue but hadn't gotten around to tracking it down. With your patch
(or, as I just now verified, on .32 without no_hz and hrtimers) the system
time is now correct. I probably never would have made the connection.

Thank you John!

--
Steven King -- sfking at fdwdc dot com