From: john stultz on
On Thu, 2010-08-05 at 07:28 -0500, Jason Wessel wrote:
> The tv_nsec is a long and when added to the shift value it can wrap
> and become negative which later causes looping problems in the
> getrawmonotonic(). The edge case occurs when the system has slept for
> a short period of time of ~2 seconds.

Ah, good catch!

I reworked some of the variable names to make a little more sense and
simplified the accumulation. Do you mind giving this a test in your
environment that triggered the issue to make sure nothing else slipped
in?

thanks
-john



From 512349b1f7ab0d9b6dff5e33bf4820a50e79f862 Mon Sep 17 00:00:00 2001
From: Jason Wessel <jason.wessel(a)windriver.com>
Date: Thu, 5 Aug 2010 07:28:32 -0500
Subject: [PATCH] timekeeping: Fix overflow in rawtime tv_nsec on 32 bit archs

The tv_nsec is a long and when added to the shifted interval it can wrap
and become negative which later causes looping problems in the
getrawmonotonic(). The edge case occurs when the system has slept for
a short period of time of ~2 seconds.

A trace printk of the values in this patch illustrates the problem:

ftrace time stamp: log
43.716079: logarithmic_accumulation: raw: 3d0913 tv_nsec d687faa
43.718513: logarithmic_accumulation: raw: 3d0913 tv_nsec da588bd
43.722161: logarithmic_accumulation: raw: 3d0913 tv_nsec de291d0
46.349925: logarithmic_accumulation: raw: 7a122600 tv_nsec e1f9ae3
46.349930: logarithmic_accumulation: raw: 1e848980 tv_nsec 8831c0e3

The kernel starts looping at 46.349925 in the getrawmonotonic() due to
the negative value from adding the raw value to tv_nsec.
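
The wrap can be reproduced in userspace with the numbers from the trace.
This is only a sketch, not kernel code: add_as_32bit_long() is a
hypothetical helper, int32_t stands in for a 32-bit long, and the trace
is read as printing tv_nsec before the addition (the arithmetic is
consistent with that reading, since 0xe1f9ae3 + 0x7a122600 = 0x8831c0e3).

```c
#include <assert.h>
#include <stdint.h>

/*
 * Userspace illustration only. int32_t models a 32-bit long.
 * The sum is done in uint32_t so the wrap is well defined in C;
 * converting the result back to int32_t assumes the usual
 * two's-complement behaviour of GCC and Clang.
 */
static int32_t add_as_32bit_long(int32_t tv_nsec, uint32_t shifted_interval)
{
	assert(tv_nsec >= 0);	/* a normalized tv_nsec is never negative */
	return (int32_t)((uint32_t)tv_nsec + shifted_interval);
}
```

Calling add_as_32bit_long(0x0e1f9ae3, 0x7a122600) yields a negative
value, so the "while (raw_time.tv_nsec >= NSEC_PER_SEC)" loop never runs
and the negative tv_nsec is left in place.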

A simple solution is to accumulate into a u64, and then normalize it
to a timespec_t.
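
As a rough userspace sketch of that approach (struct raw_ts and
accumulate_raw() are hypothetical names used for illustration, with
int32_t fields standing in for a 32-bit timespec), the sum is held in a
64-bit unsigned value until it has been reduced below one second:

```c
#include <assert.h>
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL

/* Hypothetical stand-in for a timespec on a 32-bit arch. */
struct raw_ts {
	int32_t tv_sec;
	int32_t tv_nsec;
};

/* Add a shifted interval (in ns) to ts without overflowing tv_nsec. */
static void accumulate_raw(struct raw_ts *ts, uint64_t shifted_interval)
{
	uint64_t raw_nsecs = shifted_interval;

	assert(ts->tv_nsec >= 0);	/* input is expected to be normalized */
	raw_nsecs += (uint64_t)ts->tv_nsec;

	/* Carry whole seconds out of the 64-bit accumulator. */
	while (raw_nsecs >= NSEC_PER_SEC) {
		raw_nsecs -= NSEC_PER_SEC;
		ts->tv_sec++;
	}

	/* raw_nsecs is now below NSEC_PER_SEC, so it fits in 32 bits. */
	ts->tv_nsec = (int32_t)raw_nsecs;
}
```

With the values from the trace (tv_nsec 0xe1f9ae3, interval 0x7a122600)
this carries two seconds and leaves 284962019 ns in tv_nsec.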

Signed-off-by: Jason Wessel <jason.wessel(a)windriver.com>

Reworked variable names and simplified some of the code.

Signed-off-by: John Stultz <johnstul(a)us.ibm.com>

CC: Thomas Gleixner <tglx(a)linutronix.de>
CC: H. Peter Anvin <hpa(a)zytor.com>
---
kernel/time/timekeeping.c | 11 +++++++----
1 files changed, 7 insertions(+), 4 deletions(-)

diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
index caf8d4d..6603860 100644
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -736,6 +736,7 @@ static void timekeeping_adjust(s64 offset)
 static cycle_t logarithmic_accumulation(cycle_t offset, int shift)
 {
 	u64 nsecps = (u64)NSEC_PER_SEC << timekeeper.shift;
+	u64 raw_nsecs;

 	/* If the offset is smaller then a shifted interval, do nothing */
 	if (offset < timekeeper.cycle_interval<<shift)
@@ -752,12 +753,14 @@ static cycle_t logarithmic_accumulation(cycle_t offset, int shift)
 		second_overflow();
 	}

-	/* Accumulate into raw time */
-	raw_time.tv_nsec += timekeeper.raw_interval << shift;;
-	while (raw_time.tv_nsec >= NSEC_PER_SEC) {
-		raw_time.tv_nsec -= NSEC_PER_SEC;
+	/* Accumulate raw time */
+	raw_nsecs = timekeeper.raw_interval << shift;
+	raw_nsecs += raw_time.tv_nsec;
+	while (raw_nsecs >= NSEC_PER_SEC) {
+		raw_nsecs -= NSEC_PER_SEC;
 		raw_time.tv_sec++;
 	}
+	raw_time.tv_nsec = raw_nsecs;

 	/* Accumulate error between NTP and clock interval */
 	timekeeper.ntp_error += tick_length << shift;
--
1.6.0.4



--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Jason Wessel on
On 08/05/2010 05:17 PM, john stultz wrote:
> On Thu, 2010-08-05 at 07:28 -0500, Jason Wessel wrote:
>> The tv_nsec is a long and when added to the shift value it can wrap
>> and become negative which later causes looping problems in the
>> getrawmonotonic(). The edge case occurs when the system has slept for
>> a short period of time of ~2 seconds.
>
> Ah, good catch!
>
> I reworked some of the variable names to make a little more sense and
> simplified the accumulation. Do you mind giving this a test in your
> environment that triggered the issue to make sure nothing else slipped
> in?
>


No problem.


This looks good to me. I even increased the delay and I can see it recovers properly.

The instrumentation shows raw_nsecs would have otherwise been negative going from 90.* to 97.* in the log.

<...>-4801 [000] 90.105084: update_wall_time: raw_nsecs: 37283ea1
<...>-4801 [000] 90.109078: update_wall_time: raw_nsecs: 376547b4
<...>-4801 [000] 97.694264: update_wall_time: raw_nsecs: b1776db4
<...>-4801 [000] 97.694270: update_wall_time: raw_nsecs: b453ffb4
<...>-4801 [000] 97.694272: update_wall_time: raw_nsecs: 7b95c7b4

Note that I had instrumented it just after:

raw_nsecs += raw_time.tv_nsec;

We should send this over to -stable once it is considered baked, because this was found in 2.6.35 and may be a problem elsewhere as well.

Thanks,
Jason.

