From: Esben Nielsen on


On Tue, 8 May 2007, Peter Williams wrote:

> Esben Nielsen wrote:
>>
>>
>> On Sun, 6 May 2007, Linus Torvalds wrote:
>>
>> >
>> >
>> > On Sun, 6 May 2007, Ingo Molnar wrote:
>> > >
>> > > * Linus Torvalds <torvalds(a)linux-foundation.org> wrote:
>> > >
>> > > > So the _only_ valid way to handle timers is to
>> > > > - either not allow wrapping at all (in which case "unsigned" is
>> > > > better,
>> > > > since it is bigger)
>> > > > - or use wrapping explicitly, and use unsigned arithmetic (which is
>> > > > well-defined in C) and do something like "(long)(a-b) > 0".
>> > >
>> > > hm, there is a corner-case in CFS where a fix like this is necessary.
>> > >
>> > > CFS uses 64-bit values for almost everything, and the majority of
>> > > values
>> > > are of 'relative' nature with no danger of overflow. (They are signed
>> > > because they are relative values that center around zero and can be
>> > > negative or positive.)
>> >
>> > Well, I'd like to just worry about that for a while.
>> >
>> > You say there is "no danger of overflow", and I mostly agree that once
>> > we're talking about 64-bit values, the overflow issue simply doesn't
>> > exist, and furthermore the difference between 63 and 64 bits is not
>> > really
>> > relevant, so there's no major reason to actively avoid signed entries.
>> >
>> > So in that sense, it all sounds perfectly sane. And I'm definitely not
>> > sure your "292 years after bootup" worry is really worth even
>> > considering.
>> >
>>
>> I would hate to tell mission control for Mankind's first mission to
>> another
>> star to reboot every 200 years because "there is no need to worry about
>> it."
>>
>> As a matter of principle an OS should never need a reboot (with exception
>> for upgrading). If you say you have to reboot every 200 years, why not
>> every 100? Every 50? .... Every 45 days (you know what I am referring to
>> :-) ?
>
> There's always going to be an upper limit on the representation of time.
> At least until we figure out how to implement infinity properly.

Well you need infinite memory for that :-)
But that is my point: Why go into the problem of storing absolute time
when you can use relative time?


>
>>
>> > When we're really so well off that we expect the hardware and software
>> > stack to be stable over a hundred years, I'd start to think about issues
>> > like that, in the meantime, to me worrying about those kinds of issues
>> > just means that you're worrying about the wrong things.
>> >
>> > BUT.
>> >
>> > There's a fundamental reason relative timestamps are difficult and
>> > almost
>> > always have overflow issues: the "long long in the future" case as an
>> > approximation of "infinite timeout" is almost always relevant.
>> >
>> > So rather than worry about the system staying up 292 years, I'd worry
>> > about whether people pass in big numbers (like some MAX_S64
>> > approximation)
>> > as an approximation for "infinite", and once you have things like that,
>> > the "64 bits never overflows" argument is totally bogus.
>> >
>> > There's a damn good reason for using only *absolute* time. The whole
>> > "signed values of relative time" may _sound_ good, but it really sucks
>> > in
>> > subtle and horrible ways!
>> >
>>
>> I think you are wrong here. The only place you need absolute time is for
>> the clock (CLOCK_REALTIME). You waste CPU using a 64 bit
>> representation when you could have used a 32 bit. With a 32 bit
>> implementation you are forced to handle the corner cases with wrap around
>> and too big arguments up front. With a 64 bit you hide those problems.
>
> As does the other method. A 32 bit signed offset with a 32 bit base is just
> a crude version of 64 bit absolute time.

64 bit is also relative - just over a much longer period.
A 32 bit signed offset is relative - and you know it. But with 64 bit, people
think it is absolute and put in large values, as Linus said above. With 32
bit, future developers will know it is relative and code for it. And they
will get their corner cases tested, because the code will soon run
into those corners.
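
A minimal sketch of the explicit-wrapping comparison mentioned above
("(long)(a-b) > 0"), applied to a 32 bit counter. The helper name
time_after32 is hypothetical and not taken from CFS. It assumes the two
timestamps are less than 2^31 ticks apart, and that converting an
out-of-range value to int32_t wraps (implementation-defined in C, but that
is what gcc does).

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: true if 32 bit timestamp a is later than b.
 * Only meaningful while a and b are less than 2^31 ticks apart. */
static inline bool time_after32(uint32_t a, uint32_t b)
{
    /* Unsigned subtraction wraps modulo 2^32, which is well-defined in C.
     * Reinterpreting the difference as signed gives its direction.
     * The conversion to int32_t is implementation-defined for values
     * above INT32_MAX; this sketch assumes it wraps. */
    return (int32_t)(a - b) > 0;
}
```

With this, time_after32(5, 0xFFFFFFF0) is true even though 5 is numerically
smaller: the counter wrapped between the two samples.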

>
>>
>> I think CFS would be best off using a 32 bit timer counting in
>> microseconds. That would wrap around in 72 minutes. But as the timers are
>> relative you will never be able to specify a timer larger than 36 minutes
>> in the future. But 36 minutes is ridiculously long for a scheduler and a
>> simple test limiting time values to that value would not break anything.
>
> Except if you're measuring sleep times. I think that you'll find lots of
> tasks sleep for more than 72 minutes.

I don't think those large values will be relevant. You can easily cut
off sleep times at 30 min or even 1 min. But you need to detect that the
task has indeed been sleeping 2^32+1 usec and not 1 usec. You can't do
that with a 32 bit timer alone. In that case you need to use an (at least)
64 bit timer - which is needed in the OS anyway. But not internally in the
wait queue, where the repeated calculations are.
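
A rough sketch of that split, under stated assumptions: the sleep start and
the current time come from a 64 bit microsecond clock, and only the capped
result is kept in 32 bits. The names capped_sleep_us and SLEEP_CAP_US are
hypothetical, and the 30 minute cap is just the figure mentioned above, not
a value from any scheduler.

```c
#include <stdint.h>

/* Assumed cap: 30 minutes in microseconds (1,800,000,000), below 2^31. */
#define SLEEP_CAP_US (30u * 60u * 1000000u)

/* Hypothetical helper: sleep duration computed from 64 bit timestamps,
 * then clamped so the result fits a 32 bit value. */
static inline uint32_t capped_sleep_us(uint64_t sleep_start_us,
                                       uint64_t now_us)
{
    uint64_t slept = now_us - sleep_start_us;

    /* A sleep of 2^32+1 usec is seen as "long" here; truncating the
     * timestamps to 32 bits first would have reported 1 usec. */
    return slept > SLEEP_CAP_US ? SLEEP_CAP_US : (uint32_t)slept;
}
```

The 64 bit subtraction happens once per wakeup, so the repeated
comparisons inside the queue can stay 32 bit.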

Esben


>
> Peter
> --
> Peter Williams pwil3058(a)bigpond.net.au
>
> "Learning, n. The kind of ignorance distinguishing the studious."
> -- Ambrose Bierce
>
>
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Johannes Stezenbach on
On Tue, May 08, 2007, Esben Nielsen wrote:
>
> This is contrary to C99 standard annex H.2.2
> (http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1124.pdf):
>
> "An implementation that defines signed integer types as also being modulo
> need
> not detect integer overflow, in which case, only integer divide-by-zero need
> be detected."
>
> So if it doesn't properly define wrapping it has to detect integer
> overflow, right?

No. Annex H (informative!) only talks about LIA-1 conformance.

C99 isn't LIA-1 conformant. H2.2 describes what an implementation
might do to make signed integers LIA-1 compatible, which is
what gcc does with -fwrapv or -ftrapv.

At least that's how I understand it, the C99 standard
seems to have been written with the "it was hard to
write, so it should be hard to read" mindset. :-/

I still don't know _why_ signed integer overflow behaviour
isn't defined in C. It just goes against everyone's expectations
and thus causes bugs.
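
To illustrate the point, here is a small sketch of how code can stay clear
of the undefined case without relying on -fwrapv or -ftrapv. Both function
names are hypothetical. The second helper assumes that converting an
out-of-range unsigned value back to int wraps, which is
implementation-defined in C (gcc documents it as modulo 2^N).

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical helper: reports whether a + b would overflow int,
 * without ever performing the overflowing signed addition. */
static bool add_would_overflow(int a, int b)
{
    if (b > 0)
        return a > INT_MAX - b;
    if (b < 0)
        return a < INT_MIN - b;
    return false;
}

/* Hypothetical helper: wrapping add done in unsigned arithmetic,
 * where overflow is well-defined. The final conversion back to int
 * is implementation-defined; this sketch assumes it wraps. */
static int wrapping_add(int a, int b)
{
    return (int)((unsigned int)a + (unsigned int)b);
}
```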


Johannes
From: Esben Nielsen on


On Tue, 8 May 2007, Johannes Stezenbach wrote:

> On Tue, May 08, 2007, Esben Nielsen wrote:
>>
>> This is contrary to C99 standard annex H.2.2
>> (http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1124.pdf):
>>
>> "An implementation that defines signed integer types as also being modulo
>> need
>> not detect integer overflow, in which case, only integer divide-by-zero need
>> be detected."
>>
>> So if it doesn't properly define wrapping it has to detect integer
>> overflow, right?
>
> No. Annex H (informative!) only talks about LIA-1 conformance.
>
> C99 isn't LIA-1 conformant. H2.2 describes what an implementation
> might do to make signed integers LIA-1 compatible.

"The signed C integer types int, long int, long long int, and the
corresponding unsigned types are compatible with LIA-1."

I read this as saying that any C99 implementation must be compatible. I
would like to see LIA-1 to check.


>, which is
> what gcc does with -fwrapv or -ftrapv.
>

Yes, either or: Either wrap or trap.

> At least that's how I understand it, the C99 standard
> seems to have been written with the "it was hard to
> write, so it should be hard to read" mindset. :-/
>
> I still don't know _why_ signed integer overflow behaviour
> isn't defined in C. It just goes against everyone's expectations
> and thus causes bugs.

Because it is hard to make wrapping work on non-two's-complement
architectures. Then it is easier to trap.

Esben

>
>
> Johannes
>
From: Peter Williams on
Esben Nielsen wrote:
>
>
> On Tue, 8 May 2007, Peter Williams wrote:
>
>> Esben Nielsen wrote:
>>>
>>>
>>> On Sun, 6 May 2007, Linus Torvalds wrote:
>>>
>>> >
>>> >
>>> > On Sun, 6 May 2007, Ingo Molnar wrote:
>>> > >
>>> > > * Linus Torvalds <torvalds(a)linux-foundation.org> wrote:
>>> > >
>>> > > > So the _only_ valid way to handle timers is to
>>> > > > - either not allow wrapping at all (in which case "unsigned" is
>>> > > > better, since it is bigger)
>>> > > > - or use wrapping explicitly, and use unsigned arithmetic (which is
>>> > > > well-defined in C) and do something like "(long)(a-b) > 0".
>>> > >
>>> > > hm, there is a corner-case in CFS where a fix like this is necessary.
>>> > >
>>> > > CFS uses 64-bit values for almost everything, and the majority of
>>> > > values are of 'relative' nature with no danger of overflow. (They are
>>> > > signed because they are relative values that center around zero and
>>> > > can be negative or positive.)
>>> >
>>> > Well, I'd like to just worry about that for a while.
>>> >
>>> > You say there is "no danger of overflow", and I mostly agree that once
>>> > we're talking about 64-bit values, the overflow issue simply doesn't
>>> > exist, and furthermore the difference between 63 and 64 bits is not
>>> > really relevant, so there's no major reason to actively avoid signed
>>> > entries.
>>> >
>>> > So in that sense, it all sounds perfectly sane. And I'm definitely not
>>> > sure your "292 years after bootup" worry is really worth even
>>> > considering.
>>> >
>>>
>>> I would hate to tell mission control for Mankind's first mission to
>>> another
>>> star to reboot every 200 years because "there is no need to worry about
>>> it."
>>>
>>> As a matter of principle an OS should never need a reboot (with
>>> exception
>>> for upgrading). If you say you have to reboot every 200 years, why not
>>> every 100? Every 50? .... Every 45 days (you know what I am
>>> referring to
>>> :-) ?
>>
>> There's always going to be an upper limit on the representation of time.
>> At least until we figure out how to implement infinity properly.
>
> Well you need infinite memory for that :-)
> But that is my point: Why go into the problem of storing absolute time
> when you can use relative time?

I'd reverse that and say "Why go to the bother of using relative time
when you can use absolute time?". Absolute time being time since boot,
of course.

>
>
>>
>>>
>>> > When we're really so well off that we expect the hardware and software
>>> > stack to be stable over a hundred years, I'd start to think about issues
>>> > like that, in the meantime, to me worrying about those kinds of issues
>>> > just means that you're worrying about the wrong things.
>>> >
>>> > BUT.
>>> >
>>> > There's a fundamental reason relative timestamps are difficult and
>>> > almost always have overflow issues: the "long long in the future" case
>>> > as an approximation of "infinite timeout" is almost always relevant.
>>> >
>>> > So rather than worry about the system staying up 292 years, I'd worry
>>> > about whether people pass in big numbers (like some MAX_S64
>>> > approximation) as an approximation for "infinite", and once you have
>>> > things like that, the "64 bits never overflows" argument is totally
>>> > bogus.
>>> >
>>> > There's a damn good reason for using only *absolute* time. The whole
>>> > "signed values of relative time" may _sound_ good, but it really sucks
>>> > in subtle and horrible ways!
>>> >
>>>
>>> I think you are wrong here. The only place you need absolute time is
>>> for the clock (CLOCK_REALTIME). You waste CPU using a 64 bit
>>> representation when you could have used a 32 bit. With a 32 bit
>>> implementation you are forced to handle the corner cases with wrap
>>> around
>>> and too big arguments up front. With a 64 bit you hide those problems.
>>
>> As does the other method. A 32 bit signed offset with a 32 bit base
>> is just a crude version of 64 bit absolute time.
>
> 64 bit is also relative - just over a much longer period.

Yes, relative to boot.

> A 32 bit signed offset is relative - and you know it. But with 64 bit,
> people think it is absolute and put in large values, as Linus said above.

What people? Who gets to feed times into the scheduler? Isn't it just
using the time as determined by the system?

> With
> 32 bit, future developers will know it is relative and code for it. And
> they will get their corner cases tested, because the code will soon run
> into those corners.
>
>>
>>>
>>> I think CFS would be best off using a 32 bit timer counting in
>>> microseconds. That would wrap around in 72 minutes. But as the timers are
>>> relative you will never be able to specify a timer larger than 36
>>> minutes in the future. But 36 minutes is ridiculously long for a
>>> scheduler and a simple test limiting time values to that value would not
>>> break anything.
>>
>> Except if you're measuring sleep times. I think that you'll find lots
>> of tasks sleep for more than 72 minutes.
>
> I don't think those large values will be relevant. You can easily cut
> off sleep times at 30 min or even 1 min.

The aim is to make the code as simple as possible, not to add this kind of
rubbish, and 1 minute would be far too low.

> But you need to detect that the
> task has indeed been sleeping 2^32+1 usec and not 1 usec. You can't do
> that with a 32 bit timer alone. In that case you need to use an (at least)
> 64 bit timer - which is needed in the OS anyway. But not internally in the
> wait queue, where the repeated calculations are.
>

Peter
--
Peter Williams pwil3058(a)bigpond.net.au

"Learning, n. The kind of ignorance distinguishing the studious."
-- Ambrose Bierce
From: Pavel Machek on
Hi!

> >>You say there is "no danger of overflow", and I mostly
> >>agree that once
> >>we're talking about 64-bit values, the overflow issue
> >>simply doesn't
> >>exist, and furthermore the difference between 63 and
> >>64 bits is not really
> >>relevant, so there's no major reason to actively avoid
> >>signed entries.
> >>
> >>So in that sense, it all sounds perfectly sane. And
> >>I'm definitely not
> >>sure your "292 years after bootup" worry is really
> >>worth even considering.
> >>
> >
> >I would hate to tell mission control for Mankind's
> >first mission to another
> >star to reboot every 200 years because "there is no
> >need to worry about it."
> >
> >As a matter of principle an OS should never need a
> >reboot (with exception for upgrading). If you say you
> >have to reboot every 200 years, why not every 100?
> >Every 50? .... Every 45 days (you know what I am
> >referring to :-) ?
>
> There's always going to be an upper limit on the
> representation of time. At least until we figure out
> how to implement infinity properly.

There's also an upper limit on the lifetime of this universe. 1000 bits is
certainly enough to represent that in microseconds.

Also notice that current CPUs were not designed to work for 300 years.
When we have hw designed for 50+ years, we can start to worry.
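
For scale, the wrap periods quoted in this thread can be checked with plain
integer arithmetic. This is only a sketch: the helper name
wrap_period_seconds is hypothetical, and the year length of 31,557,600
seconds (365.25 days) is an assumption made for the conversion.

```c
#include <stdint.h>

/* Assumed year length: 365.25 days, in seconds. */
#define SECONDS_PER_YEAR UINT64_C(31557600)

/* Hypothetical helper: whole seconds until a counter with the given
 * number of value bits wraps, at the given tick rate.
 * bits must be in the range 1..63. */
static uint64_t wrap_period_seconds(unsigned bits, uint64_t ticks_per_second)
{
    return (UINT64_C(1) << bits) / ticks_per_second;
}
```

A 32 bit microsecond counter gives wrap_period_seconds(32, 1000000), which
is 4294 seconds, or about 72 minutes. A signed 64 bit nanosecond counter
has 63 value bits, and wrap_period_seconds(63, 1000000000) divided by
SECONDS_PER_YEAR comes to 292 years.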

Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html