From: john stultz on
On Wed, 2009-11-04 at 13:28 -0800, Dan Magenheimer wrote:
> > From: john stultz [mailto:johnstul(a)us.ibm.com]
> > On Thu, Oct 29, 2009 at 7:07 AM, Avi Kivity <avi(a)redhat.com> wrote:
> > >
> > > Out of interest, do you know (and can you relate) why those apps need
> > > 100k/sec monotonically increasing timestamps?
> >
> > This is sort of tangential, but depending on the need, this might be
> > of interest: Recently I've added a new clock_id,
> > CLOCK_MONOTONIC_COARSE (as well as CLOCK_REALTIME_COARSE), which
> > return a HZ granular timestamp (same granularity as filesystem
> > timestamps). It's very fast to access, since there's no hardware to
> > touch, and is accessible via vsyscall.
> >
> > The idea being, if you're hitting clock_gettime 100k/sec but you really
> > don't have the need for nsec granular timestamps, it might provide a
> > really nice performance boost.
> >
> > Here's the commit:
>
> Hi John --
>
> Yes, possibly of interest. But does it work with CONFIG_NO_HZ?
> (I'm expecting that over time NO_HZ will become widespread
> for VM OS's, though I'm interested in whether you agree.)

It should work. With CONFIG_NO_HZ, as soon as we come out of a long idle
(likely due to a timer tick), the timekeeping code will accumulate all
the skipped ticks.

If we ever get to non-idle NOHZ, we'll need some extra work here
(probably lazy accumulation done conditionally in the read path), but
that's also true for filesystem timestamps.
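
A minimal sketch of how an application might try the coarse clocks
described above, assuming a Linux system whose kernel and C library
expose CLOCK_MONOTONIC_COARSE. The helper names read_clock_ns and
clock_resolution_ns are illustrative only and are not part of the
commit:

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Illustrative helper: read a clock, return nanoseconds or -1 on error. */
int64_t read_clock_ns(clockid_t id)
{
    struct timespec ts;

    if (clock_gettime(id, &ts) != 0)
        return -1;
    return (int64_t)ts.tv_sec * 1000000000LL + ts.tv_nsec;
}

/* Illustrative helper: advertised clock resolution in ns, or -1 on error. */
int64_t clock_resolution_ns(clockid_t id)
{
    struct timespec res;

    if (clock_getres(id, &res) != 0)
        return -1;
    return (int64_t)res.tv_sec * 1000000000LL + res.tv_nsec;
}
```

Calling read_clock_ns(CLOCK_MONOTONIC_COARSE) in a hot path trades
resolution for speed: two back-to-back reads may return the same value,
because the timestamp only advances when the timekeeping code
accumulates a tick. Comparing clock_resolution_ns() for CLOCK_MONOTONIC
and CLOCK_MONOTONIC_COARSE shows the granularity on a given machine.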


> Also very interested in your thoughts about a variation
> that returns something similar to a TSC_AUX to notify
> caller that the underlying reference clock has/may have
> changed.

I haven't been following that closely. Personally, experience makes me
skeptical of workarounds for unsynced TSCs. But I'm sure there are sharper
folks out there that might make it work. The kernel just requires that
it *really really* works, and not "mostly" works. :)

thanks
-john




--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Dan Magenheimer on
> > Yes, possibly of interest. But does it work with CONFIG_NO_HZ?
> > (I'm expecting that over time NO_HZ will become widespread
> > for VM OS's, though I'm interested in whether you agree.)
>
> It should work. With CONFIG_NO_HZ, as soon as we come out of a long idle
> (likely due to a timer tick), the timekeeping code will accumulate all
> the skipped ticks.
>
> If we ever get to non-idle NOHZ, we'll need some extra work here
> (probably lazy accumulation done conditionally in the read path), but
> that's also true for filesystem timestamps.

OK, sounds good.

> > Also very interested in your thoughts about a variation
> > that returns something similar to a TSC_AUX to notify
> > caller that the underlying reference clock has/may have
> > changed.
>
> I haven't been following that closely. Personally, experience makes me
> skeptical of workarounds for unsynced TSCs. But I'm sure there are sharper
> folks out there that might make it work. The kernel just requires that
> it *really really* works, and not "mostly" works. :)

This is less a workaround for unsynced TSCs than it is for VM migration
(and maybe also for times when a VM is out-of-context or moved to a
different pcpu), though it could probably be made to work on unsynced
TSC boxes also. Basically, an application needing hi-res profiling
info would do:

    nsec1 = clock_gettime2(MONOTONIC, &aux1);
    /* (time passes) */
    nsec2 = clock_gettime2(MONOTONIC, &aux2);
    if (aux1 != aux2)
        discard_measurement();
    else
        use_measurement(nsec2 - nsec1);

and system software (hypervisor or kernel or both) is responsible for
ensuring that the aux value monotonically increases whenever a
different crystal is used.

Without something like this as a vsyscall,
apps will just use rdtscp (which must be emulated
to work properly across a migration).
From: Keir Fraser on
On 05/11/2009 14:52, "Dan Magenheimer" <dan.magenheimer(a)oracle.com> wrote:

> Well, all this discussion has convinced me that
> my original proposals do make sense

You surprise me, Dan. ;-)

-- Keir

