From: john stultz on
On Wed, Feb 24, 2010 at 4:28 AM, Alexander Gordeev
<lasaine(a)lvk.cs.msu.su> wrote:
> This patchset is tested against the vanilla 2.6.32.9 kernel. But we are
> actually using it on 2.6.31.12-rt20 rt-preempt kernel most of the time.
> Also there is a version which should be applied on top of LinuxPPS out
> of tree patches (i.e. all clients and low-level irq timestamps stuff).
> Those who are interested in other versions of the patchset can find
> them in my git repository:
> http://lvk.cs.msu.su/~lasaine/timesync/linux-2.6-timesync.git
>
> There is one problem, however: hardpps() works badly when used on top
> of 2.6.33-rc* with CONFIG_NO_HZ enabled. The reason for this is commit
> a092ff0f90cae22b2ac8028ecd2c6f6c1a9e4601. Without it hardpps() is able
> to sync to 1us precision in about 10 seconds. With it

Uh. Not sure I see right off why the logarithmic time accumulation
would give you trouble. It's actually there to try to fix a couple of
NTP issues that cropped up when the accumulation interval was pushed
out to 2HZ with CONFIG_NO_HZ.
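
For readers unfamiliar with the idea, logarithmic accumulation can be
sketched roughly as follows. This is a simplified user-space
illustration, not the kernel's code: the function name accumulate_log()
and its parameters are hypothetical, and it assumes the interval is
small enough that shifting it left by up to 30 bits does not overflow.
The pending time is consumed in power-of-two multiples of the base
interval, so a long idle period costs only a handful of steps while
every base interval is still accounted for.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Consume *offset in chunks of (interval << shift), starting from the
 * largest chunk that fits and halving the chunk whenever it no longer
 * fits. *intervals is increased by the number of base intervals
 * consumed; the return value is the number of chunks accumulated.
 */
static int accumulate_log(uint64_t *offset, uint64_t interval,
			  uint64_t *intervals)
{
	int shift = 0;
	int steps = 0;

	assert(interval > 0);

	/* Largest shift with (interval << shift) <= *offset, capped at 30. */
	while (shift < 30 && (interval << (shift + 1)) <= *offset)
		shift++;

	while (*offset >= interval) {
		uint64_t chunk = interval << shift;

		if (*offset >= chunk) {
			*offset -= chunk;
			*intervals += (uint64_t)1 << shift;
			steps++;
		} else {
			/* chunk > *offset >= interval implies shift > 0 here */
			shift--;
		}
	}
	return steps;
}
```

With an interval of 10 and 10003 units pending, this accounts for 1000
base intervals in 6 chunks and leaves a remainder of 3, where a
one-interval-at-a-time loop would need 1000 iterations.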

Do you have any extra insight here as to what's going on with your
code? The only thing I could guess would be that second_overflow() is
happening closer to the actual overflow, but maybe less regularly? But
again, I'm not sure how this would be drastically different than
before with the 2HZ accumulation period.

thanks
-john
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Alexander Gordeev on
Hi John,

Sorry for the delay...

On Mon, 8 Mar 2010 19:25:07 -0800
john stultz <johnstul(a)us.ibm.com> wrote:

> Uh. Not sure I see right off why the logarithmic time accumulation
> would give you trouble. It's actually there to try to fix a couple of
> NTP issues that cropped up when the accumulation interval was pushed
> out to 2HZ with CONFIG_NO_HZ.

Yes, I know. I guess (based on the commit log and other sources) that
this change was added to make chrony work better on tickless kernels.
So chrony corrects the time using only frequency corrections?
My approach is different: use time_offset to remove the phase error and
adjust time_freq to remove the frequency error.

> Do you have any extra insight here as to what's going on with your
> code? The only thing I could guess would be that second_overflow() is
> happening closer to the actual overflow, but maybe less regularly? But
> again, I'm not sure how this would be drastically different than
> before with the 2HZ accumulation period.

I still haven't figured this out (partially because I'm too busy with
other tasks). The new code seems ok to me. time_offset is added at
second_overflow as usual. Maybe the problem is with the frequency
correction. I'm going to run some tests that should show where the
problem is: in the phase or the freq correction.

I hope I'll have time for this next week.

--
Alexander
From: john stultz on
On Mon, 2010-03-22 at 23:42 +0300, Alexander Gordeev wrote:
> Yes, I know. I guess (based on the commit log and other sources) that
> this change was added to make chrony work better on tickless kernels.
> So chrony corrects the time using only frequency corrections?

I'm not super familiar with chrony, but from talking with folks who work
on it, it doesn't use the kernel PLL, and instead uses the old oneshot
mode.

So yea, the logarithmic accumulation did help chrony as well as other
users of adjtimex that expected to be able to make HZ-granular
adjustments.


> My approach is different: use time_offset to remove the phase error and
> adjust time_freq to remove the frequency error.

This sounds much closer to how NTP does it.
So are you using the kernel PLL, or just singleshot for the offset?


> > Do you have any extra insight here as to what's going on with your
> > code? The only thing I could guess would be that second_overflow() is
> > happening closer to the actual overflow, but maybe less regularly? But
> > again, I'm not sure how this would be drastically different than
> > before with the 2HZ accumulation period.
>
> I still haven't figured this out (partially because I'm too busy with
> other tasks). The new code seems ok to me. time_offset is added at
> second_overflow as usual. Maybe the problem is with the frequency
> correction. I'm going to run some tests that should show where the
> problem is: in the phase or the freq correction.

Yea, you can play with things on the kernel side by setting
NTP_INTERVAL_FREQ to 2 in include/linux/timex.h to move back to the
coarser granularity (while preserving the logarithmic accumulation).
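
To make "coarser granularity" concrete: as far as I understand the
header, the length of one NTP accumulation interval is derived as
NSEC_PER_SEC / NTP_INTERVAL_FREQ. The small helper below is only an
illustration of that arithmetic under this assumption; the function
name ntp_interval_length_ns() is hypothetical and does not exist in
the kernel.

```c
#include <assert.h>
#include <stdint.h>

#define NSEC_PER_SEC 1000000000LL

/*
 * Length in nanoseconds of one accumulation interval for a given
 * interval frequency (intervals per second). Assumes the relation
 * length = NSEC_PER_SEC / frequency.
 */
static int64_t ntp_interval_length_ns(int64_t interval_freq)
{
	assert(interval_freq > 0);
	return NSEC_PER_SEC / interval_freq;
}
```

Under that assumption, a frequency of 2 gives 500000000 ns (half a
second) per interval, while a frequency equal to HZ=1000 gives
1000000 ns (one millisecond).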

Do let me know if you find anything here.

thanks
-john

From: Alexander Gordeev on
On Mon, 22 Mar 2010 14:01:50 -0700
john stultz <johnstul(a)us.ibm.com> wrote:

> On Mon, 2010-03-22 at 23:42 +0300, Alexander Gordeev wrote:
> > My approach is different: use time_offset to remove the phase error
> > and adjust time_freq to remove the frequency error.
>
> This sounds much closer to how NTP does it.
> So are you using the kernel PLL, or just singleshot for the offset?

It's very close to a singleshot adjustment, but it uses time_offset
instead of time_adjust. This code does the trick:

static inline s64 ntp_offset_chunk(s64 offset)
{
	if (time_status & STA_PPSTIME && time_status & STA_PPSSIGNAL)
		return offset;
	else
		return shift_right(offset, SHIFT_PLL + time_constant);
}

So when the kernel consumer is active, the whole time_offset is applied
immediately.
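
The difference between the two branches can be seen in a small
user-space simulation. This is only a sketch under stated assumptions:
offset_chunk() and seconds_to_converge() are hypothetical names, the
values SHIFT_PLL = 4 and time_constant = 2 are picked for illustration,
offsets are plain nanoseconds rather than the scaled units the kernel
uses, and one loop iteration stands in for one second_overflow().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SHIFT_PLL	4	/* assumed value, for illustration only */
#define TIME_CONSTANT	2	/* assumed value, for illustration only */

/* Right shift that is symmetric for negative values. */
static int64_t shift_right(int64_t x, int s)
{
	return x < 0 ? -((-x) >> s) : x >> s;
}

/* Portion of the remaining offset applied in one simulated second. */
static int64_t offset_chunk(int64_t offset, bool pps_active)
{
	if (pps_active)
		return offset;	/* whole offset at once */
	return shift_right(offset, SHIFT_PLL + TIME_CONSTANT);
}

/* Simulated seconds until |offset| drops below limit_ns. */
static int seconds_to_converge(int64_t offset_ns, int64_t limit_ns,
			       bool pps_active)
{
	int seconds = 0;

	assert(limit_ns > 0);
	while ((offset_ns < 0 ? -offset_ns : offset_ns) >= limit_ns) {
		int64_t chunk = offset_chunk(offset_ns, pps_active);

		if (chunk == 0)
			break;	/* offset too small to shrink further */
		offset_ns -= chunk;
		seconds++;
	}
	return seconds;
}
```

With these assumed values, a 1 ms offset falls below 1 us after a
single simulated second when the PPS branch is taken, while the PLL
branch removes only 1/64 of the remaining offset per second and needs
several hundred seconds.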

> > > Do you have any extra insight here as to what's going on with your
> > > code? The only thing I could guess would be that second_overflow()
> > > is happening closer to the actual overflow, but maybe less
> > > regularly? But again, I'm not sure how this would be drastically
> > > different than before with the 2HZ accumulation period.
> >
> > I still haven't figured this out (partially because I'm too busy
> > with other tasks). The new code seems ok to me. time_offset is added
> > at second_overflow as usual. Maybe the problem is with the frequency
> > correction. I'm going to run some tests that should show where the
> > problem is: in the phase or the freq correction.
>
> Yea, you can play with things on the kernel side by setting
> NTP_INTERVAL_FREQ to 2 in include/linux/timex.h to move back to the
> coarser granularity (while preserving the logarithmic accumulation).

Great, thanks for the pointer! Didn't think of it.

> Do let me know if you find anything here.

Sure!

--
Alexander