From: kenney on
In article
<0db80478-326d-4b55-b6bd-33d75a811166(a)36g2000yqu.googlegroups.com>,
robertwessel2(a)yahoo.com () wrote:

>
> running on separate cores can't tell that the order of time values
> stored is actually slightly out of sync across the machine or
> cluster.

However, nowadays there are external time sources that are accurate to
milliseconds and guaranteed to be unique. A trivial example of their use
is the self-adjusting radio clock. I doubt that implementing the use
of the time signal would be easier than anything suggested so far, but
each cluster could have its own time source with no synchronisation
problems.

Ken Young
From: Morten Reistad on
In article <rfKdncKLWbTF2sXWnZ2dnUVZ7qydnZ2d(a)giganews.com>,
<kenney(a)cix.compulink.co.uk> wrote:
>In article
><0db80478-326d-4b55-b6bd-33d75a811166(a)36g2000yqu.googlegroups.com>,
>robertwessel2(a)yahoo.com () wrote:
>
>>
>> running on separate cores can't tell that the order of time values
>> stored is actually slightly out of sync across the machine or
>> cluster.
>
> However nowadays there are external time sources that are accurate to
>milliseconds and guaranteed to be unique. A trivial example of their use
>is the self-adjusting radio clock. I doubt that implementing the use
>of the time signal would be easier than anything suggested so far, but
>each cluster could have its own time source with no synchronisation
>problems.

And you can synchronise clocks on different processors by having
simple counters incremented by pulses on a wire, from a coherent
source; just see to it that the delay on the wire and electronics
is stable, and the wires are all the same length; and that a timebase
for the counters can be established.

Then you tweak the speed of the source with ntp and adjtime-like
behaviour. Such a counter should be able to run at about a quarter
of the basic switching speed; way faster than any instructions or
memory access.

Reading it at L2 cache speeds should not be a problem, either.
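
To make the timebase point concrete, here is a minimal sketch in C of
turning a raw reading of such a counter into a timestamp. The names
timebase_t and counter_to_ns are made up for illustration, the 250 ps
pulse length used in the comment is only an example value, and the
ntp/adjtime-style tweaking of the source is not modelled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical timebase shared by all nodes: what counter value 0 means
 * in wall-clock time, and how long one pulse of the common source lasts
 * (for example 250 ps for a 4 GHz pulse train). */
typedef struct {
    uint64_t epoch_ns;        /* wall-clock time, in ns, at counter value 0 */
    uint64_t tick_period_ps;  /* length of one pulse, in picoseconds */
} timebase_t;

/* Convert a raw counter reading to nanoseconds.
 * The product is split into whole thousands of ticks and a remainder so
 * that the intermediate value stays much smaller than ticks * period. */
uint64_t counter_to_ns(const timebase_t *tb, uint64_t ticks)
{
    assert(tb != NULL);
    uint64_t whole = (ticks / 1000u) * tb->tick_period_ps;
    uint64_t frac  = (ticks % 1000u) * tb->tick_period_ps / 1000u;
    return tb->epoch_ns + whole + frac;
}
```

Every node that applies the same timebase to the same counter value
gets the same timestamp, which is the whole point of distributing the
pulses rather than the time.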

This is how the telcos synchronised public clocks half a century
ago.

-- mrr


From: MitchAlsup on
After reading this thread several times, it seems that the timer one
is looking for has several properties:

A: can be read at least a billion times per second uniformly over a
whole system of thousands of nodes
B: always returns a unique number--this number related to time in some
way
C: this number is ultimately used to determine order (i.e.
synchronization winners and losers)
D: uses all the fast access pathways in the system (i.e. cache
hierarchy)
E: but never uses any of the slow parts of the system (i.e. cache
coherence mechanism, OS-calls)
F: leverages off of fast access techniques (user mode instructions,
TLB)
G: which is safe, secure, fast, and <blah blah>

This reminds me of what the physicists were probably talking about
just after the turn of the previous century between the discovery of
the photoelectric effect and the development of quantum mechanics.

Mitch
From: Terje Mathisen on
kenney(a)cix.compulink.co.uk wrote:
> In article
> <0db80478-326d-4b55-b6bd-33d75a811166(a)36g2000yqu.googlegroups.com>,
> robertwessel2(a)yahoo.com () wrote:
>
>>
>> running on separate cores can't tell that the order of time values
>> stored is actually slightly out of sync across the machine or
>> cluster.
>
> However nowadays there are external time sources that are accurate to
> milliseconds and guaranteed to be unique. A trivial example of their use

The canonical "cheap but accurate" time source these days is a Garmin
GPS18LVC: Together with an RS232 DB9 connector and a USB cable you have
all the hw needed for a ~1us timing reference, at a total cost of around
$60-80, plus half an hour's work.

> is the self-adjusting radio clock. I doubt that implementing the use
> of the time signal would be easier than anything suggested so far, but
> each cluster could have its own time source with no synchronisation
> problems.

Right, see above. :-)

Terje

--
- <Terje.Mathisen at tmsw.no>
"almost all programming can be viewed as an exercise in caching"
From: Mayan Moudgill on
Andy "Krazy" Glew wrote:

> Tim McCaffrey wrote:
>
>> In article <4B540900.4060107(a)patten-glew.net>, ag-news(a)patten-glew.net
>> says...
>>
>>> I wrote the following for my wiki,
>>> http://semipublic.comp-arch.net/wiki/SYSENTER/SYSEXIT_vs._SYSCALL/SYSRET
>>> and thought that Usenet comp.arch might be interested:
>>>
>>>
>>>
>>>

Sorry to jump in late.

One reason a process needs to cross protection/privilege domains is
that it needs to execute an instruction sequence that is completely
safe as a whole, but contains instructions that, in isolation, are
unsafe and are therefore unavailable in the process's original domain.

A somewhat contrived example: assume that, in a multi-threaded
processor, there is a register that controls an executing thread's
priority. Since we don't want to allow threads to willy-nilly grab 100%
of the CPU resources, writes to that register are privileged. However,
lowering your own thread-priority is a safe operation. So, the function
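/* read_thread_priority() and write_thread_priority() stand for the
   register accesses; the write is the privileged instruction. */
void
decrement_thread_priority(int n)
{
    int x = read_thread_priority();
    if( n < 0 ) {
        return;     /* a negative n would raise the priority */
    }
    if( n > x ) {
        x = 0;      /* clamp at the lowest priority */
    }
    else {
        x -= n;
    }
    write_thread_priority(x);
}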
is a completely safe operation. Unfortunately, because it contains a
privileged operation, a process must somehow change its privilege level
before executing it.

Now, one way to do this is to associate privileges with code pages: if
the CPU is executing a privileged operation and the page it was fetched
from has the EXECUTE-PRIVILEGED-CODE bit set, then it's OK to execute
the instruction.

This "solution" suffers from the hole that a malicious process could
branch to the middle of the protected code sequence. So, we have to
guarantee that protection transitions only occur at the start (and end)
of safe code fragments. Thus, we must have an operation that
simultaneously changes privilege level AND instruction pointer.
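
As a rough model of why the per-page bit is not enough, the check the
hardware would make under that scheme looks something like the sketch
below. PAGE_EXEC_PRIV, page_t and naive_may_execute are invented names,
not any real architecture's page-table format.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_EXEC_PRIV 0x1u   /* hypothetical page-table flag */

typedef struct {
    uint32_t flags;           /* protection bits of the page being executed */
} page_t;

/* Naive rule: a privileged instruction may run if and only if the page
 * it was fetched from carries the flag. Nothing here looks at how
 * control reached the instruction, so a branch into the middle of a
 * vetted sequence passes exactly the same test as a proper call. */
bool naive_may_execute(const page_t *page, bool insn_is_privileged)
{
    assert(page != NULL);
    if (!insn_is_privileged)
        return true;
    return (page->flags & PAGE_EXEC_PRIV) != 0;
}
```

With decrement_thread_priority on such a page, a jump straight to the
write_thread_priority(x) step, with x chosen by the attacker, satisfies
this check.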

One possible solution is to have the privilege change operation always
branch to the same address, and pass it the "address" of the function to
be executed; this would be equivalent to saying:
execute_at_elevated_priority(decrement_thread_priority, N);

void
execute_at_elevated_priority(void (*fn)(int), int arg)
{
    if( safe_to_execute(fn) ) {
        fn(arg);
    }
}
The initial privilege escalation+branch can be done by SYSCALLs or a
software generated interrupt.

This scheme can be extended to have multiple fixed entry points, by
having a parameter to the SYSCALL-equivalent or providing for multiple
interrupts.
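
A rough sketch of the multiple-entry-point variant, simulated in
ordinary user-mode C. The names privileged_entry, service_table and
current_thread_priority are invented for illustration, and
thread_priority is a plain variable standing in for the privileged
register; on real hardware the call to privileged_entry would be the
SYSCALL or software interrupt, with the service number as its parameter.

```c
#include <assert.h>
#include <stddef.h>

typedef void (*service_fn)(int);

/* Stand-in for the privileged thread-priority register. */
static int thread_priority = 10;

static void decrement_thread_priority(int n)
{
    if (n < 0)
        return;   /* guard added for this sketch: never allow a raise */
    thread_priority = (n > thread_priority) ? 0 : thread_priority - n;
}

/* Only vetted functions appear here; callers name them by index and
 * never supply a code address of their own. */
static const service_fn service_table[] = {
    decrement_thread_priority,
};
#define NUM_SERVICES (sizeof service_table / sizeof service_table[0])

/* The fixed entry point. Returns 0 on success, -1 for an unknown
 * service number. */
int privileged_entry(unsigned service, int arg)
{
    if (service >= NUM_SERVICES)
        return -1;
    assert(service_table[service] != NULL);
    service_table[service](arg);
    return 0;
}

/* Read-only view of the simulated register, for checking the result. */
int current_thread_priority(void)
{
    return thread_priority;
}
```

Compared with passing a function address and testing it with
safe_to_execute, the index form makes the safety check a simple bounds
test.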

(Going back to the privileged-code-page approach) Alternatively, we
could guarantee that every instruction on a page is the start of a safe
code sequence. This could be done trivially by making each instruction
on the page a branch to the actual function. But then the function
body itself would still need to be guarded somehow. A possible solution
is to keep the unsafe instructions available only in privileged mode,
and to add a page-protection mode such that executing any instruction
from such a page escalates the privilege level of the executing
process. So the page has an EXECUTE-AND-CHANGE-PRIVILEGE bit set;
executing any instruction from it raises the privilege of the process;
and the page contains only branches to the actual functions.

Of course, the hardware could simply treat such a page as a vector of
pointers, and the branch-and-change-privilege picks an instruction from
the page to branch to.
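
The vector-of-pointers reading might look like this in software terms.
This is only a sketch: gate_page_t, gate_resolve, GATE_SLOTS and
example_service are hypothetical names, and a NULL return stands for
whatever fault the hardware would raise.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define GATE_SLOTS 8u   /* hypothetical number of slots in a gate page */

typedef void (*gate_target)(void);

typedef struct {
    uintptr_t   base;               /* virtual address of the gate page */
    gate_target slot[GATE_SLOTS];   /* vetted entry points; NULL = unused */
} gate_page_t;

/* Placeholder for a vetted privileged function. */
void example_service(void)
{
}

/* Resolve a branch-and-change-privilege aimed at 'vaddr'.
 * Returns the vetted target stored in the slot that 'vaddr' names, or
 * NULL if the address is outside the page, is not aligned to a slot,
 * or names an unused slot. */
gate_target gate_resolve(const gate_page_t *gp, uintptr_t vaddr)
{
    assert(gp != NULL);
    if (vaddr < gp->base)
        return NULL;
    uintptr_t off = vaddr - gp->base;
    if (off % sizeof(gate_target) != 0)
        return NULL;
    uintptr_t idx = off / sizeof(gate_target);
    if (idx >= GATE_SLOTS)
        return NULL;
    return gp->slot[idx];
}
```

The caller can only choose which slot to use; the address that is
actually branched to always comes from the page itself.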

(Going back to the privileged-code-page approach) Another alternative is
to allow entry to pages with EXECUTE-PRIVILEGED-CODE only at the
beginning of the page. This has the drawback of requiring an entire page
to be devoted to what might be a small function, which is not a big deal
on a desktop processor, although there may be a performance penalty
associated with the additional TLB entries.
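
The entry-at-page-start rule could be checked on every control transfer
roughly as follows. This is a sketch under assumptions: a 4096-byte
page, the made-up names PRIV_PAGE_SIZE and entry_allowed, and the extra
assumption that branches issued from privileged code are unrestricted,
since otherwise the function could not loop or branch internally.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PRIV_PAGE_SIZE 4096u   /* assumed page size */

/* Decide whether a control transfer to 'target' is permitted.
 *   from_priv_page: the branch was fetched from an
 *                   EXECUTE-PRIVILEGED-CODE page
 *   to_priv_page:   'target' lies in such a page */
bool entry_allowed(bool from_priv_page, bool to_priv_page, uintptr_t target)
{
    /* The mask test below is only valid for a power-of-two page size. */
    assert((PRIV_PAGE_SIZE & (PRIV_PAGE_SIZE - 1u)) == 0);

    if (!to_priv_page)
        return true;    /* ordinary code: no restriction */
    if (from_priv_page)
        return true;    /* already running privileged code (assumption) */
    /* Unprivileged callers must land on the first byte of the page. */
    return (target & (PRIV_PAGE_SIZE - 1u)) == 0;
}
```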