From: Paul A. Clayton on
On Jan 21, 8:10 pm, ga...(a)allegro.com (Gavin Scott) wrote:
> Robert A Duff <bobd...(a)shell01.theworld.com> wrote:
>
> > Why does it have to be a whole page?  Maybe you could say that if
> > you call/jump to such a page, the address has to be a multiple
> > of (say) 64 bytes.  Otherwise, the hardware traps.  So you can
> > put a bunch of 64-byte privileged procedures on each such page.

This reminds me of Google's Native Client--all entry points are 32B
aligned and certain instructions are not allowed by the runtime
(scanning for safety can be relatively fast with the alignment and
other restrictions).
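
The alignment rule in the quoted proposal can be sketched as a simple
check. This is only an illustration in C, not how any real hardware or
Native Client itself implements it; the function name `entry_allowed`
and the parameterized alignment are assumptions made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical check: a control transfer into a privileged page is
 * allowed only if the target is a multiple of `align` bytes.
 * `align` must be a power of two (64 in the quoted proposal, 32 for
 * the Native Client entry points mentioned above). */
bool entry_allowed(uintptr_t target, uintptr_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    /* For a power-of-two alignment, the low bits must all be zero. */
    return (target & (align - 1)) == 0;
}
```

A target for which `entry_allowed(target, 64)` is false would be the
case where the hardware traps instead of entering the page.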

> PA-RISC has an even more flexible form of this in that you can
> associate a privilege promotion level with an executable page.
> The promotion does not happen when you branch to the page, but
> when you execute a PC relative branch instruction on that page
> that includes the ,GATE option, the target will execute at the
> set privilege level. So you can fill a page up with entry points
> that promote themselves. Promotion can only happen at the points
> of the GATE instructions, so branching into the middle of an
> instruction sequence won't let you do anything special.
>
> A page marked in this way is called a Gateway page, and this is
> the primary mechanism for privilege promotion in the architecture.

Itanium (sort of descended from PA-RISC) has an Enter Privileged Code
(EPC) instruction. Jumping to it elevates privilege (if it is in a page
that allows such escalation). (I dislike the fact that the
other instructions in a bundle containing EPC are executed at an
undefined privilege level.) In theory, this could bring system call
overhead close to that of a function call (Itanium's shadow registers
might help speed such transitions for short operations).


Paul A. Clayton
just a technophile
From: "Andy "Krazy" Glew" on
Mayan Moudgill wrote:
> Andy "Krazy" Glew wrote:
>
>> Tim McCaffrey wrote:
>>
>>> In article <4B540900.4060107(a)patten-glew.net>,
>>> ag-news(a)patten-glew.net says...
>>>
>>>> I wrote the following for my wiki,
>>>> http://semipublic.comp-arch.net/wiki/SYSENTER/SYSEXIT_vs._SYSCALL/SYSRET
>>>>
>>>> and thought that Usenet comp.arch might be interested:
>>>>
>>>>
>>>>
>>>>
>
> Sorry to jump in late.
>
> One reason a process needs to cross protection/privilege domains is
> because it needs to execute an instruction sequence that is completely
> safe, but contains instructions that, in isolation, are unsafe, and
> therefore are unavailable in the process's original domain.


Mayan: I love this write up.

I've been meaning to post a blurb on my comp-arch.net wiki on something like this:
the placeholder was

http://semipublic.comp-arch.net/wiki/Why_not_RISC:_Atomicity,_Security,_Compatibility

As in: the 3 big reasons to create a microcoded, CISCy instruction are
a) atomicity
b) security
c) compatibility

You can consider going into microcode to be crossing a privilege domain.

You show that similar considerations apply to syscalls.

And, lest anyone chime in: PALcode is somewhere between syscalls and microcode, IMHO more like microcode.

http://public.comp-arch.net/wiki/index.php?title=Why_not_RISC:_Atomicity,_Security,_Compatibility

--

Later on you talk about the need for protected entry points, if certain code pages have privilege.

I have a placeholder for that:
http://semipublic.comp-arch.net/wiki/Control_Flow_Transfers_Between_Security_Levels_and_Non-Ordered_Security_Domains

where I want to talk about things like HP's gateways, etc.
From: Terje Mathisen on
Robert A Duff wrote:
> Mayan Moudgill<mayan(a)bestweb.net> writes:
>
>> (Going back to the privileged-code-page approach) Alternatively, we
>> could guarantee that every instruction on a page was the start of a safe
>> code sequence.
>
> What if somebody tries to jump into the middle of an instruction?
>
[snip]
>> (Going back to the privileged-code-page approach) Another alternative is
>> to allow entry to pages with EXECUTE-PRIVILEGED-CODE only at the
>> beginning of the page. This has the drawback of requiring an entire page
>> to be devoted to what might be a small function, which is not a big deal
>> on a desktop processor; there may be a performance penalty associated
>> with the additional TLB entries.
>
> Why does it have to be a whole page? Maybe you could say that if
> you call/jump to such a page, the address has to be a multiple
> of (say) 64 bytes. Otherwise, the hardware traps. So you can
> put a bunch of 64-byte privileged procedures on each such page.
>
> This answers my question about "middle of an instruction".

This is the solution selected by Google's "safe binary x86 plugins" for
their browser:

A subset of the x86 instruction set, with the additional requirement
that all possible call/jump/branch targets have to be 32 (or optionally
16) byte aligned.

This also means that any instruction that would straddle such a 32-byte
block will be preceded by NOPs, avoiding the possibility of jumping into
the middle of an opcode.

Self-modification/JIT is of course totally out of the question, and even
RET is disallowed: It has to be emulated with something like

POP EDX
ADD EDX,31
AND EDX,NOT 31
JMP EDX

which means that every CALL must be followed by NOP padding until the
next 32-byte boundary.
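
The effect of that sequence on the popped return address can be written
out in C. This is a sketch of the arithmetic only, assuming 32-bit
addresses and a power-of-two bundle size; `bundle_round_up` is a name
invented here, not part of Native Client.

```c
#include <assert.h>
#include <stdint.h>

/* Mirrors ADD EDX,31 / AND EDX,NOT 31 for bundle == 32: round an
 * address up to the next multiple of `bundle`, leaving addresses that
 * are already aligned unchanged. Arithmetic wraps modulo 2^32, as it
 * would in EDX. */
uint32_t bundle_round_up(uint32_t addr, uint32_t bundle)
{
    assert(bundle != 0 && (bundle & (bundle - 1)) == 0);
    return (addr + (bundle - 1)) & ~(bundle - 1);
}
```

For a CALL whose return address is 0x1005, `bundle_round_up(0x1005, 32)`
gives 0x1020, so execution resumes at the next boundary and the padding
after the CALL is skipped.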

According to Google, these limitations make it possible to statically
determine that the code to be loaded is "safe", meaning only that it
cannot do anything unsafe without calling out to the OS.

Terje
--
- <Terje.Mathisen at tmsw.no>
"almost all programming can be viewed as an exercise in caching"
From: Bernd Paysan on
Terje Mathisen <"terje.mathisen at tmsw.no"> wrote:

> kenney(a)cix.compulink.co.uk wrote:
>> In article
>> <0db80478-326d-4b55-b6bd-33d75a811166(a)36g2000yqu.googlegroups.com>,
>> robertwessel2(a)yahoo.com () wrote:
>>
>>>
>>> running on separate cores can't tell that the order of time values
>>> stored is actually slightly out of sync across the machine or
>>> cluster.
>>
>> However nowadays there are external time sources that are accurate to
>> milliseconds and guaranteed to be unique. A trivial example of their use
>
> The canonical "cheap but accurate" time source these days is a Garmin
> GPS18LVC: Together with an RS232 DB9 connector and a USB cable you have
> all the hw needed for a ~1us timing reference, at a total cost of around
> $60-80, plus half an hour's work.

Are you serious? The serial cable alone may be capable of 1us precision
when talking to the GPS mouse (but it doesn't have to - 2.5us jitter for
115kbs is good enough), but converting from/to USB adds an indefinite delay.

So if you want to have a precise serial GPS mouse, use a real serial
interface, not something routed through USB.

--
Bernd Paysan
"If you want it done right, you have to do it yourself"
http://www.jwdt.com/~paysan/
From: Terje Mathisen on
Bernd Paysan wrote:
> Terje Mathisen<"terje.mathisen at tmsw.no"> wrote:
>
>> kenney(a)cix.compulink.co.uk wrote:
>>> In article
>>> <0db80478-326d-4b55-b6bd-33d75a811166(a)36g2000yqu.googlegroups.com>,
>>> robertwessel2(a)yahoo.com () wrote:
>>>
>>>>
>>>> running on separate cores can't tell that the order of time values
>>>> stored is actually slightly out of sync across the machine or
>>>> cluster.
>>>
>>> However nowadays there are external time sources that are accurate to
>>> milliseconds and guaranteed to be unique. A trivial example of their use
>>
>> The canonical "cheap but accurate" time source these days is a Garmin
>> GPS18LVC: Together with an RS232 DB9 connector and a USB cable you have
>> all the hw needed for a ~1us timing reference, at a total cost of around
>> $60-80, plus half an hour's work.
>
> Are you serious? The serial cable alone may be capable of 1us precision
> when talking to the GPS mouse (but it doesn't have to - 2.5us jitter for
> 115kbs is good enough), but converting from/to USB adds an indefinite delay.
>
> So if you want to have a precise serial GPS mouse, use a real serial
> interface, not something routed through USB.

Oops, I was unclear!

The 18LVC is a pure serial GPS; I use the USB cable simply to supply +5V
power. The actual GPS signals are delivered via RX/TX/GND/DCD on the DB9
connector, with DCD used for the Pulse Per Second (PPS) signal from the GPS.
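
Since the PPS edge marks the start of a second, a timestamp of that edge
taken from the local clock shows how far the clock is off. A minimal
sketch of that last step in C, assuming the edge has already been
timestamped (for example by the kernel's PPS support) and that the clock
is already within half a second of true time; the function name
`pps_offset_ns` and its interface are hypothetical:

```c
#include <assert.h>

/* Signed distance, in nanoseconds, of a PPS edge timestamp from the
 * nearest whole second of the local clock. `edge_nsec` is the
 * fractional-second part (tv_nsec) of the timestamp. A positive result
 * means the local clock read just past a second boundary when the
 * pulse arrived (clock ahead); a negative result means just before
 * (clock behind). */
long pps_offset_ns(long edge_nsec)
{
    assert(edge_nsec >= 0 && edge_nsec < 1000000000L);
    if (edge_nsec < 500000000L)
        return edge_nsec;
    return edge_nsec - 1000000000L;
}
```

An edge stamped at x.999999000 s yields -1000, meaning the local clock
is about 1us behind the pulse.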

I have soldered together several GPS boards, including one of the
original 8-channel Motorola Oncore UT+ receivers (capable of ~35ns RMS).

My corporate NTP servers use the newer 12-channel version of the same
Oncore units.

Currently I have one of those 18LVC pucks on my rooftop, connected to a
FreeBSD 8 box running on an old laptop in the attic.

It is reachable from the internet, but only via a dyndns address, which
means that it can change:

tmsw.dyndns.org:123

I have 30 Mbit/s symmetric fiber, so performance is OK; feel free to use
it if you're (network latency) nearby. :-)

Terje
--
- <Terje.Mathisen at tmsw.no>
"almost all programming can be viewed as an exercise in caching"