From: "Andy "Krazy" Glew" on
Noob wrote:
> Andy Glew wrote:
>
>> I wrote the following for my wiki,
>> http://semipublic.comp-arch.net/wiki/SYSENTER/SYSEXIT_vs._SYSCALL/SYSRET
>> and thought that USEnet comp.arch might be interested
>
> This old post by Linus Torvalds seems somewhat relevant.
> http://lkml.org/lkml/2002/12/18/218

Looks like Linus figured out what I intended.

Would have been easier if I had been allowed to talk to him, back then.

From: Jeremy Linton on
On 1/18/2010 3:59 AM, Anton Ertl wrote:
> "Andy \"Krazy\" Glew"<ag-news(a)patten-glew.net> writes:
>> I still think that both Intel and AMD missed a big opportunity, to make
>> system calls truly
>> as fast as function calls. Chicken and egg.
>
> But given that system calls have to do much more sanity checking on
> their arguments, and there is the common prelude that you mentioned
> (what is it for?), I don't see system calls ever becoming as fast as
> function calls, even with fast system call and system return
> instructions.

My experience (working on one of the commercial Unixes) was that having
a fast sysenter-type instruction had the opposite effect and resulted in
a fairly slow system call interface.

That's because there was always demand to have some specific
functionality faster than the general case. So, the syscall handler was
chock full of special case checking for one function or the other. Plus,
since the main OS was written in C, it ended up saving all kinds of "extra"
context anyway. In the end, IIRC there were 150-200 instructions on the
kernel side before it started processing a "normal" system call.

Sure, one or two "critical" calls used in benchmarks were faster, but
the whole system suffered. I was never convinced that being a few percent
faster for a few special cases outweighed being a few percent slower for
everything else in the system.
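
A minimal C sketch of the two dispatcher shapes contrasted above. It is not taken from any real kernel: the syscall numbers, the handler names, and the number of special-case tests are all invented for illustration. It only shows that every fast-path test placed ahead of the table lookup is paid by every call that does not match it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical syscall numbers, invented for this sketch. */
enum { SYS_GETTIME = 0, SYS_GETPID = 1, SYS_READ = 2, SYS_WRITE = 3, SYS_MAX = 4 };

typedef long (*handler_fn)(long arg);

static long do_gettime(long arg) { (void)arg; return 1000; }
static long do_getpid(long arg)  { (void)arg; return 42; }
static long do_read(long arg)    { return arg; }
static long do_write(long arg)   { return arg; }

static const handler_fn table[SYS_MAX] = { do_gettime, do_getpid, do_read, do_write };

/* Counts the compare-and-branch tests executed before a handler runs. */
static unsigned long tests_executed;

/* Plain dispatcher: one bounds test, then an indexed call. */
long dispatch_plain(int nr, long arg)
{
    tests_executed++;                 /* bounds test */
    if (nr < 0 || nr >= SYS_MAX)
        return -1;
    assert(table[nr] != NULL);
    return table[nr](arg);
}

/* Dispatcher with benchmark-driven fast paths tested ahead of the table. */
long dispatch_special_cased(int nr, long arg)
{
    tests_executed++;                 /* fast path 1 */
    if (nr == SYS_GETTIME)
        return do_gettime(arg);
    tests_executed++;                 /* fast path 2 */
    if (nr == SYS_GETPID)
        return do_getpid(arg);
    return dispatch_plain(nr, arg);   /* everything else pays for both tests */
}
```

In this toy model a "normal" call such as SYS_READ executes three tests through the special-cased dispatcher and one through the plain dispatcher, while the favoured SYS_GETTIME executes one.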




From: EricP on
Andy "Krazy" Glew wrote:

> <big snip>

From a usability point of view, when I was toying with this
I found there to be 2 problems with SysEnter/SysExit.

Firstly, and most critically, SysExit does not load the
EFLAGS register, specifically the interrupt flag.
This was a problem because I needed a small non-interruptible
system service return sequence during the transition to test
for user mode software interrupt delivery in a lossless manner.
I wanted to disable interrupts, check a boolean, and return to
user mode if nothing pending, with the interrupts being re-enabled
by the return. This was a show stopper and made it unusable.
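
The ordering problem described above can be modelled in a few lines of C. This is a toy simulation, not kernel code: the struct, its field names, and the two-step timing are assumptions made for the sketch, and it assumes that an interrupt taken in kernel mode returns to the kernel without re-running the exit test. It shows only why the test and the return have to sit inside one non-interruptible window, with the return itself restoring the interrupt flag.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy machine state; field names are invented for this sketch. */
struct cpu {
    bool irq_enabled;   /* stands in for EFLAGS.IF */
    bool irq_raised;    /* a device interrupt is waiting */
    bool user_pending;  /* the boolean tested on the way out */
    bool in_user_mode;
    int  delivered;     /* user-mode software interrupts delivered */
};

/* The exit-path test: deliver to user mode if something is pending. */
static void deliver_if_pending(struct cpu *c)
{
    if (c->user_pending) {
        c->user_pending = false;
        c->delivered++;
    }
}

/* Take the device interrupt if it is raised and not masked. */
static void poll_irq(struct cpu *c)
{
    if (!c->irq_raised || !c->irq_enabled)
        return;
    c->irq_raised = false;
    c->user_pending = true;     /* handler queues work for user mode */
    if (c->in_user_mode)        /* only a return to user mode runs the test */
        deliver_if_pending(c);
}

/*
 * irq_step 0: interrupt raised before the test; 1: between test and return.
 * Returns true if the process reaches user mode with the boolean still set.
 */
bool pending_stranded(bool mask_across_test, int irq_step)
{
    struct cpu c = { true, false, false, false, 0 };

    assert(irq_step == 0 || irq_step == 1);
    if (mask_across_test)
        c.irq_enabled = false;          /* disable interrupts */
    if (irq_step == 0)
        c.irq_raised = true;
    poll_irq(&c);
    deliver_if_pending(&c);             /* check the boolean */
    if (irq_step == 1)
        c.irq_raised = true;
    poll_irq(&c);                       /* the window before the return */
    c.in_user_mode = true;              /* return to user mode ... */
    c.irq_enabled = true;               /* ... which restores the flag */
    poll_irq(&c);
    return c.user_pending;
}
```

In the model, the only losing case is an unmasked interrupt that arrives between the test and the return: the boolean is set after it was checked and nothing looks at it again. With interrupts masked across both steps, the interrupt is held until user mode is reached and its own exit path runs the test.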

Secondly, the problem with SysEnter is that it assumes that
EDX will be preloaded with the restart EIP, but the x86
provides no easy method to load an arbitrary offset of the
current EIP into EDX except that kludgey call +0, pop edx method.
So to use SysEnter you have to preload EDX with a constant restart
EIP, which presumes the entry sequence is at a predefined
location, and that limits the utility of SysEnter somewhat.

The position dependent code method:

push ecx
push edx
mov ecx, esp // Save stack pointer
mov eax, 123 // System service routine number
mov edx, #RestartAddr // Constant restart address
sysenter
RestartAddr:
pop edx
pop ecx

The position independent code method:

push ecx
push edx
mov ecx, esp // Save stack pointer
mov eax, 123 // System service routine number
call +0 // Push the address of the next instruction
pop edx // edx = address of this pop edx
add edx, 6 // Advance to the pop edx after sysenter
sysenter
pop edx
pop ecx
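
The constant 6 in the position independent sequence can be checked against the instruction encodings. The C sketch below lays out the bytes from the call onward and measures the distance between the two pop edx instructions. It assumes the assembler emits call as E8 with a 32-bit displacement and picks the short add form with an 8-bit immediate (83 C2 06); with the 32-bit immediate form (81 C2 imm32) the add would be 6 bytes long and the constant would have to be 9. The array and function names are made up for the sketch.

```c
#include <assert.h>
#include <stddef.h>

/* IA-32 encodings of the tail of the position independent sequence. */
static const unsigned char seq[] = {
    0xE8, 0x00, 0x00, 0x00, 0x00,   /* call +0     (rel32 = 0)               */
    0x5A,                           /* pop edx     (address pushed by call)  */
    0x83, 0xC2, 0x06,               /* add edx, 6  (imm8 form)               */
    0x0F, 0x34,                     /* sysenter                              */
    0x5A,                           /* pop edx     (restart address)         */
    0x59                            /* pop ecx                               */
};

/* Byte offsets of the two pop edx instructions and of the add immediate. */
enum { OFF_FIRST_POP = 5, OFF_ADD_IMM = 8, OFF_RESTART = 11 };

/* Distance the add must cover: first pop edx to the pop edx after sysenter. */
size_t restart_delta(void)
{
    assert(seq[OFF_FIRST_POP] == 0x5A && seq[OFF_RESTART] == 0x5A);
    return (size_t)(OFF_RESTART - OFF_FIRST_POP);
}

/* The immediate actually encoded in the add instruction. */
size_t encoded_immediate(void)
{
    return seq[OFF_ADD_IMM];
}
```

Under those encoding assumptions the distance is 1 (pop edx) + 3 (add) + 2 (sysenter) = 6 bytes, which matches the immediate in the listing.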

What would have been nice is if there was an instruction to move
EIP to a general register and add a constant offset at the same time.

push ecx
push edx
mov ecx, esp // Save stack pointer
mov eax, 123 // System service routine number
mov edx, eip+4 // Load restart address of pop edx
sysenter
pop edx
pop ecx

Eric




From: Terje Mathisen "terje.mathisen at on
EricP wrote:
> What would have been nice is if there was an instruction to move
> EIP to a general register and add a constant offset at the same time.
>
> push ecx
> push edx
> mov ecx, esp // Save stack pointer
> mov eax, 123 // System service routine number
> mov edx, eip+4 // Load restart address of pop edx
> sysenter
> pop edx
> pop ecx

Didn't DEC use to have a patent on IP-relative addressing, specifically
to make PIC much easier?

It must be more than 17 years ago at least!

Terje
--
- <Terje.Mathisen at tmsw.no>
"almost all programming can be viewed as an exercise in caching"

From: Gavin Scott on
"Andy \"Krazy\" Glew" <ag-news(a)patten-glew.net> wrote:
> System calls are really just function calls. With security. They have
> to switch stacks, etc.

Well, except when they don't. I know of one significant, successful OS
that really didn't have a kernel stack at all, and executed pretty much
everything except hardware interrupt context on top of the user's own
stack.

This was MPE/XL (later MPE/iX) on PA-RISC, HP's operating system for
their HP-3000 systems.

The page-level protection mechanisms and explicit long addressing
modes were used to produce a system with an effectively flat single
address space in which any process could form and dereference any
possible memory address. This mostly eliminated the distinction
between user and kernel code. Some functions were in privileged
libraries that were flagged to cause privilege promotion, but a
call to a system library function encountered no fixed overhead
relative to an ordinary call to an unprivileged user library.

Each privileged function was required to enforce system security
policy and could do so in any way that it liked rather than being
forced to sanity-check every parameter before it was known that
it would be dereferenced, for example. And no complicated copy-in/out
to deal with moving things in and out of "kernel" space, etc.
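
As a rough illustration of the difference between checking every parameter up front and checking only what is about to be dereferenced, here is a small C sketch. It is not MPE/XL code and says nothing about how MPE/XL implemented its checks: the request layout, the caller_may_read policy function, and both service routines are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Counts policy checks performed, so the two styles can be compared. */
static unsigned checks_done;

/* Hypothetical policy check: may the caller read [p, p + len)? */
static bool caller_may_read(const void *p, size_t len)
{
    checks_done++;
    return p != NULL && len > 0;
}

struct request {
    int         want_label;   /* nonzero if the optional label is used */
    const char *label;
    size_t      label_len;
    int         value;
};

/* Up-front style: the label pointer is checked even if it is never used. */
int service_eager(const struct request *r, char *out, size_t out_len)
{
    assert(r != NULL && out != NULL);
    bool label_ok = caller_may_read(r->label, r->label_len);
    if (!r->want_label)
        return r->value;
    if (!label_ok || r->label_len >= out_len)
        return -1;
    memcpy(out, r->label, r->label_len);
    out[r->label_len] = '\0';
    return r->value;
}

/* On-demand style: the check happens only on the path that dereferences. */
int service_lazy(const struct request *r, char *out, size_t out_len)
{
    assert(r != NULL && out != NULL);
    if (!r->want_label)
        return r->value;
    if (!caller_may_read(r->label, r->label_len) || r->label_len >= out_len)
        return -1;
    memcpy(out, r->label, r->label_len);
    out[r->label_len] = '\0';
    return r->value;
}
```

For a request that does not use the label, service_lazy performs no policy check at all, while service_eager still performs one.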

Now any modern security architect would probably run screaming at
this point, and there definitely were challenges in this area
(user asynchronous unprivileged trap/event handlers being a rather
obvious one), but I'm not aware of any dramatic failures resulting
from this design. On the other hand I don't think anyone would be
likely to make such design choices again in today's world.

But the resulting system was a relative joy to use, develop for
(both user and system-level code), and definitely easier to debug.

G.