From: "Andy "Krazy" Glew" on
EricP wrote:
> Andy "Krazy" Glew wrote:
>> Note that x86 eventually got around to adding READ_EIP instruction.
>
> Where is that? I find no reference to such an instruction.

Perhaps I should have said x86-64. And perhaps I have slipped a bit,
wishful thinking and all that, but does not LEA with a RIP-relative
addressing mode do what you want?
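
A minimal sketch of that idea, assuming GCC or Clang extended inline
assembly in AT&T syntax. The helper name read_ip is made up for
illustration, and the non-x86-64 branch is only a stand-in so that the
snippet still compiles on other targets:

```c
#include <stdint.h>

/* Hypothetical helper: returns an address inside this function's code. */
static uintptr_t read_ip(void) {
#if defined(__x86_64__)
    uintptr_t ip;
    /* RIP-relative addressing is relative to the next instruction,
       so displacement 0 yields the address just past this LEA. */
    __asm__ volatile ("lea 0(%%rip), %0" : "=r"(ip));
    return ip;
#else
    /* Assumption: on other targets, fall back to the function's own
       entry address, which is not the same thing as reading the IP. */
    return (uintptr_t)&read_ip;
#endif
}
```

On x86-64 this needs no CALL/POP pair, so the return stack predictor is
left alone.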

--

I must admit that I have slightly mixed feelings about PIC. Sure, it's
a good idea to be able to relocate code. But that is PIC code
addressing. I am not so sure that it is a good idea to encourage data
to be at a fixed offset from the code. Perhaps for constants.

Also, as somebody who has had to deal with security issues: PIC is a
gift to malware. After all, one of the basic characteristics of binary
code injections via buffer overflows is that they are at an unknown
address. PIC makes it easier to write viruses. Although at the same
time it makes it easier to randomize the address space, and thereby make
it harder to write viruses. Fortunately, x86-64 has other good features
that, when correctly employed, can hinder malware. And, fortunately,
x86-64 breaks the need for legacy compatibility, affording the
opportunity to

I suspect that it is better overall to use one or more base registers
for data addresses. Rather than relying on RIP, the instruction
pointer, as a free base register. But then that requires at least one
dedicated register, and even with REX x86-64 doesn't really have enough
registers.

I sometimes think that we should have RIP-relative branching and control
flow, and RIP-relative loading of constants. But that we should
discourage writing to RIP relative data locations. E.g. by disallowing
it in the store addressing modes. So long as you can do a RIP relative
LEA, you can always get RIP relative stores if you want.

From: Gavin Scott on
"Andy \"Krazy\" Glew" <ag-news(a)patten-glew.net> wrote:
> Thanks for the pointer to HP MPE/XL. With your permission, I may
> include a note about this on my website.

Of course.

> By the way, HP PA RISC had a good protected entry point mechanism - IIRC
> called "gateways". Did MPE/XL make use of these?

Yes, apart from Return From Interrupt I think this is the only way to
do privilege promotion on PA-RISC.

G.

From: robertwessel2 on
On Jan 18, 8:44 pm, "Andy \"Krazy\" Glew" <ag-n...(a)patten-glew.net>
wrote:
> Del Cecchi wrote:
> > "Noob" <r...(a)127.0.0.1> wrote in message
> >news:hj1pf6$n14$1(a)speranza.aioe.org...
> >> Terje Mathisen wrote:
>
> >>> Anton Ertl wrote:
>
> >>>> Andy Glew wrote:
>
> >>>>> I still think that both Intel and AMD missed a big opportunity,
> >>>>> to make system calls truly as fast as function calls.
> > An interesting alternative to study would be the IBM S/38, AS400,
> > i-series family.  As I understand it most work was done by functions
> > or commands that were imbedded in the OS.  Unfortunately I have now
> > exhausted my knowledge of said operating system.
>
> > It just seemed that since this system came from a totally different
> > heritage than x86 or linux or any of the rest that it might be
> > informative.  I don't know if any of the relevant stuff is publicly
> > available however.   I might take a browse through the ibm
> > publications and see.  It used to be pretty well guarded.
>
> I agree with Del: the IBM S/38, AS400, i-Series family is very
> interesting. It is the only really successful capability machine.  With
> proper capabilities.
>
> You can only execute code that you have been granted a capability /
> secure pointer to, on data that you have a capability / secure pointer to.
>
> Obviously it could run C++, since it was one of the first machines to
> have system software largely written in C++.  Or at least according to
> "Inside the IBM AS/400".


The S/38 and the early AS/400s were based on proprietary CISC
processors (I think there were either three or four generations, with
substantial differences). One of the interesting things about the
S/38 and (original) AS/400 was that all applications compiled to a byte
code (called MI), which largely hid the repeated changes in the actual
ISA of the CPUs.

The later AS/400s (and then iSeries, Series i, or whatever it’s being
called this week) were implemented on POWER/PPC, and the internals of
that version of the OS were among the first written in C++. Again MI
made the essentially complete OS rewrite and ISA change invisible to
most applications. The PPC based iSeries have been slowly merging (in
terms of hardware) with the pSeries line, and the boxes are now
basically identical, and now you can run OS/400, AIX and Linux on both
sets of boxes (with certain restrictions). And given POWER's strong
partitioning capabilities, you often see more than one OS.


> Its capabilities were implemented as non-forgeable secure pointers.
> Unfortunately, they used an extra tag bit on memory locations, stolen
> from ECC.  Good for them, unfortunate for the PC industry, where even
> today most machines do not have ECC.
>
> I do not know whether the secure pointers were used for all function
> calls, or just calls to system software.  If used for all function
> calls, I suspect that they may have had a performance penalty compared
> to what could have been done, if no secure pointers were used.  I
> suspect that IBM Rochester deserves credit for being willing to take
> this step, and make it run as fast as possible.
>
> I must admit that I was disappointed to learn that late in its lifetime
> this system added a classic UNIX user/kernel mode. Supposedly to
> improve security, as well as run UNIX.


They do have a performance penalty. The S/38 was designed for
traditional monolithic languages like RPG and Cobol, making large
numbers of system calls to do a lot of I/O (particularly database I/O,
which was part of the kernel), and very little in the way of more
modern subroutine calls. Early implementations of C on the AS/400
were serious dogs. Not to mention that those had more than a few
DS9000 inspired attributes (really big pointers - 16B, non linear
address space, EBCDIC, linking oddities). OS/400 (now "IBM i" - what
genius thought that up? at least the prior "i5/OS" was reasonably
pronounceable) has grown considerably more C friendly over the years,
and in that support has managed to lose some of the protections
(although things are still pretty strong).
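
The tag-bit scheme mentioned in the quoted text can be modelled roughly
as follows. This is a toy sketch, not the actual S/38 or AS/400
mechanism: the slot layout, the sizes and all the names (slot_t,
store_pointer, store_data, load_pointer) are assumptions made for
illustration. The point is only that an ordinary data store clears the
tag, so a pointer cannot be forged by writing its bit pattern.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model: every memory slot carries a hidden tag bit. */
typedef struct {
    uint64_t value;
    bool     tag;   /* true only if written by the trusted path */
} slot_t;

#define MEM_SLOTS 16
static slot_t mem[MEM_SLOTS];

/* Stands in for a trusted (system) operation that creates a pointer. */
static void store_pointer(size_t i, uint64_t target) {
    mem[i].value = target;
    mem[i].tag = true;
}

/* Ordinary user store: any data write clears the tag. */
static void store_data(size_t i, uint64_t v) {
    mem[i].value = v;
    mem[i].tag = false;
}

/* Using a slot as a pointer succeeds only if its tag is still set. */
static bool load_pointer(size_t i, uint64_t *out) {
    if (!mem[i].tag)
        return false;
    *out = mem[i].value;
    return true;
}
```

Writing the same bit pattern with store_data leaves a slot that
load_pointer refuses, which is the non-forgeability property in
miniature.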

These systems also have good Java support and performance, since they
execute the JVM and JIT compiler on the native (PPC) hardware rather
than in MI.

But it’s certainly an interesting system.

From: Terje Mathisen "terje.mathisen at on
Andy "Krazy" Glew wrote:
> Also, as somebody who has had to deal with security issues: PIC is a
> gift to malware. After all, one of the basic characteristics of binary
> code injections via buffer overflows is that they are at an unknown
> address. PIC makes it easier to write viruses. Although at the same time

For a virus/worm, the fact that you need kludges like CALL/POP reg to
generate a code-relative base pointer, and that this makes the code
slower by messing up any return stack cache, doesn't really matter:

Bad performance for code that only runs once, and mostly in malware, is
fine with me. :-)

> I sometimes think that we should have RIP-relative branching and control
> flow, and RIP-relative loading of constants. But that we should

Exactly!

> discourage writing to RIP relative data locations. E.g. by disallowing
> it in the store addressing modes. So long as you can do a RIP relative
> LEA, you can always get RIP relative stores if you want.

In the old days, with separate code (CS:) and data (DS:, ES:) this was a
very natural model:

Loading constants, like branch offsets, from the code segment was fine,
particularly since they were very likely loaded into cache by the code
prefetcher (on a 486 with a single unified cache!).

I.e.

    and bx, 7          ; mask to 8 possible computed jump targets
    add bx, bx         ; scale the index to a word offset
    jmp cs:jumptable[bx]
jumptable:
    dw jmp0, jmp1, jmp2 ...

Doing the same in a PIC style was only slightly more complicated...

    and bx, 7
    add bx, bx
    call dummy         ; pushes the address of dummy
dummy:
    pop si             ; si = run-time address of dummy
    jmp [cs:si+bx+jumptable-dummy]
jumptable:
    dw jmp0, jmp1, jmp2 ...

but you would probably do the SI base pointer load only once, then keep
it around.
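
For readers more at home in C, a rough analogue of the computed jump
above is a constant table of function pointers indexed by a masked
selector. This is only a sketch: the handler names and the dispatch
function are invented for illustration, and it says nothing about where
a given compiler actually places the table.

```c
/* Eight hypothetical jump targets; each returns its own index. */
static int h0(void) { return 0; }
static int h1(void) { return 1; }
static int h2(void) { return 2; }
static int h3(void) { return 3; }
static int h4(void) { return 4; }
static int h5(void) { return 5; }
static int h6(void) { return 6; }
static int h7(void) { return 7; }

typedef int (*handler_fn)(void);

/* Mask the selector to 0..7 so the index cannot run past the
   eight-entry table, then make the indirect call through it. */
static int dispatch(unsigned selector) {
    static const handler_fn table[8] = {
        h0, h1, h2, h3, h4, h5, h6, h7
    };
    return table[selector & 7u]();
}
```

The table is only ever read, which fits the idea of constants living
alongside the code.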

Terje
--
- <Terje.Mathisen at tmsw.no>
"almost all programming can be viewed as an exercise in caching"

From: nmm1 on
In article <4B551EA9.7010409(a)patten-glew.net>,
Andy \"Krazy\" Glew <ag-news(a)patten-glew.net> wrote:
>Gavin Scott wrote:
>
>>> System calls are really just function calls. With security. They have
>>> to switch stacks, etc.
>>
>> Well, except when they don't. I know of one significant, successful OS
>> that really didn't have a kernel stack at all, and executed pretty much
>> everything except hardware interrupt context on top of the user's own
>> stack.
>>
>> This was MPE/XL (later MPE/iX) on PA-RISC, HP's operating system for
>> their HP-3000 systems.
>...
>> Now any modern security architect would probably run screaming at
>> this point, and there definitely were challenges in this area
>> (user asynchronous unprivileged trap/event handlers being a rather
>> obvious one), but I'm not aware of any dramatic failures resulting
>> from this design. On the other hand I don't think anyone would be
>> likely to make such design choices again in today's world.

I can't think of a mainstream programming language that supports
asynchronous event handlers (i.e. user ones) - yes, I know that
several HAVE them, but they don't SUPPORT them :-( Merely invoking
one is undefined behaviour in several different ways in C and its
derivations, for example.
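
To make that concrete, here is a sketch of roughly the most a handler
can portably do in ISO C, as far as I read the standard: set a flag of
type volatile sig_atomic_t and return. The names got_signal, on_signal
and demo_signal_flag are made up for the example, and it uses raise(),
so the handler runs synchronously rather than truly asynchronously.

```c
#include <signal.h>

/* The one kind of object a handler may portably write to. */
static volatile sig_atomic_t got_signal = 0;

/* Handler: record that the signal arrived and do nothing else. */
static void on_signal(int signo) {
    (void)signo;
    got_signal = 1;
}

/* Install the handler, raise the signal, restore the default action.
   Returns 1 if the flag was set, -1 if signal() or raise() failed. */
static int demo_signal_flag(void) {
    got_signal = 0;
    if (signal(SIGINT, on_signal) == SIG_ERR)
        return -1;
    if (raise(SIGINT) != 0)
        return -1;
    signal(SIGINT, SIG_DFL);
    return got_signal;
}
```

Anything more ambitious inside the handler, such as calling most
library functions or touching ordinary shared data, is where the
undefined behaviour starts.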

>Another problem would be multithreaded code. What if other untrusted
>user threads were executed at the same time on other processors,
>accessing this stack?
>
>As a student intern once said: "shared memory = security hole".

In a sane model, each thread's stack is private - which means that
other threads can't access it. This isn't the only problem with
the POSIX 'model'.


Regards,
Nick Maclaren.