From: Owen Shepherd on
I was reading about the IBM SIE instruction on Andy's wiki (Article at
http://semipublic.comp-arch.net/wiki/SIE ), and a few things struck me.

The first is the question of "If I'm implementing this efficient virtual
machine system, do I really need a separate 'user mode'?". The second is
"How can this instruction be made to fit better with the general design of a
RISC architecture?"

An important consideration is also how to integrate the instruction with
some typical RISC performance enhancing features - a key one being tagged
address spaces - which may require kernel support. Of course, this is a non-
issue if we are implementing our user mode as one of these virtual machines,
which is something that I would be inclined to do.

The other question is that of the actual interface that software uses to
interact with it - one would likely make it register based (rather than
the in-memory "SIEBK"), but how would you define these registers? This
question is particularly relevant in provisioning it to support the addition
of further hardware assists in future.
From: Anne & Lynn Wheeler on

Owen Shepherd <owen.shepherd(a)e43.eu> writes:
> I was reading about the IBM SIE instruction on Andy's wiki (Article at
> http://semipublic.comp-arch.net/wiki/SIE ), and a few things struck me.
>
> The first is the question of "If I'm implementing this efficient virtual
> machine system, do I really need a separate 'user mode'?". The second is
> "How can this instruction be made to fit better with the general design of a
> RISC architecture?"
>
> An important consideration is also how to integrate the instruction with
> some typical RISC performance enhancing features - a key one being tagged
> address spaces - which may require kernel support. Of course, this is a non-
> issue if we are implementing our user mode as one of these virtual machines,
> which is something that I would be inclined to do.
>
> The other question is that of the actual interface that software uses to
> interact with it - while one would likely make it register based (rather
> than the memory "SIEBK"), but how would you define these? This question is
> particularly relevant in provisioning it to support the addition of further
> hardware assists in future.

basic 360/mainframe had two modes ... "problem mode" (non-privilege) and
supervisor state. bunch of instructions were invalid in problem
mode.

original 360 virtual machine implementation ran virtual machine in
"problem mode" ... and all supervisor state instructions interrupted
into the hypervisor kernel ... and were simulated (according to virtual
machine "rules").

virtual machine assist started out as a special flag that basically
implemented two modes for (some) supervisor state instructions ...
"real machine" mode ... and "virtual machine" mode; virtual machine mode
basically was what the hypervisor kernel would have done in simulation.
over a period of a decade or so ... more & more supervisor state
instructions were added to those handled in virtual machine mode ...
directly by hardware (w/o requiring interrupts into the hypervisor
kernel and emulation by the software).
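
A minimal sketch of the difference between pure trap-and-simulate and an
assist flag, as a toy model. The class and function names are hypothetical,
and the split between "assisted" and "simulated" opcodes is an assumption
made for illustration, not the actual list covered by any real assist.

```python
from dataclasses import dataclass

# Illustrative opcode sets. LPSW, SSM and SIO are privileged on 360, but which
# instructions a real assist covered is NOT taken from the post; assumed here.
PRIVILEGED = {"LPSW", "SSM", "SIO"}
ASSISTED = {"LPSW", "SSM"}


@dataclass
class GuestCPU:
    assist_enabled: bool
    traps: int = 0      # interrupts into the hypervisor kernel
    assisted: int = 0   # privileged ops handled under virtual machine rules

    def execute(self, opcode: str) -> str:
        if opcode not in PRIVILEGED:
            return "native"        # problem-mode instruction, runs directly
        if self.assist_enabled and opcode in ASSISTED:
            self.assisted += 1
            return "assist"        # hardware applies the virtual machine rules
        self.traps += 1
        return "simulated"         # hypervisor kernel simulates the instruction


def run(program, assist_enabled):
    cpu = GuestCPU(assist_enabled)
    for opcode in program:
        cpu.execute(opcode)
    return cpu


program = ["LR", "LPSW", "SSM", "SIO", "AR"]
without_assist = run(program, assist_enabled=False)
with_assist = run(program, assist_enabled=True)
print(without_assist.traps, with_assist.traps, with_assist.assisted)
```

In this model every privileged instruction costs a trap without the assist;
with it, only the instructions outside the assisted subset still do.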

the original SIE implementation was on 3081 and was a "heavyweight"
instruction ... the 3081 having limited microcode storage ... so the
microcode for the SIE instruction (for entering virtual machine
mode) was actually on a disk and had to be "paged" into microcode memory
for execution.

Next generation 3090 ... had SIE instruction better integrated into the
native hardware and had significantly better performance. old email
discussing some differences between 3090 & 3081 (including SIE
implementation)
http://www.garlic.com/~lynn/2003j.html#email831118
in this post
http://www.garlic.com/~lynn/2003j.html#42 Flash 10208

In the 3090 timeframe ... one of the mainframe clone vendors introduced
"hypervisor" ... which was a subset of virtual machine capability
implemented totally in the "hardware" ... eliminating the requirement
for a separate virtual machine operating system (as long as all that was
needed was the hypervisor subset).

the response on the 3090 was "PR/SM" ... which was virtual machine
subset ... implemented in the "hardware" ... and not requiring a
separate virtual machine operating system (to partition the machine) as
long as all that was needed was the subset function. PR/SM leveraged the
SIE capability and originally SIE hardware function wasn't recursive
... aka a virtual machine operating system running in a virtual machine
... wouldn't have SIE capability being performed by the real
hardware. Since PR/SM was using SIE instruction ... a virtual machine
operating system running under PR/SM wouldn't have SIE available.

PR/SM evolved into LPARs (logical partitions) and support for running SIE
under SIE was added (since LPARs leveraged the internal SIE implementation
... it was necessary to add some recursive capability to allow a virtual
machine operating system, running in an LPAR, to use SIE).
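
A toy model of the recursion limit described above, assuming the hardware can
carry out a fixed number of nested SIE entries itself. The function name and
the `hw_sie_levels` parameter are hypothetical, and "software simulation" as
the fallback is an assumption of this sketch.

```python
def dispatch_modes(layers, hw_sie_levels):
    """Map each nested SIE entry to how it would be carried out.

    layers: names of the nested guests, outermost first.
    hw_sie_levels: number of nested SIE entries the hardware performs itself
    (hypothetical parameter for this sketch).
    """
    return {
        name: "hardware SIE" if depth < hw_sie_levels else "software simulation"
        for depth, name in enumerate(layers)
    }


layers = ["partition under PR/SM", "guest of VM OS in partition"]
non_recursive = dispatch_modes(layers, hw_sie_levels=1)
recursive = dispatch_modes(layers, hw_sie_levels=2)
print(non_recursive)
print(recursive)
```

With one hardware level, the partition consumes it and the guest of the
virtual machine operating system gets no hardware SIE; with two levels, both do.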

some PR/SM references
http://publib.boulder.ibm.com/infocenter/eserver/v1r2/topic/eicaz/eicazzlpar.htm
http://www-01.ibm.com/support/docview.wss?uid=isg209611e17c3b8d419852573f700645d4d&aid=1

some LPAR references (includes comment that the PR/SM and LPAR terms
are sometimes used interchangeably).
http://en.wikipedia.org/wiki/LPAR
http://www.ibmsystemsmag.com/mainframe/administrator/9917p1.aspx

some discussion regarding running virtual machine operating system in
LPAR and interaction between the virtual machine operating system and
PR/SM function.
http://www.vm.ibm.com/perf/tips/lparinfo.html

this mentions some limitation on SIE recursion (since SIE feature
is also being used by hardware LPAR feature/function)
http://www.vm.ibm.com/perf/tips/z890.html

--
virtualization experience starting Jan1968, online at home since Mar1970
From: Anne & Lynn Wheeler on

re:
http://www.garlic.com/~lynn/2010k.html#71 "SIE" on a RISC architecture

actually ... this email goes into some more of the SIE differences
between 3081 & 3090 (trout)
http://www.garlic.com/~lynn/2006j.html#email810630
in this post
http://www.garlic.com/~lynn/2006j.html#27 virtual memory

above email mentions VMTOOL & VM/811

after failure of future system ... there was mad rush to get
products back into the 370 product pipeline ... some past posts
mentioning future system
http://www.garlic.com/~lynn/submain.html#futuresys

in parallel with launching 370 followon effort 370/xa ... first cut at
the 370/xa (31-bit addressing & some number of other things) specification
& architecture documents were all dated nov78 ... giving rise to the
"811" reference (which was going to take 7-8 yrs for both software and
hardware and first ship with 3081 ... starting prior to nov78 document
publications).

in the aftermath of the future system failure ... the favorite son
operating system in POK managed to make the case that the virtual
machine product needed to be killed and all the people transferred to
POK to support the development of the "XA" version of that operating
system (aka mid-70s). Part of that effort was the "VMTOOL" ... which
originally was going to be internal only (providing 811 virtual machines
for internal product development).

eventually there was a group that managed to pickup the virtual machine
product mission ... but they had to reconstitute a new development
group effectively from scratch.

--
virtualization experience starting Jan1968, online at home since Mar1970
From: MitchAlsup on
On Jul 9, 2:27 pm, Owen Shepherd <owen.sheph...(a)e43.eu> wrote:
> I was reading about the IBM SIE instruction on Andy's wiki (Article at
> http://semipublic.comp-arch.net/wiki/SIE ), and a few things struck me.
>
> The first is the question of "If I'm implementing this efficient virtual
> machine system, do I really need a separate 'user mode'?". The second is
> "How can this instruction be made to fit better with the general design of a
> RISC architecture?"

To the first question, you are going to want independent TLB translation
storage for User, Supervisor, and Hypervisor translations. This lets
faulting instructions be efficiently emulated without wiping the TLB
clean. So, at least somewhere you are going to want three states. Now,
if you have three states in the TLB, why not elsewhere...
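
A minimal sketch of the point about TLB storage, comparing a TLB whose entries
are tagged with the translation regime against one that must be flushed on
every regime switch. Both classes are hypothetical toy models, not a
description of any particular hardware.

```python
from enum import Enum


class Mode(Enum):
    USER = 0
    SUPERVISOR = 1
    HYPERVISOR = 2


class TaggedTLB:
    """Entries are keyed by (mode, virtual page), so modes never collide."""

    def __init__(self):
        self.entries = {}

    def insert(self, mode, vpn, pfn):
        self.entries[(mode, vpn)] = pfn

    def lookup(self, mode, vpn):
        return self.entries.get((mode, vpn))  # None models a TLB miss


class UntaggedTLB:
    """One shared set of entries: switching mode wipes the TLB clean."""

    def __init__(self):
        self.entries = {}
        self.mode = None

    def switch(self, mode):
        if mode is not self.mode:
            self.entries.clear()
            self.mode = mode

    def insert(self, vpn, pfn):
        self.entries[vpn] = pfn

    def lookup(self, vpn):
        return self.entries.get(vpn)


# Guest touches page 0x10, traps to the hypervisor for emulation, then resumes.
tagged = TaggedTLB()
tagged.insert(Mode.USER, 0x10, 0x99)
tagged.insert(Mode.HYPERVISOR, 0x10, 0x42)   # hypervisor's own mapping
tagged_after = tagged.lookup(Mode.USER, 0x10)

untagged = UntaggedTLB()
untagged.switch(Mode.USER)
untagged.insert(0x10, 0x99)
untagged.switch(Mode.HYPERVISOR)             # emulation flushes guest entries
untagged.switch(Mode.USER)
untagged_after = untagged.lookup(0x10)

print(tagged_after, untagged_after)
```

In the tagged model the guest's translation survives the round trip through
the hypervisor; in the untagged one it has to be refilled.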

What you are trying to do with virtualization is to be able to run the
OS on hardware the OS does not necessarily understand. Thus a layer of
SW between the supervisor instructions and the hardware at hand is
required. That is, the hypervisor is attempting to present the
illusion of hardware that is not really there; right down to the
device registers of the keyboard controller (if necessary).

> An important consideration is also how to integrate the instruction with
> some typical RISC performance enhancing features - a key one being tagged
> address spaces - which may require kernel support. Of course, this is a non-
> issue if we are implementing our user mode as one of these virtual machines,
> which is something that I would be inclined to do.

The hypervisor must be allowed to give the illusion that the kernel is
in control of hardware that does not necessarily exist. If that
nonexistent HW supports tagged memory management, then the
hypervisor SW must create this illusion.

The easiest way to provide this illusion is for anything that could
potentially gain visibility into the illusion to trap to the
hypervisor. (Nanosecond scale hardware timers are hard under this
definition).

The best way to provide this illusion (a majority of the time) is to
rewrite parts of the OS so as to be Hypervisor-aware and call the
hypervisor rather than try to do the low level device accesses
directly and take traps at every third instruction.
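
A rough sketch of why the hypervisor-aware path is cheaper, counting entries
into the hypervisor for the same work done two ways. The device model (a
"DATA" register plus a "GO" register) and all names are hypothetical, chosen
only to make the count visible.

```python
class Hypervisor:
    def __init__(self):
        self.entries = 0           # transitions into the hypervisor
        self._buffer = bytearray()
        self.sent = []             # payloads delivered to the (virtual) device

    def trap_register_write(self, reg, value):
        """Emulate one trapped write to an illusory device register."""
        self.entries += 1
        if reg == "DATA":
            self._buffer.append(value)
        elif reg == "GO":
            self.sent.append(bytes(self._buffer))
            self._buffer.clear()

    def hypercall_send(self, payload):
        """Single explicit call from a hypervisor-aware OS."""
        self.entries += 1
        self.sent.append(bytes(payload))


def unmodified_driver(hv, payload):
    # Every low level device access traps and is emulated.
    for byte in payload:
        hv.trap_register_write("DATA", byte)
    hv.trap_register_write("GO", 1)


def hypervisor_aware_driver(hv, payload):
    # The OS knows it is virtualized and asks for the whole operation at once.
    hv.hypercall_send(payload)


trapping = Hypervisor()
unmodified_driver(trapping, b"ping")
aware = Hypervisor()
hypervisor_aware_driver(aware, b"ping")
print(trapping.entries, aware.entries)
```

Both drivers deliver the same payload; the unmodified one enters the
hypervisor once per register write, the aware one once per operation.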

> The other question is that of the actual interface that software uses to
> interact with it - while one would likely make it register based (rather
> than the memory "SIEBK"), but how would you define these? This question is
> particularly relevant in provisioning it to support the addition of further
> hardware assists in future.

Application SW should have knowledge of its register state, and its
memory state; and the instruction set it is allowed to execute from--
and as little else as possible. OS SW should be aware that it is being
virtualized and efficiently access the hardware through hypervisor
access methods. The hypervisor has no illusions, but provides
illusions to others.

Mitch
From: Andy Glew on
On 7/9/2010 4:48 PM, MitchAlsup wrote:
> On Jul 9, 2:27 pm, Owen Shepherd<owen.sheph...(a)e43.eu> wrote:
>> I was reading about the IBM SIE instruction on Andy's wiki (Article at
>> http://semipublic.comp-arch.net/wiki/SIE ), and a few things struck me.
>>
>> The first is the question of "If I'm implementing this efficient virtual
>> machine system, do I really need a separate 'user mode'?". The second is
>> "How can this instruction be made to fit better with the general design of a
>> RISC architecture?"
>
> To the first question, you are going to want independent TLB translation
> storage for User, Supervisor, and Hypervisor translations. This lets
> faulting instructions be efficiently emulated without wiping the TLB
> clean. So, at least somewhere you are going to want three states. Now,
> if you have three states in the TLB, why not elsewhere.....

By the way, one of the coolest things about SIE is when you realise that
it can implement the virtual machine monitor in user space.

So, not three states: User, Supervisor, Hypervisor.

But four: Guest-User, Guest-Kernel, Host-User, Host-Kernel.
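
One way to picture the four states is as two independent bits, guest/host and
user/kernel, rather than a single three-level ladder. The sketch below is
illustrative only; the names and the `may_enter_guest` rule are assumptions,
not part of any SIE definition.

```python
from enum import Enum
from itertools import product


class World(Enum):
    GUEST = "Guest"
    HOST = "Host"


class Privilege(Enum):
    USER = "User"
    KERNEL = "Kernel"


def state_name(world, privilege):
    return f"{world.value}-{privilege.value}"


# Two orthogonal bits give four states.
STATES = [state_name(w, p) for w, p in product(World, Privilege)]


def may_enter_guest(world, privilege):
    # Assumption for this sketch: guest entry is permitted from either host
    # state, so privilege is deliberately ignored. Allowing Host-User here is
    # what lets the virtual machine monitor live in user space.
    return world is World.HOST


print(STATES)
```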

Now, you can always skip the Host-User and go straight to Host-Kernel.
And I must admit that I had a lot of trouble convincing people that
user mode VMMs were a good idea.

But they are. Any OS person, any security person, should realize this.

Heck: I am renting a virtual machine as a cloud server. I sure would
like to be able to run a virtual machine inside that. But, further -
originally I was only renting shared server time - not a fully virtual
machine. Nevertheless, I would like to have been able to run a virtual
machine from my user account on the shared hosting site.