From: Avi Kivity on
On 07/27/2010 07:14 PM, Joerg Roedel wrote:
> This patch implements emulation of the SVM next_rip
> feature in the nested SVM implementation in KVM.
>
> Signed-off-by: Joerg Roedel<joerg.roedel(a)amd.com>
> ---
> arch/x86/kvm/svm.c | 8 +++++++-
> 1 files changed, 7 insertions(+), 1 deletions(-)
>
> diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
> index 7d10f2c..b44c9cc 100644
> --- a/arch/x86/kvm/svm.c
> +++ b/arch/x86/kvm/svm.c
> @@ -1919,6 +1919,7 @@ static int nested_svm_vmexit(struct vcpu_svm *svm)
> nested_vmcb->control.exit_info_2 = vmcb->control.exit_info_2;
> nested_vmcb->control.exit_int_info = vmcb->control.exit_int_info;
> nested_vmcb->control.exit_int_info_err = vmcb->control.exit_int_info_err;
> + nested_vmcb->control.next_rip = vmcb->control.next_rip;
>

Can it really be this simple? Suppose we emulate a nested guest
instruction just before the vmexit. Doesn't that invalidate
vmcb->control.next_rip? Can that happen?

--
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Roedel, Joerg on
On Tue, Jul 27, 2010 at 02:32:35PM -0400, Avi Kivity wrote:
> On 07/27/2010 07:14 PM, Joerg Roedel wrote:
> > This patch implements emulation of the SVM next_rip
> > feature in the nested SVM implementation in KVM.
> >
> > Signed-off-by: Joerg Roedel<joerg.roedel(a)amd.com>
> > ---
> > arch/x86/kvm/svm.c | 8 +++++++-
> > 1 files changed, 7 insertions(+), 1 deletions(-)
> >
> > diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
> > index 7d10f2c..b44c9cc 100644
> > --- a/arch/x86/kvm/svm.c
> > +++ b/arch/x86/kvm/svm.c
> > @@ -1919,6 +1919,7 @@ static int nested_svm_vmexit(struct vcpu_svm *svm)
> > nested_vmcb->control.exit_info_2 = vmcb->control.exit_info_2;
> > nested_vmcb->control.exit_int_info = vmcb->control.exit_int_info;
> > nested_vmcb->control.exit_int_info_err = vmcb->control.exit_int_info_err;
> > + nested_vmcb->control.next_rip = vmcb->control.next_rip;
> >
>
> Can it really be this simple? Suppose we emulate a nested guest
> instruction just before the vmexit. Doesn't that invalidate
> vmcb->control.next_rip? Can that happen?

Good point. I looked into it again. The documentation states:

The next sequential instruction pointer (nRIP) is saved in
the guest VMCB control area at location C8h on all #VMEXITs that
are due to instruction intercepts, as defined in Section 15.8 on
page 378, as well as MSR and IOIO intercepts and exceptions
caused by the INT3, INTO, and BOUND instructions. For all other
intercepts, nRIP is reset to zero.

There are a few intercepts that may need injection when running nested
immediately after an instruction emulation on the host side:

INTR, NMI
#PF, #GP, ...

None of these intercepts provides a valid next_rip on #VMEXIT, so we
should be safe here. The other way around, copying back a next_rip
pointer when there should be none, should not happen either as far as I
can see. The next_rip is only set for instruction intercepts, which are
either handled on the host side or reinjected directly into the L1
hypervisor.
If you don't see a failing case either, I think we are safe with this
simple implementation.

Joerg

--
Joerg Roedel - AMD Operating System Research Center

Advanced Micro Devices GmbH Einsteinring 24 85609 Dornach
General Managers: Alberto Bozzo, Andrew Bowd
Registration: Dornach, Landkr. Muenchen; Registerger. Muenchen, HRB Nr. 43632

From: Avi Kivity on
On 07/28/2010 12:37 PM, Roedel, Joerg wrote:
>
>> Can it really be this simple? Suppose we emulate a nested guest
>> instruction just before the vmexit. Doesn't that invalidate
>> vmcb->control.next_rip? Can that happen?
> Good point. I looked into it again. The documentation states:
>
> The next sequential instruction pointer (nRIP) is saved in
> the guest VMCB control area at location C8h on all #VMEXITs that
> are due to instruction intercepts, as defined in Section 15.8 on
> page 378, as well as MSR and IOIO intercepts and exceptions
> caused by the INT3, INTO, and BOUND instructions. For all other
> intercepts, nRIP is reset to zero.
>
> There are a few intercepts that may need injection when running nested
> immediately after an instruction emulation on the host side:
>
> INTR, NMI
> #PF, #GP, ...
>
> None of these intercepts provides a valid next_rip on #VMEXIT, so we
> should be safe here. The other way around, copying back a next_rip
> pointer when there should be none, should not happen either as far as I
> can see. The next_rip is only set for instruction intercepts, which are
> either handled on the host side or reinjected directly into the L1
> hypervisor.
> If you don't see a failing case either, I think we are safe with this
> simple implementation.

I agree, looks like everything's fine here.

We have a slightly different problem: if the nested guest manages to get
an instruction emulated by the host (for example, if the guest assigned it
the cirrus framebuffer, so from L1's point of view it is RAM, but from
L0's point of view it is emulated), then we miss the intercept. L2 could
take over L1 this way.


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Roedel, Joerg on
On Wed, Jul 28, 2010 at 06:28:06AM -0400, Avi Kivity wrote:
> We have a slightly different problem: if the nested guest manages to get
> an instruction emulated by the host (for example, if the guest assigned it
> the cirrus framebuffer, so from L1's point of view it is RAM, but from
> L0's point of view it is emulated), then we miss the intercept. L2 could
> take over L1 this way.

I wonder how this could happen. Shouldn't the shadow paging code take
care of this?

Joerg


From: Avi Kivity on
On 07/28/2010 02:25 PM, Roedel, Joerg wrote:
> On Wed, Jul 28, 2010 at 06:28:06AM -0400, Avi Kivity wrote:
>> We have a slightly different problem: if the nested guest manages to get
>> an instruction emulated by the host (for example, if the guest assigned it
>> the cirrus framebuffer, so from L1's point of view it is RAM, but from
>> L0's point of view it is emulated), then we miss the intercept. L2 could
>> take over L1 this way.
> I wonder how this could happen. Shouldn't the shadow paging code take
> care of this?
>

L1 thinks the memory is RAM, so it maps it directly and forgets about
it. L0 knows it isn't, so it leaves it unmapped and emulates any
instruction which accesses it. The emulator needs to check whether the
instruction is intercepted or not.

Note, I think if the instruction operand is in MMIO, we're safe, since
the intercept has higher priority than the memory access. But if the
instruction itself is in MMIO, or if we entered the emulator through SMP
trickery, then the emulator will execute the instruction in nested guest
context.

