From: Milton Miller on
On Mon Jul 19 2010 at about 03:36:51 EST, Alexander Graf wrote:
> On 19.07.2010, at 03:11, Benjamin Herrenschmidt wrote:
>
> > On Thu, 2010-07-15 at 17:05 +0530, Subrata Modak wrote:
> > > commit e62cee42e66dcca83aae02748535f62e0f564a0c solved the problem for
> > > 2.6.34-rc6. However some other bad relocation warnings generated against
> > > 2.6.35-rc5 on Power7/ppc64 below:
> > >
> > > MODPOST 2004 modules
> > > WARNING: 2 bad relocations
> > > c000000000008590 R_PPC64_ADDR32 .text+0x4000000000008460
> > > c000000000008594 R_PPC64_ADDR32 .text+0x4000000000008598
> >
> > I think this is KVM + CONFIG_RELOCATABLE. Caused by:
> >
> > .global kvmppc_trampoline_lowmem
> > kvmppc_trampoline_lowmem:
> > .long kvmppc_handler_lowmem_trampoline - CONFIG_KERNEL_START
> >
> > .global kvmppc_trampoline_enter
> > kvmppc_trampoline_enter:
> > .long kvmppc_handler_trampoline_enter - CONFIG_KERNEL_START
> >
> > Alex, can you turn these into 64-bit on ppc64 so the relocator
> > can grok them ?
>
> If I turn them into 64-bit, will the values be > RMA? In that case
> things would break anyways. How does relocation work on PPC? Are the
> first few megs copied over to low memory? Would I have to mask anything
> in the above code to make sure I use the real values?
>
> Alex
>

You can still do the subtraction, but you have to allocate 64 bits for
storage. Relocatable ppc64 kernels work by adjusting R_PPC64_RELATIVE
entries during early boot (relocate in reloc_64.S, called from head_64.S).

The code purposely only supports 64 bit relative addressing.
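
To make the point about storage size concrete, here is a minimal C sketch of
a relocator that only understands 64-bit relative entries. It is a simplified
model, not the kernel's reloc_64.S: the struct name rela64, the function
apply_relative_relocs and the load offset 0x2000000 are assumptions made up
for illustration. Only the two relocation type numbers come from the ppc64
ELF ABI.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define R_PPC64_ADDR32    1   /* 32-bit absolute slot: not handled below */
#define R_PPC64_RELATIVE 22   /* 64-bit slot: value = load offset + addend */

/* Hypothetical stand-in with the same field layout as Elf64_Rela. */
struct rela64 {
    uint64_t r_offset;   /* where in the image the slot lives */
    uint64_t r_info;     /* low 32 bits hold the relocation type */
    int64_t  r_addend;   /* link-time value of the slot */
};

/*
 * Patch every 64-bit relative slot with load_offset + addend.
 * Entries of any other type are left untouched and counted, which is
 * the situation the "bad relocations" warning is pointing at.
 */
static int apply_relative_relocs(uint8_t *image, const struct rela64 *r,
                                 size_t n, uint64_t load_offset)
{
    int skipped = 0;

    for (size_t i = 0; i < n; i++) {
        if ((r[i].r_info & 0xffffffffu) != R_PPC64_RELATIVE) {
            skipped++;
            continue;
        }
        uint64_t value = load_offset + (uint64_t)r[i].r_addend;
        memcpy(image + r[i].r_offset, &value, sizeof value);
    }
    return skipped;
}

/* One .quad-style slot gets fixed up, one .long-style slot is skipped. */
static void reloc_demo(void)
{
    uint8_t image[16] = {0};
    const struct rela64 relocs[] = {
        { 0, R_PPC64_RELATIVE, 0x8460 },
        { 8, R_PPC64_ADDR32,   0x8598 },
    };
    uint64_t slot;
    int skipped = apply_relative_relocs(image, relocs, 2, 0x2000000);

    memcpy(&slot, image, sizeof slot);
    assert(slot == 0x2008460);   /* 0x2000000 + 0x8460 */
    assert(skipped == 1);        /* the 32-bit entry was not processed */
}
```

In this model the 32-bit entry simply keeps its link-time value, so anything
that reads it after the kernel has moved sees a stale address.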

milton
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Milton Miller on
I wrote:
> On Mon Jul 19 2010 at about 03:36:51 EST, Alexander Graf wrote:
> > On 19.07.2010, at 03:11, Benjamin Herrenschmidt wrote:
> >
> > > On Thu, 2010-07-15 at 17:05 +0530, Subrata Modak wrote:
> > > > commit e62cee42e66dcca83aae02748535f62e0f564a0c solved the problem for
> > > > 2.6.34-rc6. However some other bad relocation warnings generated against
> > > > 2.6.35-rc5 on Power7/ppc64 below:
> > > >
> > > > MODPOST 2004 modules
> > > > WARNING: 2 bad relocations
> > > > c000000000008590 R_PPC64_ADDR32 .text+0x4000000000008460
> > > > c000000000008594 R_PPC64_ADDR32 .text+0x4000000000008598
> > >
> > > I think this is KVM + CONFIG_RELOCATABLE. Caused by:
> > >
> > > .global kvmppc_trampoline_lowmem
> > > kvmppc_trampoline_lowmem:
> > > .long kvmppc_handler_lowmem_trampoline - CONFIG_KERNEL_START
> > >
> > > .global kvmppc_trampoline_enter
> > > kvmppc_trampoline_enter:
> > > .long kvmppc_handler_trampoline_enter - CONFIG_KERNEL_START
> > >
> > > Alex, can you turn these into 64-bit on ppc64 so the relocator
> > > can grok them ?
> >
> > If I turn them into 64-bit, will the values be > RMA? In that case
> > things would break anyways. How does relocation work on PPC? Are the
> > first few megs copied over to low memory? Would I have to mask anything
> > in the above code to make sure I use the real values?
> >
> > Alex
> >
>
> You can still do the subtraction, but you have to allocate 64 bits for
> storage. Relocatable ppc64 kernels work by adjusting R_PPC64_RELATIVE
> entries during early boot (relocate in reloc_64.S, called from head_64.S).
>
> The code purposely only supports 64 bit relative addressing.

Oh yeah, and for book-3s, the code copies from 0x100 to __end_interrupts
in arch/powerpc/kernel/exceptions-64s.S down to real 0, but the rest
of the kernel is at some disjoint address. The interrupt will go to
the copy at real zero. Any references to code outside that region
must be done via a full indirect branch (not a relative one), similar to
the secondary startup (via following the function pointer in a descriptor
set in very low memory), or syscall entry and exception vectors via the paca.

Book-3e (64- and 32-bit) is different. I forget how classic 32-bit works;
it may still have a < 32MB limitation.
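
A small C model of the book-3s case may help show why a relative branch out
of the copied region goes wrong while an indirect one does not. This is only
arithmetic on addresses, under stated assumptions: the struct layout, the
function names, and the example offsets 0x300, 0x8460 and 0x2000000 are all
hypothetical and not taken from the kernel source.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical description of a relocated kernel. */
struct layout {
    uint64_t load_offset;  /* where the bulk of the kernel really runs */
    uint64_t vector_off;   /* link-time offset of a vector below __end_interrupts */
    uint64_t handler_off;  /* link-time offset of a handler outside that region */
};

/* Where the handler actually lives once the kernel has been relocated. */
static uint64_t handler_addr(const struct layout *l)
{
    return l->load_offset + l->handler_off;
}

/*
 * Relative branch: the displacement is fixed at link time, but the
 * instruction executes from the copy at real vector_off, so the target
 * is computed from the wrong place.
 */
static uint64_t relative_branch_target(const struct layout *l)
{
    int64_t disp = (int64_t)(l->handler_off - l->vector_off);

    return l->vector_off + (uint64_t)disp;
}

/*
 * Indirect branch: the full 64-bit address sits in a slot that the
 * early-boot relocator has already adjusted by load_offset.
 */
static uint64_t indirect_branch_target(const struct layout *l)
{
    uint64_t slot = l->handler_off;  /* link-time content of the slot */

    slot += l->load_offset;          /* fix-up applied at early boot */
    return slot;
}

static void branch_demo(void)
{
    const struct layout l = { 0x2000000, 0x300, 0x8460 };

    assert(handler_addr(&l) == 0x2008460);
    assert(relative_branch_target(&l) == 0x8460);  /* misses the handler */
    assert(indirect_branch_target(&l) == handler_addr(&l));
}
```

With a load offset of zero both branches agree, which is why the problem
only shows up when the kernel is actually relocated.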

milton
From: Alexander Graf on

On 20.07.2010, at 09:27, Milton Miller wrote:

> On Mon, 19 Jul 2010 about 14:00:56 +0200, Alexander Graf wrote:
>> Milton Miller wrote:
>>> I wrote:
>>>
>>> Oh yeah, and for book-3s, the code copies from 0x100 to __end_interrupts
>>> in arch/powerpc/kernel/exceptions-64s.S down to real 0, but the rest
>>> of the kernel is at some disjoint address. The interrupt will go to
>>> the copy at real zero. Any references to code outside that region
>>> must be done via a full indirect branch (not a relative one), similar to
>>> the secondary startup (via following the function pointer in a descriptor
>>> set in very low memory), or syscall entry and exception vectors via the paca.
>>>
>>
>> That would still break on normal PPC boxes, as any address accessed in
>> real mode has to be inside the RMA. And the #include for
>> kvm/book3s_rmhandlers.S happens after __end_interrupts. So I'd end up
>> with code that gets executed outside of the RMA after a relocation, right?
>>
>> Alex
>>
>
> Whether it's outside of the RMA or not, DO_KVM is creating a branch outside
> of the code copied to lowmem.
>
> This is BROKEN.
>
> We have a hard limit that we can't extend __end_interrupts past 0x7000, and
> a soft limit that we can't exceed 0x6000. If there is space, we could
> move the real mode handler extensions inside __end_interrupts in
> exceptions-64s.S, and store the full address in a .quad so it gets
> relocated properly. Don't subtract the start; we have designed the kernel
> to run with start at a VA that can be used as an EA in real mode.

Moving everything to exceptions-64s.S sounds like the best thing to do. All the code in real mode really is there so it stays inside the RMA. I don't think we can guarantee that for any code that is not copied, right?
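
Whether the real mode handlers fit comes down to the two limits Milton gives
above. As a rough sketch in C (the enum, macro and function names are
hypothetical; only the values 0x6000 and 0x7000 come from this thread), the
check could look like this:

```c
#include <assert.h>

#define END_INTERRUPTS_SOFT_LIMIT 0x6000UL  /* should not be exceeded */
#define END_INTERRUPTS_HARD_LIMIT 0x7000UL  /* must not be exceeded */

enum lowmem_fit {
    LOWMEM_OK,
    LOWMEM_OVER_SOFT_LIMIT,
    LOWMEM_OVER_HARD_LIMIT
};

/* Classify a proposed __end_interrupts offset against both limits. */
static enum lowmem_fit check_end_interrupts(unsigned long end_interrupts)
{
    if (end_interrupts > END_INTERRUPTS_HARD_LIMIT)
        return LOWMEM_OVER_HARD_LIMIT;
    if (end_interrupts > END_INTERRUPTS_SOFT_LIMIT)
        return LOWMEM_OVER_SOFT_LIMIT;
    return LOWMEM_OK;
}

static void limit_demo(void)
{
    assert(check_end_interrupts(0x5000) == LOWMEM_OK);
    assert(check_end_interrupts(0x6800) == LOWMEM_OVER_SOFT_LIMIT);
    assert(check_end_interrupts(0x7100) == LOWMEM_OVER_HARD_LIMIT);
}
```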

> Otherwise we need to mark KVM_BOOK3S_64 depends on (!RELOCATABLE ||
> BROKEN) for 2.6.35 until we get fixes.

Well - it's only broken when really getting relocated. But I agree, the current state doesn't cope with Linux's relocation logic.

> I took a read through the book3s code as of 2.6.34. A few things I noticed:
>
> (1) The code is using slb large to control the segment size. It should
> be using the SLB B field (or just implement 256M segments only).

I'm not sure I understand this part? We only use 256MB segments for now.

> (2) It appears that the mtspr and mfspr code is using the same storage for
> bats 4-7 as 0-3 ... I would have expected a "4 +" in a few places.

Yes, that one is fixed in more recent versions already.

> (3) It's not clear to me that you clear RI when transitioning to the guest,
> but it's obviously required because you place state in srr0 & srr1.

Uh - do I have to clear RI? I'm not prepared to take an interrupt anyways and RI is just a soft flag for Linux's handlers, right?

> (4) I don't understand why __kvmppc_vcpu_run turns on interrupts so that
> __kvmppc_vcpu_entry can turn them back off. Something to do with
> irq trace annotations?

__kvmppc_vcpu_run turns on soft interrupts while __kvmppc_vcpu_entry turns them off in MSR. This is so that when enabling interrupts again on guest exit, we have the soft enable bit set.
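
The ordering can be modelled in a few lines of C. This is a toy model under
stated assumptions, not the actual KVM entry code: the struct irq_state and
the three step functions are hypothetical names, and the two booleans only
stand in for the soft-enable flag and the MSR enable bit.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model: one software flag plus one hardware bit. */
struct irq_state {
    bool soft_enabled;  /* stands in for the soft-enable flag */
    bool msr_ee;        /* stands in for the interrupt enable bit in MSR */
};

/* Step done before entry: set the soft-enable flag. */
static void vcpu_run_step(struct irq_state *s)   { s->soft_enabled = true; }

/* Step done at entry: turn interrupts off in MSR for the switch. */
static void vcpu_entry_step(struct irq_state *s) { s->msr_ee = false; }

/* Step done on guest exit: turn interrupts back on in MSR. */
static void guest_exit_step(struct irq_state *s) { s->msr_ee = true; }

/* Interrupts are only really on when both levels agree. */
static bool irqs_fully_enabled(const struct irq_state *s)
{
    return s->soft_enabled && s->msr_ee;
}

static void irq_demo(void)
{
    struct irq_state s = { false, true };

    vcpu_run_step(&s);
    vcpu_entry_step(&s);
    assert(!irqs_fully_enabled(&s));  /* hard-disabled while switching */

    guest_exit_step(&s);
    assert(irqs_fully_enabled(&s));   /* soft flag was already set */
}
```

Without the first step, the exit path would turn the MSR bit back on while
the soft flag still said "disabled".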


Alex
