i686 quirk for AMD Geode [Kernel]

From: H. Peter Anvin on 10 Nov 2009 17:30

On 11/10/2009 02:06 PM, Willy Tarreau wrote:
> On Tue, Nov 10, 2009 at 01:19:30PM -0800, H. Peter Anvin wrote:
>> Willy, perhaps you can come up with a list of features you think should
>> be emulated, together with an explanation of why you opted for that list
>> of features and *did not* opt for others.
>
> Well, the instructions I had to emulate were the result of failures
> to run standard distros on older machines. When I ran a 486 distro
> on my old 386, I found that almost everything worked except a few
> programs making use of BSWAP for htonl(), and a small group of other
> ones making occasional use of CMPXCHG for mutex handling. So I checked
> the differences between 386 and 486 and found that the last remaining
> one was XADD which I did not find in my binaries but which was really
> obvious to implement, so it made sense to complete the emulator. That
> said, a feature was missing with CMPXCHG. It was generally used with
> a LOCK prefix which could not be emulated. In practice, that wasn't
> an issue since I did not have any SMP i386 and I think we might only
> find them on some very specific industrial boards if any.
>

Linux doesn't support 386 SMP anyway, and we really don't support 486
SMP either since no one relevant seems to have a board we can test on.
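
For reference, the data semantics of the three 486 instructions discussed
above can be sketched in plain C. This is only an illustration under stated
assumptions, not Willy's patch: the function names are hypothetical, only
the 32-bit register forms are modelled, and of the flags only ZF for CMPXCHG
is reported (the real instructions also update the other arithmetic flags).
Nothing here is atomic, which matches the point that a LOCK prefix cannot be
honoured by this kind of emulation.

```c
#include <stdbool.h>
#include <stdint.h>

/* BSWAP r32: reverse the byte order of a 32-bit value (what htonl() needs
 * on a little-endian machine). */
static uint32_t emu_bswap32(uint32_t v)
{
    return (v >> 24) | ((v >> 8) & 0x0000ff00u) |
           ((v << 8) & 0x00ff0000u) | (v << 24);
}

/* CMPXCHG r/m32, r32: if *dest equals *eax, store src into *dest and
 * report ZF=1; otherwise load *dest into *eax and report ZF=0.
 * The return value is the resulting ZF. Not atomic. */
static bool emu_cmpxchg32(uint32_t *dest, uint32_t *eax, uint32_t src)
{
    if (*dest == *eax) {
        *dest = src;
        return true;
    }
    *eax = *dest;
    return false;
}

/* XADD r/m32, r32: *dest receives the sum, *src receives the old *dest.
 * Not atomic. */
static void emu_xadd32(uint32_t *dest, uint32_t *src)
{
    uint32_t old = *dest;

    *dest = old + *src;
    *src = old;
}
```

On a uniprocessor machine the lack of atomicity is harmless only as long as
the emulation cannot be interleaved with another access to the same memory,
which is an assumption of this sketch.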

> But what I can say is that after emulating those instructions, I
> never got any illegal instruction anymore on my systems. Here
> Matteo reports an issue with NOPL, which might have been introduced
> with newer compilers. So if we get NOPL+CMOV, I think that every
> CPU starting from 486 will be able to execute all the applications
> I have been running on those machines. We can add the 486 ones if
> we think it's worth it.

NOPL was introduced with a recent binutils(!) change. They might have
backed that one out.
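
Since NOPL has no effect beyond advancing the instruction pointer, emulating
it comes down to working out its length. The sketch below is a hedged
illustration, not code from any patch in this thread: nopl_length() is a
hypothetical helper that recognises the 0F 1F /0 encoding with an optional
0x66 operand-size prefix, and it assumes 32-bit addressing (no 0x67 prefix,
no segment overrides).

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Return the length in bytes of a multi-byte NOP (0F 1F /0) found at
 * code[0..len), or 0 if the bytes are not such an instruction or the
 * buffer is too short. Assumes 32-bit addressing mode.
 */
static size_t nopl_length(const uint8_t *code, size_t len)
{
    size_t i = 0;

    if (i < len && code[i] == 0x66)     /* optional operand-size prefix */
        i++;
    if (len - i < 3 || code[i] != 0x0f || code[i + 1] != 0x1f)
        return 0;
    i += 2;

    uint8_t modrm = code[i++];
    uint8_t mod = modrm >> 6;
    uint8_t reg = (modrm >> 3) & 7;
    uint8_t rm = modrm & 7;

    if (reg != 0)                       /* only the /0 form is accepted */
        return 0;

    if (mod != 3) {
        if (rm == 4) {                  /* a SIB byte follows */
            if (i >= len)
                return 0;
            uint8_t sib = code[i++];
            if (mod == 0 && (sib & 7) == 5)
                i += 4;                 /* disp32 with no base register */
        } else if (mod == 0 && rm == 5) {
            i += 4;                     /* disp32 only */
        }
        if (mod == 1)
            i += 1;                     /* disp8 */
        else if (mod == 2)
            i += 4;                     /* disp32 */
    }

    /* Reject anything that runs past the buffer or the 15-byte limit. */
    if (i > len || i > 15)
        return 0;
    return i;
}
```

An emulator built on this would add the returned length to the saved
instruction pointer and resume, and deliver SIGILL as before when the
function returns 0.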

> Once again, I have no argument against emulating more instructions.
> It's just that I never needed them, and I fear that doing so might
> render the code a lot more complex and slower. Maybe time will prove
> me wrong and I will have no problem with that. We can re-open this
> thread after the first report of a SIGILL with the patch applied.
>
> So in my opinion, we should have :
> - CMOV (for 486, Pentium, C3, K6, ...)
> - NOPL (newcomer)
>
> And if we want to extend down to i386 :
> - BSWAP (=htonl)
> - CMPXCHG (mutex)
> - XADD (never encountered but cheap)
>
> I still have the 2.4 patch for BSWAP, CMPXCHG, CMOV and XADD lying
> around. I'm appending it to the end of this mail in case it can fuel
> the discussion. I've not ported it to 2.6 yet simply because my old
> systems are still on 2.4, but volunteers are welcome :-)
>
>> Note: emulated FPU is a special subcase. The FPU operations are
>> heavyweight enough that the overhead of trapping versus library calls is
>> relatively insignificant.
>
> Agreed for most of them, though some cheap ones such as FADD can
> see a huge difference. In fact it's mostly that it's been common
> for a long time to see slow software FPU (till 386 & 486-SX), so
> it's been avoided for a long time.
>
> Regards,
> Willy
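
For the CMOV entry in the list above, the heart of an emulator is evaluating
the condition encoded in the low four bits of the opcode (0F 40+cc) against
EFLAGS. The following is a minimal sketch of that step only; the function
names are hypothetical and it is not taken from the 2.4 patch mentioned in
the mail. Decoding the operands and writing back the register are left out.

```c
#include <stdbool.h>
#include <stdint.h>

/* EFLAGS bits consulted by the sixteen x86 condition codes. */
#define X86_FLAG_CF 0x0001u
#define X86_FLAG_PF 0x0004u
#define X86_FLAG_ZF 0x0040u
#define X86_FLAG_SF 0x0080u
#define X86_FLAG_OF 0x0800u

/* Evaluate condition code cc (0..15) against eflags. Bits 3..1 of cc pick
 * the base test; bit 0 inverts it. */
static bool x86_cond_true(unsigned cc, uint32_t eflags)
{
    bool cf = (eflags & X86_FLAG_CF) != 0;
    bool pf = (eflags & X86_FLAG_PF) != 0;
    bool zf = (eflags & X86_FLAG_ZF) != 0;
    bool sf = (eflags & X86_FLAG_SF) != 0;
    bool of = (eflags & X86_FLAG_OF) != 0;
    bool r;

    switch ((cc >> 1) & 7) {
    case 0:  r = of;                break;  /* O  / NO */
    case 1:  r = cf;                break;  /* B  / AE */
    case 2:  r = zf;                break;  /* E  / NE */
    case 3:  r = cf || zf;          break;  /* BE / A  */
    case 4:  r = sf;                break;  /* S  / NS */
    case 5:  r = pf;                break;  /* P  / NP */
    case 6:  r = sf != of;          break;  /* L  / GE */
    default: r = zf || (sf != of);  break;  /* LE / G  */
    }
    return (cc & 1) ? !r : r;
}

/* CMOVcc r32, r/m32: return the new destination value, which is src when
 * the condition holds and the unchanged dest otherwise. */
static uint32_t emu_cmov32(unsigned cc, uint32_t eflags,
                           uint32_t dest, uint32_t src)
{
    return x86_cond_true(cc, eflags) ? src : dest;
}
```

CMOV leaves EFLAGS untouched, so the sketch only reads the flags and never
writes them.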

I immediately note that you have absolutely no check on the code
segment, either in terms of code segment limits or even that we're in
the right mode. Furthermore, you read user space -- code in user space
is still user space -- without get_user(). We also need NX protection
to be honoured, and the various special subtleties of the x86
instruction format (15-byte limit, for example) to be preserved: they
aren't just there randomly, but are there to protect against specific
failures.
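
The checks listed above can be modelled in user space roughly as follows.
This is a sketch under loud assumptions: struct fetch_ctx, model_get_user()
and fetch_insn_byte() are hypothetical names, and model_get_user() merely
stands in for the kernel's get_user() by failing instead of faulting. The
mode check and a real NX lookup are not modelled; the buffer is assumed to
cover exactly the bytes that are mapped and executable.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_INSN_BYTES 15   /* architectural x86 instruction length limit */

/* Hypothetical model of the faulting context; not a kernel structure. */
struct fetch_ctx {
    const uint8_t *user_mem;    /* stand-in for the user address space */
    size_t user_len;            /* bytes readable and executable in the model */
    uint32_t cs_limit;          /* highest valid offset in the code segment */
    uint32_t ip;                /* offset of the faulting instruction */
    unsigned fetched;           /* instruction bytes consumed so far */
};

/* Stand-in for get_user(): read one byte, reporting failure rather than
 * touching memory outside the modelled mapping. */
static bool model_get_user(const struct fetch_ctx *c, uint32_t off,
                           uint8_t *out)
{
    if (off >= c->user_len)
        return false;
    *out = c->user_mem[off];
    return true;
}

/* Fetch the next instruction byte, enforcing the 15-byte limit, the code
 * segment limit, and a checked user-space read. */
static bool fetch_insn_byte(struct fetch_ctx *c, uint8_t *out)
{
    if (c->fetched >= MAX_INSN_BYTES)
        return false;                   /* instruction would be too long */

    uint32_t off = c->ip + c->fetched;

    if (off < c->ip || off > c->cs_limit)
        return false;                   /* wrapped or past the segment limit */
    if (!model_get_user(c, off, out))
        return false;                   /* unreadable user memory */

    c->fetched++;
    return true;
}
```

A decoder that pulls every byte through one such function cannot
accidentally skip a check, which is the practical argument for keeping a
single instruction interpreter.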

*THIS* is the kind of complexity that makes me think that having a
single source for all interpretation done in the kernel is the preferred
option.

-hpa
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/

From: Ingo Molnar on 10 Nov 2009 17:30

* H. Peter Anvin <hpa(a)zytor.com> wrote:

> *THIS* is the kind of complexity that makes me think that having a
> single source for all interpretation done in the kernel is the
> preferred option.

Definitely agreed ... The NX code is quite a maze right now, so changes
to it should come generously laced with cleanups.

Ingo

From: Lennart Sorensen on 10 Nov 2009 17:30

On Tue, Nov 10, 2009 at 01:19:30PM -0800, H. Peter Anvin wrote:
> Willy, perhaps you can come up with a list of features you think should
> be emulated, together with an explanation of why you opted for that list
> of features and *did not* opt for others.
>
> Note: emulated FPU is a special subcase. The FPU operations are
> heavyweight enough that the overhead of trapping versus library calls is
> relatively insignificant.

That doesn't seem to be the experience of the arm EABI versus the old
arm ABI with kernel FPU emulation. Using user space library calls for
FPU is vastly faster than the trapping and kernel FPU emulation.

--
Len Sorensen

From: H. Peter Anvin on 10 Nov 2009 17:40

On 11/10/2009 02:27 PM, Lennart Sorensen wrote:
> On Tue, Nov 10, 2009 at 01:19:30PM -0800, H. Peter Anvin wrote:
>> Willy, perhaps you can come up with a list of features you think should
>> be emulated, together with an explanation of why you opted for that list
>> of features and *did not* opt for others.
>>
>> Note: emulated FPU is a special subcase. The FPU operations are
>> heavyweight enough that the overhead of trapping versus library calls is
>> relatively insignificant.
>
> That doesn't seem to be the experience of the arm EABI versus the old
> arm ABI with kernel FPU emulation. Using user space library calls for
> FPU is vastly faster than the trapping and kernel FPU emulation.

I don't believe we were talking about ARM.

-hpa


From: Lennart Sorensen on 10 Nov 2009 17:40

On Tue, Nov 10, 2009 at 02:29:33PM -0800, H. Peter Anvin wrote:
> I don't believe we were talking about ARM.

True. I do get the impression that ARM has higher trap overhead than x86.

--
Len Sorensen