i686 quirk for AMD Geode [Kernel]

From: Bernd Petrovitsch on 11 Nov 2009 06:00

Hi!

On Tue, 2009-11-10 at 21:54 +0100, Pavel Machek wrote:
[...]
> > CMOV/NOPL are rarely used, thus have no reason to cause a massive
> > performance drop, but are frequent enough (at least cmov) for almost
>
> *One* CMOV in the inner loop will make your performance go down 20x.
But it runs.
The pragmatic side is:
If people notice the performance drop, it would be good to have
something in syslog and/or dmesg and/or /proc and/or sysfs.
If people do not notice the performance drop, who cares?

Bernd
--
Firmix Software GmbH http://www.firmix.at/
mobil: +43 664 4416156 fax: +43 1 7890849-55
Embedded Linux Development and Services

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
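
[Editorial sketch, not part of the thread.] The suggestion above, leaving a trace in syslog/dmesg when emulation kicks in, could look roughly like the following user-space C model. The names `emu_class`, `emu_account` and the message text are hypothetical; an in-kernel version would more likely use a once-only or rate-limited printk and export the counters through /proc or sysfs.

```c
#include <stdio.h>

/* Hypothetical instruction classes; the names are illustrative only. */
enum emu_class { EMU_CMOV, EMU_NOPL, EMU_NCLASSES };

static const char *const emu_names[EMU_NCLASSES] = { "CMOV", "NOPL" };

/* Per-class count of emulated instructions (what /proc or sysfs could show). */
static unsigned long emu_count[EMU_NCLASSES];

/*
 * Account for one emulated instruction of class c.
 * The first hit per class writes a single warning to log, so the log is
 * not flooded when the instruction sits in an inner loop.
 * Returns 1 if this call emitted the warning, 0 otherwise.
 */
static int emu_account(enum emu_class c, FILE *log)
{
    if (emu_count[c]++ != 0)
        return 0;
    fprintf(log, "emulating %s in software; expect a slowdown\n",
            emu_names[c]);
    return 1;
}
```

A trap handler would call `emu_account(EMU_CMOV, stderr)` once per emulated instruction; the counters answer "how often", the one-time message answers "is it happening at all".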

From: H. Peter Anvin on 11 Nov 2009 11:30

On 11/11/2009 02:43 AM, Willy Tarreau wrote:
>
> well, ironically the KVM decoder can decode an infinite string
> of prefixes while the very simple and limited one in the patch
> I showed did correct checks for invalid cases (multiple segments,
> repeated locks, etc...). It would only accept one data size prefix,
> one address size prefix, one lock and one segment prefix.
>
> I have nothing against the KVM one, it's just that it's a
> full-featured emulator while we were speaking about a 686
> emulator for lower-end processors. 98% of the instructions
> supported by KVM will never be used for that purpose. This
> is where I see a waste. We're comparing 7000 lines of code
> supporting 64-bit, real mode, NX, etc... to 400. I fail to
> see how we can guarantee that we do it right in that larger
> code (and the example above proves it wrong).
>
> And as you said, NX is not an issue on the CPUs we're
> targeting.
>

RIGHT NOW. Except that, guess what, once we have emulation in the
kernel, people are going to demand new instructions to cover *new* gaps
in the instruction set. And yes, this is going to mean 64 bits and what
not.

The main reason to unify with KVM is not because KVM is doing everything
right (I am perfectly aware that it doesn't), but because I don't really
want to see a plethora of half-arsed emulators spread across the kernel,
each with its own bugs. If unified, at least there is one codebase
which can get fixed.

-hpa

--
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel. I don't speak on their behalf.
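
[Editorial sketch, not part of the thread.] The prefix rule described in the quoted text above (at most one data-size prefix, one address-size prefix, one lock and one segment prefix) can be modelled in a few lines of C. This is not the code from the patch under discussion; `prefix_group` and `scan_prefixes` are hypothetical names, and REP/REPNE prefixes are deliberately left out because the quoted description does not mention them.

```c
#include <stddef.h>
#include <stdint.h>

/* One bit per prefix group; each group may appear at most once. */
enum { PFX_DATA = 1, PFX_ADDR = 2, PFX_LOCK = 4, PFX_SEG = 8 };

/* Map a byte to its prefix group, or 0 if it is not one of the four groups. */
static int prefix_group(uint8_t b)
{
    switch (b) {
    case 0x66: return PFX_DATA;              /* operand-size override */
    case 0x67: return PFX_ADDR;              /* address-size override */
    case 0xF0: return PFX_LOCK;              /* LOCK */
    case 0x26: case 0x2E: case 0x36:         /* ES, CS, SS */
    case 0x3E: case 0x64: case 0x65:         /* DS, FS, GS */
        return PFX_SEG;
    default:
        return 0;
    }
}

/*
 * Scan the leading prefix bytes of an instruction.
 * Returns the number of prefix bytes consumed (0..4), or -1 if a prefix
 * group repeats or the buffer ends before any opcode byte is reached.
 * *seen receives the bitmask of groups encountered so far.
 */
static int scan_prefixes(const uint8_t *code, size_t len, int *seen)
{
    size_t i;

    *seen = 0;
    for (i = 0; i < len; i++) {
        int g = prefix_group(code[i]);

        if (g == 0)
            return (int)i;      /* first non-prefix byte: opcode starts here */
        if (*seen & g)
            return -1;          /* repeated group, e.g. two segment prefixes */
        *seen |= g;
    }
    return -1;                  /* only prefixes, no opcode */
}
```

Because every group can occur only once, the loop is bounded at four prefix bytes, which is the property that rules out the "infinite string of prefixes" case.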

From: H. Peter Anvin on 11 Nov 2009 20:10

On 11/11/2009 04:51 PM, Daniel Pittman wrote:
>>
>> Consider SSE3, for example. Why should the same concept not apply to
>> SSE3 instructions as to CMOV?
>
> FWIW, the issue of the binary-only flashplayer.so came up later in the thread,
> but to add my few cents:
>
> When flash 10 was released the binary only 64-bit version generated
> instructions from the LAHF set unconditionally, in part because Windows chose
> to emulate those on the very few x86-64 platforms that didn't do them in
> hardware.
>
> At that time it would have been very nice from a "user support" point of view
> to be able to add LAHF emulation to support the software. Yes, it is ugly,
> binary-only code, but it is reasonably popular...
>
> ...in the end, in fact, popular enough to have at least a couple of people
> I know purchase a new CPU that did implement it, just for flash on Linux.

The main use case for emulation is indeed to support binary-only or
otherwise precompiled software that exposes holes in the instruction
set. As such, emulation can also be used to "raise the baseline", which
can be a highly desirable thing to do.

My point in all of this is that this is not a static problem, and that
if we're going to do emulation we need to consider the requirements
going forward. I would *prefer* to have only one interpreter to deal
with when it's broken, and I certainly trust Avi & co to do the right
thing, but I'm certainly willing to entertain technical reasons why it
is not the right thing to do -- *not just now but in the future*. The
latter is an absolutely critical constraint, though.

Once we have a general enough interpreter framework, we can add new
instructions as needed; it should make it a lot easier to phase in new
instructions while not breaking old legacy machines.

-hpa

From: Daniel Pittman on 11 Nov 2009 20:10

"H. Peter Anvin" <hpa(a)zytor.com> writes:
> On 11/10/2009 09:24 AM, Alan Cox wrote:
>>>
>>> In the short term, yes, of course. However, if we're going to do
>>> emulation, we might as well do it right.
>>
>> Why is using KVM doing it right? It sounds like it's doing it slowly,
>> and hideously memory inefficiently. You are solving an uninteresting
>> general case problem when you just need two tiny fixups (or perhaps 3 if
>> you want to fix up early x86-64 prefetch)
>
> Why do we only need "two tiny fixups"? Where do we draw the line in
> terms of ISA compatibility? One could easily argue that the Right
> Thing[TM] is to be able to process any optional instruction -- otherwise
> one has a very difficult place to draw a line.
>
> Consider SSE3, for example. Why should the same concept not apply to
> SSE3 instructions as to CMOV?

FWIW, the issue of the binary-only flashplayer.so came up later in the thread,
but to add my few cents:

When flash 10 was released the binary only 64-bit version generated
instructions from the LAHF set unconditionally, in part because Windows chose
to emulate those on the very few x86-64 platforms that didn't do them in
hardware.

At that time it would have been very nice from a "user support" point of view
to be able to add LAHF emulation to support the software. Yes, it is ugly,
binary-only code, but it is reasonably popular...

Daniel

...in the end, in fact, popular enough to have at least a couple of people
I know purchase a new CPU that did implement it, just for flash on Linux.
--
✣ Daniel Pittman ✉ daniel(a)rimspace.net ☎ +61 401 155 707
♽ made with 100 percent post-consumer electrons
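
[Editorial sketch, not part of the thread.] For readers wondering what "LAHF emulation" would involve: LAHF copies the low status flags into AH as SF:ZF:0:AF:0:PF:1:CF. A minimal C model of that register update follows; `emulate_lahf` and `LAHF_FLAG_MASK` are hypothetical names, and the trap handling, instruction decoding and instruction-pointer update that a real emulator needs are not shown.

```c
#include <stdint.h>

/* Bits of the low flags byte that LAHF copies: SF, ZF, AF, PF, CF. */
#define LAHF_FLAG_MASK 0xD5u

/*
 * Return the value RAX would hold after LAHF, given the current RAX and
 * RFLAGS. Only AH (bits 15:8 of RAX) changes. Bit 1 of the result is
 * always set and bits 3 and 5 are always clear.
 */
static uint64_t emulate_lahf(uint64_t rax, uint64_t rflags)
{
    uint64_t ah = (rflags & LAHF_FLAG_MASK) | 0x02u;

    return (rax & ~(uint64_t)0xFF00) | (ah << 8);
}
```

On Linux, whether the hardware supports LAHF/SAHF in 64-bit mode is reported as the `lahf_lm` flag in /proc/cpuinfo, so an emulation path would only be needed where that flag is absent.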

From: Matt Thrailkill on 11 Nov 2009 21:30

On Wed, Nov 11, 2009 at 1:32 AM, Willy Tarreau <w(a)1wt.eu> wrote:
> All I can say is that executing a NOP results in no state change in
> the processor except the instruction pointer which points to the
> next instruction after execution. Since a NOP changes nothing, it
> cannot be used alone to provide any privilege, access to data or
> any such thing. Since it does not perform any jump, it cannot either
> be used to take back control of the execution flow. And it is certain
> that the next instruction after it will be executed, so if the NOP
> crosses a page boundary and completes on a non-executable one, the
> next instruction will trigger the PF.
>
> So I can't see how a NOP can be used to circumvent any protection.

So a nop(l) sled won't be a problem, right?
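
[Editorial sketch, not part of the thread.] The argument quoted above is that emulating a NOP changes nothing except the instruction pointer. Under that reading, an emulator only has to work out how long the NOPL (0F 1F /0) encoding is and step over it; a sled of them simply advances one instruction at a time until the next real instruction is fetched and checked as usual. The C model below assumes 32-bit addressing, no address-size prefix, and that any other prefixes were already consumed by the caller; `nopl_length` and `emulate_nopl` are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Length in bytes of the NOPL (0F 1F /0) starting at code[0], or 0 if the
 * bytes are not a complete NOPL. Only the documented /0 form is accepted.
 */
static size_t nopl_length(const uint8_t *code, size_t len)
{
    size_t n = 3;                       /* 0F 1F + ModRM */
    uint8_t modrm, mod, rm;

    if (len < 3 || code[0] != 0x0F || code[1] != 0x1F)
        return 0;
    modrm = code[2];
    mod = modrm >> 6;
    rm = modrm & 7;
    if (((modrm >> 3) & 7) != 0)
        return 0;                       /* reg field must be /0 */

    if (mod != 3 && rm == 4) {          /* SIB byte follows ModRM */
        if (len < 4)
            return 0;
        if (mod == 0 && (code[3] & 7) == 5)
            n += 4;                     /* SIB without base: disp32 */
        n += 1;
    } else if (mod == 0 && rm == 5) {
        n += 4;                         /* disp32 with no base register */
    }
    if (mod == 1)
        n += 1;                         /* disp8 */
    else if (mod == 2)
        n += 4;                         /* disp32 */

    return n <= len ? n : 0;            /* reject truncated encodings */
}

/*
 * "Execute" a NOPL: the only state change is the instruction pointer.
 * If the bytes are not a NOPL, eip is returned unchanged.
 */
static uint32_t emulate_nopl(uint32_t eip, const uint8_t *code, size_t len)
{
    return eip + (uint32_t)nopl_length(code, len);
}
```

Note that this sketch never reads through the memory operand, which matches the point that the instruction neither loads data nor transfers control.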