From: Zachary Amsden on
On 07/14/2010 10:40 AM, Jeremy Fitzhardinge wrote:
> On 07/14/2010 01:16 PM, Avi Kivity wrote:
>
>> On 07/14/2010 08:57 PM, Jeremy Fitzhardinge wrote:
>>
>>> Anything else?
>>>
>> 1. set up a mapping
>> 2. invlpg or set cr3
>> 3. use the mapping
>>
>> Moving the invlpg will break your code.
>>
> invlpg uses memory clobbers. All the crX ops seem to use a
> __force_order variable to sequence them - but it looks like it's done
> precisely backwards and it's barking mad to do allow write_crX to be
> reordered with respect to memory ops.
>
> Hm, looks like glommer added it surreptitiously while unifying
> system_32.h and system_64.h (system_32.h relied on asm volatile not
> being reordered; system_64.h used memory clobbers).
> J
>

clts() has no memory clobber; it is used to serialize execution of code
within kernel_fpu_begin() / kernel_fpu_end() blocks.

If the code within is reordered before the clts(), we've corrupted guest
FPU state.

That's the kind of bug I think Linus is talking about. We've been
expecting volatile to work that way for over a decade, by my
recollection, and if it doesn't, there is going to be a lot of broken code.

Shouldn't we at least get a compiler switch to force the volatile
behavior? I'd suggest it default to conservative.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Zachary Amsden on
On 07/14/2010 10:45 AM, Zachary Amsden wrote:
> On 07/14/2010 10:40 AM, Jeremy Fitzhardinge wrote:
>> On 07/14/2010 01:16 PM, Avi Kivity wrote:
>>> On 07/14/2010 08:57 PM, Jeremy Fitzhardinge wrote:
>>>> Anything else?
>>> 1. set up a mapping
>>> 2. invlpg or set cr3
>>> 3. use the mapping
>>>
>>> Moving the invlpg will break your code.
>> invlpg uses memory clobbers. All the crX ops seem to use a
>> __force_order variable to sequence them - but it looks like it's done
>> precisely backwards and it's barking mad to do allow write_crX to be
>> reordered with respect to memory ops.
>>
>> Hm, looks like glommer added it surreptitiously while unifying
>> system_32.h and system_64.h (system_32.h relied on asm volatile not
>> being reordered; system_64.h used memory clobbers).
>> J
>
> clts() has no memory clobber; it is used to serialize execution of
> code within kernel_fpu_begin() / kernel_fpu_end() blocks.
>
> If the code within is reordered before the clts(), we've corrupted
> guest FPU state.
>
> That's the kind of bug I think Linus is talking about. We've been
> expecting volatile to work that way for over a decade, by my
> recollection, and if it doesn't, there is going to be a lot of broken
> code.
>
> Shouldn't we at least get a compiler switch to force the volatile
> behavior? I'd suggest it default to conservative.

Hmm, well, despite that not being quite correct (if guest has used FPU,
we save it, which has a memory clobber), it seems to be the case that a
reordering of the clts() among the other volatile asm statements would
be a very bad thing - you'd get kernel FPU exceptions.

And among asm volatiles, clts() is fairly unique in not having any
clobbers or dependencies at all, so it could happen.

Zach
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: H. Peter Anvin on
On 07/14/2010 01:40 PM, Jeremy Fitzhardinge wrote:
> On 07/14/2010 01:16 PM, Avi Kivity wrote:
>> On 07/14/2010 08:57 PM, Jeremy Fitzhardinge wrote:
>>> Anything else?
>>
>> 1. set up a mapping
>> 2. invlpg or set cr3
>> 3. use the mapping
>>
>> Moving the invlpg will break your code.
>
> invlpg uses memory clobbers. All the crX ops seem to use a
> __force_order variable to sequence them - but it looks like it's done
> precisely backwards and it's barking mad to do allow write_crX to be
> reordered with respect to memory ops.
>
> Hm, looks like glommer added it surreptitiously while unifying
> system_32.h and system_64.h (system_32.h relied on asm volatile not
> being reordered; system_64.h used memory clobbers).
> J

invlpg, in the general case, definitely needs a memory clobber even if
volatiles are ordered, since it needs to be ordered with regards to
non-volatile memory operations.

Note that memory clobbers don't by themselves enforce ordering since
they don't prevent the ordering of memory clobbers with respect to each
other.

-hpa

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Jeremy Fitzhardinge on
On 07/14/2010 11:19 AM, H. Peter Anvin wrote:
> On 07/14/2010 11:15 AM, Jeremy Fitzhardinge wrote:
>
>> I think we should consider that deprecated and rely on dependencies and
>> clobbers.
>>
>>
> I don't think that's practical in general. If the compiler is *that
> broken*, I don't see how it is usable at all.
>

Well, over the years, gcc has explicitly changed the docs to weaken the
meaning of "asm volatile" from "not being moved ``significantly''" to
"no real guarantees about movement at all".

I don't see why its so broken. There are lots of mechanisms to control
asm ordering; we don't need "asm volatile" as well. But we do need it
to mean "still emit an apparently dead asm with no outputs (or unused
outputs)".

They can still be omitted if the whole basic block is dead code, and
they can be duplicated by things like inlining and loop unrolling. But
presumably they can never be evaluated more times than the source code
says they should be, so to that extent they're like volatile variables.

I think the use of the "volatile" keyword here is a red herring. Just
because the gcc devs decided to use it as a qualifier for asm statements
doesn't mean that one should read anything other than a vague, vague
relationship to volatile variables (and extra-vague given how weakly
defined *they* are).

> As Linus indicated, I don't think we can assume the gcc documentation to
> accurately reflect the intent of the gcc team, mostly because I think it
> often lags way behind what they're thinking.
>

I can get not trusting gcc to do what the documentation says it should
do, but relying on it to do things that the documentation definitely
says it doesn't seems foolhardy.

I'm finding it a bit surreal to be arguing on the side of "don't trust
gcc" vs you and Linus on the "gcc developers are arbitrary, capricious
and untrustworthy, but we can rely on this piece of undocumented
behaviour" side.

J
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Jeremy Fitzhardinge on
On 07/14/2010 12:40 PM, H.J. Lu wrote:
> On Wed, Jul 14, 2010 at 12:36 PM, H. Peter Anvin <hpa(a)zytor.com> wrote:
>
>> On 07/14/2010 12:32 PM, H.J. Lu wrote:
>>
>>> On Wed, Jul 14, 2010 at 12:00 PM, H. Peter Anvin <hpa(a)zytor.com> wrote:
>>>
>>>> On 07/14/2010 11:18 AM, H.J. Lu wrote:
>>>>
>>>>> There are some discussions on:
>>>>>
>>>>> http://gcc.gnu.org/ml/gcc-patches/2010-06/msg02001.html
>>>>> http://gcc.gnu.org/ml/gcc-patches/2010-07/msg00001.html
>>>>>
>>>>> Are they related?
>>>>>
>>>>>
>>>> Not directly as far as I can tell.
>>>>
>>>> The issue is if gcc can ever reorder, duplicate or elide a volatile
>>>> operation (either asm volatile or a volatile-annotated memory
>>>> reference.) In my (and Linus') opinion, this would be an incredibly
>>>> serious bug.
>>>>
>>> Is there a gcc bug for this?
>>>
>>>
>> Are you asking for a bug report against the documentation? We're not
>> sure what the semantics intended by the gcc team to be, which I guess is
>> a documentation bug.
>>
>>
> Documentation bug is also a bug :-).
>

The question is "what are the real ordering semantics of asm volatile"?
What ordering is enforced between other asm volatiles? What ordering is
enforced between asm volatiles and regular memory accesses? asm volatile
and other code?

The documentation discusses this to some extent, but mostly says "there
are no ordering guarantees". Older versions of gcc - 2.95, for example
- are more explicit, saying that "asm volatiles" won't be moved out of
their basic block (I think that's how I parse it, anyway).

Linux relies on "asm volatile" being ordered at least with respect to
other asm volatiles. Is this reasonable now? Will gcc break this at
some point in the future? If we can't rely on "asm volatile" ordering
semantics, what other mechanism can we use (for example, to order clts
with respect to FPU-using code)?

Thanks,
J
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/