From: Alexander Terekhov on

David Hopwood wrote:

[... SSE2 LFENCE ...]

> It's not entirely clear what "globally visible" in the Intel manual

It's just a copy&paste leftover from the SSE1 SFENCE description.

regards,
alexander.
From: Joe Seigh on
David Hopwood wrote:
> Joe Seigh wrote:
>
>> Alexander Terekhov wrote:
>>
>>> So where do you put the fence, then?
>>>
>>> : processor 1 stores into X
>>> : processor 2 sees the store by 1 into X and stores into Y
>>> : processor 3 loads from Y
>>> : processor 3 loads from X
>>
>>
>> Since this was my example I should clarify. It was meant to
>> show that PC alone wasn't sufficient to guarantee that if processor
>> 3 saw the store into Y by processor 2 that it would see the
>> store into X by processor 1.
>>
>> My understanding of the ia32 memory model is that you
>> need a fence instruction between the loads by processor 3
>> and a fence between the load and store by processor 2 to
>> make the guarantee work.
>
>
> My understanding is that if the claimed problem exists at all, adding
> these fences won't fix it (as far as the model is concerned, possibly
> as opposed to implementation details of specific chips).
>

The architected memory model as opposed to the implemented one?

"Despite the fact that Pentium 4, Intel Xeon, and P6 family
processors support processor ordering, Intel does not guarantee that future processors will
support this model. To make software portable to future processors, it is recommended that operating
systems provide critical region and resource control constructs and API’s (application
program interfaces) based on I/O, locking, and/or serializing instructions be used to synchronize
access to shared areas of memory in multiple-processor systems."

That one? And what do people think the memory model that only
"I/O, locking, and/or serializing instructions" can synchronize is?

--
Joe Seigh

When you get lemons, you make lemonade.
When you get hardware, you make software.
From: David Hopwood on
Joe Seigh wrote:
> David Hopwood wrote:
>> Joe Seigh wrote:
>>> Alexander Terekhov wrote:
>>>
>>>> So where do you put the fence, then?
>>>>
>>>> : processor 1 stores into X
>>>> : processor 2 sees the store by 1 into X and stores into Y
>>>> : processor 3 loads from Y
>>>> : processor 3 loads from X
>>>
>>> Since this was my example I should clarify. It was meant to
>>> show that PC alone wasn't sufficient to guarantee that if processor
>>> 3 saw the store into Y by processor 2 that it would see the
>>> store into X by processor 1.
>>>
>>> My understanding of the ia32 memory model is that you
>>> need a fence instruction between the loads by processor 3
>>> and a fence between the load and store by processor 2 to
>>> make the guarantee work.
>>
>> My understanding is that if the claimed problem exists at all, adding
>> these fences won't fix it (as far as the model is concerned, possibly
>> as opposed to implementation details of specific chips).
>
> The architected memory model as opposed to the implemented one?

Yes, that's what I said.

> "Despite the fact that Pentium 4, Intel Xeon, and P6 family
> processors support processor ordering, Intel does not guarantee that
> future processors will support this model. To make software portable
> to future processors, it is recommended that operating systems provide
> critical region and resource control constructs and API’s (application
> program interfaces) based on I/O, locking, and/or serializing
> instructions be used to synchronize access to shared areas of
> memory in multiple-processor systems."

This is all perfectly sensible. "Future processors" from Intel are not
necessarily ISA-compatible with x86 anyway. For example, you need to
recompile to use long mode in EM64T. Also note that it doesn't say
"future x86 processors". Maybe they were talking about Itanic.

Even if they weren't talking about IA-64 or a different mode, it's
still a good idea to avoid dependencies on the memory model in
*applications*, since it is more difficult to change all apps that
have such dependencies than it is to change threading libraries in OS
and language implementations. In fact OS/lang-impl maintainers half
expect stuff to rot on new hardware, and hopefully remember what they
depended on. Application maintainers generally don't (if they ever
understood it in the first place). This is what I've been saying
consistently.
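
To make the split concrete, here is a minimal sketch of application code
that leaves all ordering questions to the threading library. It assumes a
C++11 toolchain (std::mutex, std::thread), which is not something the
manual or this thread mentions, and count_with_lock is a hypothetical name
used only for illustration.

```cpp
#include <cassert>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical application-level helper: every access to the shared counter
// goes through a library lock, so no hardware ordering assumption appears
// in this code.
long count_with_lock(int num_threads, int iterations) {
    long counter = 0;
    std::mutex guard;
    std::vector<std::thread> workers;
    for (int t = 0; t < num_threads; ++t) {
        workers.emplace_back([&] {
            for (int i = 0; i < iterations; ++i) {
                std::lock_guard<std::mutex> hold(guard);  // critical region
                ++counter;
            }
        });
    }
    for (auto& w : workers) {
        w.join();  // all workers finish before the counter is read
    }
    return counter;
}
```

If the memory model changes underneath, only the implementation of the
lock should need to change; a call such as count_with_lock(4, 10000) stays
as it is.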

Anyway, this issue doesn't have anything to do with what we were talking
about, which is whether the current architected x86 model allows a
particular behaviour.
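
For anyone who wants to experiment with the behaviour in question, the
three-processor example quoted at the top of this post can be sketched with
C++11 atomics. This is only an analogue under stated assumptions: it is
phrased in terms of the C++ memory model with acquire/release operations,
not the IA-32 manual, and the names run_once and count_violations are
hypothetical. Under the C++ rules the acquire/release chain makes the store
into X happen before processor 3's load from X whenever processor 3 saw the
store into Y, so the count should stay at zero. That says nothing either
way about what the architected x86 model promises for plain loads and
stores.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// One round of the example. Returns true if "processor 3" saw the store
// into Y but then failed to see the store into X.
bool run_once() {
    std::atomic<int> x{0};
    std::atomic<int> y{0};
    int seen_y = 0;
    int seen_x = 0;

    // processor 1 stores into X
    std::thread p1([&] { x.store(1, std::memory_order_release); });

    // processor 2 sees the store by 1 into X and stores into Y
    std::thread p2([&] {
        while (x.load(std::memory_order_acquire) == 0) {
            std::this_thread::yield();
        }
        y.store(1, std::memory_order_release);
    });

    // processor 3 loads from Y, then loads from X
    std::thread p3([&] {
        seen_y = y.load(std::memory_order_acquire);
        seen_x = x.load(std::memory_order_acquire);
    });

    p1.join();
    p2.join();
    p3.join();
    return seen_y == 1 && seen_x == 0;
}

// Repeats the round and counts how often the disputed outcome shows up.
int count_violations(int rounds) {
    int violations = 0;
    for (int i = 0; i < rounds; ++i) {
        if (run_once()) {
            ++violations;
        }
    }
    return violations;
}
```

Weakening the orderings to std::memory_order_relaxed removes the guarantee
at the language level, although a given chip may still never show the
outcome in practice.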

> That one? And what do people think the memory model that only
> "I/O, locking, and/or serializing instructions" can synchronize is?

You're overanalysing a fairly loosely worded recommendation.

--
David Hopwood <david.nospam.hopwood(a)blueyonder.co.uk>
From: Joe Seigh on
David Hopwood wrote:
> Joe Seigh wrote:
>
>> "Despite the fact that Pentium 4, Intel Xeon, and P6 family
>> processors support processor ordering, Intel does not guarantee that
>> future processors will support this model. To make software portable
>> to future processors, it is recommended that operating systems provide
>> critical region and resource control constructs and API’s (application
>> program interfaces) based on I/O, locking, and/or serializing
>> instructions be used to synchronize access to shared areas of
>> memory in multiple-processor systems."
>
>
> This is all perfectly sensible. "Future processors" from Intel are not
> necessarily ISA-compatible with x86 anyway. For example, you need to
> recompile to use long mode in EM64T. Also note that it doesn't say
> "future x86 processors". Maybe they were talking about Itanic.
>
> Even if they weren't talking about IA-64 or a different mode, it's
> still a good idea to avoid dependencies on the memory model in
> *applications*, since it is more difficult to change all apps that
> have such dependencies than it is to change threading libraries in OS
> and language implementations. In fact OS/lang-impl maintainers half
> expect stuff to rot on new hardware, and hopefully remember what they
> depended on. Application maintainers generally don't (if they ever
> understood it in the first place). This is what I've been saying
> consistently.

Yes, your aversion to anarchist application programmers doing their
own thing is well known. :)

>
> Anyway, this issue doesn't have anything to do with what we were talking
> about, which is whether the current architected x86 model allows a
> particular behaviour.
>
>> That one? And what do people think the memory model that only
>> "I/O, locking, and/or serializing instructions" can synchronize is?
>
>
> You're overanalysing a fairly loosely worded recommendation.
>

I'm not sure what you're saying here. That all future processors
from Intel that don't have processor ordering won't be x86? And
that the synchronization instructions in these future processors
won't be similar to the ones in x86? That Intel is telling people
in an x86 manual to start writing portable code not now but when
they get to the future processor? That's a little strange even for
Intel.


--
Joe Seigh

When you get lemons, you make lemonade.
When you get hardware, you make software.
From: David Hopwood on
Joe Seigh wrote:
> David Hopwood wrote:
>> Joe Seigh wrote:
>>
>>> "Despite the fact that Pentium 4, Intel Xeon, and P6 family
>>> processors support processor ordering, Intel does not guarantee that
>>> future processors will support this model. To make software portable
>>> to future processors, it is recommended that operating systems provide
>>> critical region and resource control constructs and API’s (application
>>> program interfaces) based on I/O, locking, and/or serializing
>>> instructions be used to synchronize access to shared areas of
>>> memory in multiple-processor systems."
>>
>> This is all perfectly sensible. "Future processors" from Intel are not
>> necessarily ISA-compatible with x86 anyway. For example, you need to
>> recompile to use long mode in EM64T. Also note that it doesn't say
>> "future x86 processors". Maybe they were talking about Itanic.
>>
>> Even if they weren't talking about IA-64 or a different mode, it's
>> still a good idea to avoid dependencies on the memory model in
>> *applications*, since it is more difficult to change all apps that
>> have such dependencies than it is to change threading libraries in OS
>> and language implementations. In fact OS/lang-impl maintainers half
>> expect stuff to rot on new hardware, and hopefully remember what they
>> depended on. Application maintainers generally don't (if they ever
>> understood it in the first place). This is what I've been saying
>> consistently.
>
> Yes, your aversion to anarchist application programmers doing their
> own thing is well known. :)

Right, I am absolutely convinced that the roles of application
programmer and infrastructure programmer should be clearly separated
(even if there are a few people with the ability and expertise needed
to successfully do both).

>> Anyway, this issue doesn't have anything to do with what we were talking
>> about, which is whether the current architected x86 model allows a
>> particular behaviour.
>>
>>> That one? And what do people think the memory model that only
>>> "I/O, locking, and/or serializing instructions" can synchronize is?
>>
>> You're overanalysing a fairly loosely worded recommendation.
>
> I'm not sure what you're saying here. That all future processors
> from Intel that don't have processor ordering won't be x86?

Well, they won't be x86-as-we-know-it. OSes, compilers, etc. will
have to be changed to run on or generate code for this new x86-like
thing, and changes in the memory model will probably be only one issue
they need to deal with.

> And that the synchronization instructions in these future processors
> won't be similar to the ones in x86? That Intel is telling people
> in an x86 manual to start writing portable code not now but when
> they get to the future processor?

Of course not. Read what they actually wrote.

--
David Hopwood <david.nospam.hopwood(a)blueyonder.co.uk>