From: Alexander Terekhov on 5 Sep 2005 14:27

David Hopwood wrote:
[... SSE2 LFENCE ...]
> It's not entirely clear what "globally visible" in the Intel manual

It's just copy&paste leftover from SSE1 SFENCE description.

regards,
alexander.

From: Joe Seigh on 5 Sep 2005 16:21

David Hopwood wrote:
> Joe Seigh wrote:
>
>> Alexander Terekhov wrote:
>>
>>> So where do you put the fence, then?
>>>
>>> : processor 1 stores into X
>>> : processor 2 sees the store by 1 into X and stores into Y
>>> : processor 3 loads from Y
>>> : processor 3 loads from X
>>
>> Since this was my example I should clarify. It was meant to
>> show that PC alone wasn't sufficient to guarantee that if processor
>> 3 saw the store into Y by processor 2 that it would see the
>> store into X by processor 1.
>>
>> My understanding of the ia32 memory model is that you
>> need a fence instruction between the loads by processor 3
>> and a fence between the load and store by processor 2 to
>> make the guarantee work.
>
> My understanding is that if the claimed problem exists at all, adding
> these fences won't fix it (as far as the model is concerned, possibly
> as opposed to implementation details of specific chips).
>

The architected memory model as opposed to the implemented one?

"Despite the fact that Pentium 4, Intel Xeon, and P6 family
processors support processor ordering, Intel does not guarantee that
future processors will support this model. To make software portable
to future processors, it is recommended that operating systems provide
critical region and resource control constructs and API's (application
program interfaces) based on I/O, locking, and/or serializing
instructions be used to synchronize access to shared areas of
memory in multiple-processor systems."

That one? And what do people think the memory model that only
"I/O, locking, and/or serializing instructions" can synchronize is?

--
Joe Seigh

When you get lemons, you make lemonade.
When you get hardware, you make software.
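
The three-processor example and the fence placement Joe Seigh describes above can be written down as a small litmus test. What follows is only a sketch under stated assumptions: it uses C++11 `std::atomic` and `std::atomic_thread_fence`, which postdate this thread, and every name in it (`run_once`, `count_causality_violations`) is hypothetical. Under the C++11 rules the two fences guarantee that processor 3 reads 1 from X once it has seen Y == 1. Whether the corresponding IA-32 fence instructions are needed, or sufficient, under the architected x86 model is exactly what is disputed here, and the sketch does not settle that.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Runs the three-processor example once and returns the value that
// "processor 3" read from X after it had observed Y == 1.
// All names are illustrative; X and Y follow the example in the thread.
int run_once() {
    std::atomic<int> X{0}, Y{0};
    int seen_x = -1;

    std::thread p1([&] {
        X.store(1, std::memory_order_relaxed);                // processor 1 stores into X
    });
    std::thread p2([&] {
        while (X.load(std::memory_order_relaxed) != 1)        // processor 2 sees the store into X
            std::this_thread::yield();
        std::atomic_thread_fence(std::memory_order_seq_cst);  // fence between the load and the store
        Y.store(1, std::memory_order_relaxed);                // ... and stores into Y
    });
    std::thread p3([&] {
        while (Y.load(std::memory_order_relaxed) != 1)        // processor 3 loads from Y
            std::this_thread::yield();
        std::atomic_thread_fence(std::memory_order_seq_cst);  // fence between the two loads
        seen_x = X.load(std::memory_order_relaxed);           // processor 3 loads from X
    });

    p1.join();
    p2.join();
    p3.join();
    return seen_x;  // safe to read: p3 has been joined
}

// Repeats the test and counts runs in which processor 3 saw Y == 1 but X == 0.
int count_causality_violations(int runs) {
    assert(runs > 0);
    int violations = 0;
    for (int i = 0; i < runs; ++i)
        if (run_once() != 1) ++violations;
    return violations;
}
```

Calling `count_causality_violations(1000)` from a `main` function is one way to exercise it. Removing the fences turns all accesses into plain relaxed operations, for which C++ makes no such promise; a run that still shows no violations on a given chip says something about that implementation only, not about the model.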

From: David Hopwood on 5 Sep 2005 17:21

Joe Seigh wrote:
> David Hopwood wrote:
>> Joe Seigh wrote:
>>> Alexander Terekhov wrote:
>>>
>>>> So where do you put the fence, then?
>>>>
>>>> : processor 1 stores into X
>>>> : processor 2 sees the store by 1 into X and stores into Y
>>>> : processor 3 loads from Y
>>>> : processor 3 loads from X
>>>
>>> Since this was my example I should clarify. It was meant to
>>> show that PC alone wasn't sufficient to guarantee that if processor
>>> 3 saw the store into Y by processor 2 that it would see the
>>> store into X by processor 1.
>>>
>>> My understanding of the ia32 memory model is that you
>>> need a fence instruction between the loads by processor 3
>>> and a fence between the load and store by processor 2 to
>>> make the guarantee work.
>>
>> My understanding is that if the claimed problem exists at all, adding
>> these fences won't fix it (as far as the model is concerned, possibly
>> as opposed to implementation details of specific chips).
>
> The architected memory model as opposed to the implemented one?

Yes, that's what I said.

> "Despite the fact that Pentium 4, Intel Xeon, and P6 family
> processors support processor ordering, Intel does not guarantee that
> future processors will support this model. To make software portable
> to future processors, it is recommended that operating systems provide
> critical region and resource control constructs and API’s (application
> program interfaces) based on I/O, locking, and/or serializing
> instructions be used to synchronize access to shared areas of
> memory in multiple-processor systems."

This is all perfectly sensible. "Future processors" from Intel are not
necessarily ISA-compatible with x86 anyway. For example, you need to
recompile to use long mode in EM64T. Also note that it doesn't say
"future x86 processors". Maybe they were talking about Itanic.

Even if they weren't talking about IA-64 or a different mode, it's
still a good idea to avoid dependencies on the memory model in
*applications*, since it is more difficult to change all apps that
have such dependencies than it is to change threading libraries in OS
and language implementations. In fact OS/lang-impl maintainers half
expect stuff to rot on new hardware, and hopefully remember what they
depended on. Application maintainers generally don't (if they ever
understood it in the first place). This is what I've been saying
consistently.

Anyway, this issue doesn't have anything to do with what we were talking
about, which is whether the current architected x86 model allows a
particular behaviour.

> That one? And what do people think the memory model that only
> "I/O, locking, and/or serializing instructions" can synchronize is?

You're overanalysing a fairly loosely worded recommendation.

--
David Hopwood <david.nospam.hopwood(a)blueyonder.co.uk>
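
The split argued for above, where application code goes through a locking construct supplied by the threading library and never relies on the hardware's ordering rules itself, might look like the following. This is a minimal sketch assuming C++11 `std::mutex`; `SharedCounter` and `count_with_two_threads` are hypothetical names invented for illustration, not anything from the Intel manual or from this thread.

```cpp
#include <cassert>
#include <mutex>
#include <thread>

// "Infrastructure" side: every access to the shared value goes through a
// lock, so any memory-model dependency lives inside std::mutex rather
// than in the code that uses this class.
class SharedCounter {
public:
    void increment() {
        std::lock_guard<std::mutex> lock(mutex_);
        ++value_;
    }
    long get() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return value_;
    }
private:
    mutable std::mutex mutex_;
    long value_ = 0;
};

// "Application" side: no fences and no assumptions about store or load
// ordering, only calls into the locking API.
long count_with_two_threads(int per_thread) {
    assert(per_thread >= 0);
    SharedCounter counter;
    auto work = [&] {
        for (int i = 0; i < per_thread; ++i) counter.increment();
    };
    std::thread a(work), b(work);
    a.join();
    b.join();
    return counter.get();
}
```

If the ordering rules change on new hardware, only the mutex implementation has to follow; `count_with_two_threads(10000)` should still return 20000 without any change to the application side.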

From: Joe Seigh on 5 Sep 2005 18:32

David Hopwood wrote:
> Joe Seigh wrote:
>
>> "Despite the fact that Pentium 4, Intel Xeon, and P6 family
>> processors support processor ordering, Intel does not guarantee that
>> future processors will support this model. To make software portable
>> to future processors, it is recommended that operating systems provide
>> critical region and resource control constructs and API's (application
>> program interfaces) based on I/O, locking, and/or serializing
>> instructions be used to synchronize access to shared areas of
>> memory in multiple-processor systems."
>
> This is all perfectly sensible. "Future processors" from Intel are not
> necessarily ISA-compatible with x86 anyway. For example, you need to
> recompile to use long mode in EM64T. Also note that it doesn't say
> "future x86 processors". Maybe they were talking about Itanic.
>
> Even if they weren't talking about IA-64 or a different mode, it's
> still a good idea to avoid dependencies on the memory model in
> *applications*, since it is more difficult to change all apps that
> have such dependencies than it is to change threading libraries in OS
> and language implementations. In fact OS/lang-impl maintainers half
> expect stuff to rot on new hardware, and hopefully remember what they
> depended on. Application maintainers generally don't (if they ever
> understood it in the first place). This is what I've been saying
> consistently.

Yes, your aversion to anarchist application programmers doing their
own thing is well known. :)

>
> Anyway, this issue doesn't have anything to do with what we were talking
> about, which is whether the current architected x86 model allows a
> particular behaviour.
>
>> That one? And what do people think the memory model that only
>> "I/O, locking, and/or serializing instructions" can synchronize is?
>
> You're overanalysing a fairly loosely worded recommendation.
>

I'm not sure what you're saying here. That all future processors
from Intel that don't have processor ordering won't be x86?
And that the synchronization instructions in these future processors
won't be similar to the ones in x86? That Intel is telling people
in an x86 manual to start writing portable code not now but when
they get to the future processor?

That's a little strange even for Intel.

--
Joe Seigh

When you get lemons, you make lemonade.
When you get hardware, you make software.

From: David Hopwood on 5 Sep 2005 20:26

Joe Seigh wrote:
> David Hopwood wrote:
>> Joe Seigh wrote:
>>
>>> "Despite the fact that Pentium 4, Intel Xeon, and P6 family
>>> processors support processor ordering, Intel does not guarantee that
>>> future processors will support this model. To make software portable
>>> to future processors, it is recommended that operating systems provide
>>> critical region and resource control constructs and API’s (application
>>> program interfaces) based on I/O, locking, and/or serializing
>>> instructions be used to synchronize access to shared areas of
>>> memory in multiple-processor systems."
>>
>> This is all perfectly sensible. "Future processors" from Intel are not
>> necessarily ISA-compatible with x86 anyway. For example, you need to
>> recompile to use long mode in EM64T. Also note that it doesn't say
>> "future x86 processors". Maybe they were talking about Itanic.
>>
>> Even if they weren't talking about IA-64 or a different mode, it's
>> still a good idea to avoid dependencies on the memory model in
>> *applications*, since it is more difficult to change all apps that
>> have such dependencies than it is to change threading libraries in OS
>> and language implementations. In fact OS/lang-impl maintainers half
>> expect stuff to rot on new hardware, and hopefully remember what they
>> depended on. Application maintainers generally don't (if they ever
>> understood it in the first place). This is what I've been saying
>> consistently.
>
> Yes, your aversion to anarchist application programmers doing their
> own thing is well known. :)

Right, I am absolutely convinced that the roles of application programmer
and infrastructure programmer should be clearly separated (even if there
are a few people with the ability and expertise needed to successfully do
both).

>> Anyway, this issue doesn't have anything to do with what we were talking
>> about, which is whether the current architected x86 model allows a
>> particular behaviour.
>>
>>> That one? And what do people think the memory model that only
>>> "I/O, locking, and/or serializing instructions" can synchronize is?
>>
>> You're overanalysing a fairly loosely worded recommendation.
>
> I'm not sure what you're saying here. That all future processors
> from Intel that don't have processor ordering won't be x86?

Well, they won't be x86-as-we-know-it. OSes, compilers, etc. will have
to be changed to run on or generate code for this new x86-like thing,
and changes in the memory model will probably be only one issue they
need to deal with.

> And that the synchronization instructions in these future processors
> won't be similar to the ones in x86? That Intel is telling people
> in an x86 manual to start writing portable code not now but when
> they get to the future processor?

Of course not. Read what they actually wrote.

--
David Hopwood <david.nospam.hopwood(a)blueyonder.co.uk>