From: Alexander Terekhov on 6 Sep 2005 11:29

"Eric P." wrote:
>
> Alexander Terekhov wrote:
> >
> > My reading of the specs is that MFENCE is guaranteed to provide
> > store-load barrier.
> >
> > P1: X = 1; R1 = Y;
> > P2: Y = 1; R2 = X;
> >
> > (R1, R2) = (0, 0) is allowed under pure PC, but
> >
> > P1: X = 1; MFENCE; R1 = Y;
> > P2: Y = 1; MFENCE; R2 = X;
> >
> > (R1, R2) = (0, 0) is NOT allowed.
>
> Are you sure you are not being inconsistent in example 2 here?
> (wrt what you answered yesterday about S/LFENCE).

PC implies both LFENCE and SFENCE ordering constraints. I don't
think that you've got invalidations stuff entirely accurate, but
the basic logic is correct.

>
> If MFENCE is just an SFENCE+LFENCE,

No. SFENCE is store-store barrier and LFENCE is load-load barrier.

store-store + load-load != store-load.

MFENCE ensures that preceding writes are made globally visible
before subsequent reads are performed (store-load barrier)...
plus it imposes all other PC ordering constraints (load-load +
load-store + store-store).

regards,
alexander.
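The litmus test in the post above can be explored with a toy model. The following is only a sketch under stated assumptions, not a statement of what any Intel specification guarantees: each processor is given a FIFO store buffer that drains to shared memory at arbitrary times, loads are performed in program order (forwarding from the processor's own buffer), MFENCE waits until the local buffer is empty, and SFENCE and LFENCE are treated as no-ops because the model already keeps stores and loads in order among themselves. The function `outcomes` and the program encodings are hypothetical names introduced for this illustration.

```python
def outcomes(programs):
    """Enumerate final register values reachable in a toy store-buffer model.

    Assumptions (not taken from any vendor spec): one FIFO store buffer per
    processor, in-order loads with forwarding from the local buffer, MFENCE
    blocks until the local buffer is empty, SFENCE/LFENCE are no-ops.
    """
    results, seen = set(), set()

    def step(pcs, bufs, mem, regs):
        state = (pcs, bufs, mem, regs)
        if state in seen:
            return
        seen.add(state)
        finished = all(pc == len(p) for pc, p in zip(pcs, programs))
        if finished and not any(bufs):
            # regs is sorted by register name, so this yields (R1, R2).
            results.add(tuple(v for _, v in regs))
            return
        for i, prog in enumerate(programs):
            # Option 1: drain the oldest buffered store of processor i.
            if bufs[i]:
                var, val = bufs[i][0]
                new_mem = tuple(sorted({**dict(mem), var: val}.items()))
                step(pcs, bufs[:i] + (bufs[i][1:],) + bufs[i + 1:], new_mem, regs)
            # Option 2: execute the next instruction of processor i.
            if pcs[i] < len(prog):
                op = prog[pcs[i]]
                new_pcs = pcs[:i] + (pcs[i] + 1,) + pcs[i + 1:]
                if op[0] == "store":
                    new_buf = bufs[i] + ((op[1], op[2]),)
                    step(new_pcs, bufs[:i] + (new_buf,) + bufs[i + 1:], mem, regs)
                elif op[0] == "load":
                    val = dict(mem).get(op[1], 0)      # memory starts at 0
                    for var, buffered in bufs[i]:      # newest local store wins
                        if var == op[1]:
                            val = buffered
                    new_regs = tuple(sorted({**dict(regs), op[2]: val}.items()))
                    step(new_pcs, bufs, mem, new_regs)
                elif op[0] == "mfence":
                    if not bufs[i]:                    # blocked until drained
                        step(new_pcs, bufs, mem, regs)
                else:                                  # "sfence" / "lfence"
                    step(new_pcs, bufs, mem, regs)

    step(tuple(0 for _ in programs), tuple(() for _ in programs), (), ())
    return results


def litmus(*between):
    """Build the two-processor test with `between` placed after each store."""
    p1 = [("store", "X", 1), *between, ("load", "Y", "R1")]
    p2 = [("store", "Y", 1), *between, ("load", "X", "R2")]
    return [p1, p2]


PLAIN = litmus()
FENCED = litmus(("mfence",))
SFENCE_LFENCE = litmus(("sfence",), ("lfence",))

for name, prog in [("plain", PLAIN), ("mfence", FENCED),
                   ("sfence+lfence", SFENCE_LFENCE)]:
    print(name, sorted(outcomes(prog)))
```

In this model (R1, R2) = (0, 0) is reachable for the plain and the SFENCE+LFENCE variants but not for the MFENCE variant, which is the "store-store + load-load != store-load" point in miniature. Real hardware is more complicated than a two-buffer model, so this should be read as an illustration of the argument rather than as evidence about any particular processor.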
From: Alexander Terekhov on 14 Sep 2005 04:07

Hey Mr. andy.glew(a)intel.com,

you better fix the specs, really. It's not funny anymore.

http://msdn.microsoft.com/msdnmag/issues/05/10/MemoryModels/default.aspx

"When multiprocessor systems based on the x86 architecture were
being designed, the designers needed a memory model that would
make most programs just work, while still allowing the hardware
to be reasonably efficient. The resulting specification requires
writes from a single processor to remain in order with respect
to other writes, but does not constrain reads at all.

Unfortunately, a guarantee about write order means nothing if
reads are unconstrained. After all, it does not matter that A is
written before B if every reader reading B followed by A has
reads reordered so that the pre-update value of B and the
post-update value of A is seen. The end result is the same:
write order seems reversed. Thus, as specified, the x86 model
does not provide any stronger guarantees than the ECMA model.

It is my belief, however, that the x86 processor actually
implements a slightly different memory model than is documented.
While this model has never failed to correctly predict behavior
in my experiments, and it is consistent with what is publicly
known about how the hardware works, it is not in the official
specification. New processors might break it."

regards,
alexander.
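The general point in the quoted article, that ordered writes guarantee nothing to a reader whose reads may be reordered, can be checked by brute force. This is a minimal sketch under simplifying assumptions: memory is treated as a single shared dictionary, the writer always performs A then B, and "reordered reads" is modelled simply by letting the reader perform its two reads in the opposite order. The helper names `interleavings` and `observed` are hypothetical.

```python
from itertools import combinations


def interleavings(a, b):
    """Yield every merge of sequences a and b that keeps each one's order."""
    n = len(a) + len(b)
    for slots in combinations(range(n), len(a)):
        ai, bi = iter(a), iter(b)
        yield [next(ai) if k in slots else next(bi) for k in range(n)]


def observed(read_order):
    """Return the set of (B, A) values a reader can see.

    The writer stores A = 1 and then B = 1, in that order. `read_order` is
    the order in which the reader's two loads are actually performed,
    e.g. "BA" (program order) or "AB" (the loads were reordered).
    """
    writer = [("w", "A"), ("w", "B")]
    reader = [("r", var) for var in read_order]
    seen = set()
    for schedule in interleavings(writer, reader):
        mem = {"A": 0, "B": 0}
        got = {}
        for kind, var in schedule:
            if kind == "w":
                mem[var] = 1
            else:
                got[var] = mem[var]
        seen.add((got["B"], got["A"]))
    return seen


print("reads in order: ", sorted(observed("BA")))
print("reads reordered:", sorted(observed("AB")))
```

With the reads performed in program order, the reader never sees the new B together with the old A. Once the reads are allowed to swap, that outcome appears even though the writes themselves never changed order, which matches the article's claim that write ordering alone is not observable.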
From: Joe Seigh on 14 Sep 2005 08:09

Alexander Terekhov wrote:
> Hey Mr. andy.glew(a)intel.com,
>
> you better fix the specs, really. It's not funny anymore.
>
> http://msdn.microsoft.com/msdnmag/issues/05/10/MemoryModels/default.aspx
>

It's pretty clear from Andy's comments and from the technical
documentation that Intel's technical writers aren't entirely sure
who their audience actually is, and that they mix up the
specification, which is of interest to programmers, and the
implementation, which is of interest to engineers. Andy's last
comment, which appeared to me to be about implementation,
certainly didn't help.

It also doesn't help that Intel has a tradition of not
architecting multi-processing support and instead doing it on the
fly as it adds multi-processing support, in clear contrast to how
other companies have documented multi-processing support in their
architectures. You had companies building Intel-based
multi-processors before Intel even supported multi-processing,
which meant the memory model they implemented may or may not have
matched what Intel later documented as the official memory model.
This is apparently now a tradition, and there's a comment to this
effect in the Intel documentation.

"Also, software should not depend on processor ordering in
situations where the system hardware does not support this
memory-ordering model."

--
Joe Seigh

When you get lemons, you make lemonade.
When you get hardware, you make software.