Prev: How many branches in a loop can be predicted successfully ?
Next: Wanna do a WriteLongwordBits contest ?
From: Skybuck Flying on 4 May 2008 10:54

Hello,

Suppose I write x86 code like so:

    bt eax, 0
    adc [edx], 0

    bt eax, 1
    adc [edx + 4], 0

    bt eax, 2
    adc [edx + 8], 0

The bt (bit test) instruction sets the carry flag if the bit at the given position is set; otherwise the carry flag is cleared. The adc instruction adds the carry flag to its destination. This instruction pair is then repeated multiple times as shown above, with only the offset changing (+4, +8, etc.).

I read about how CPUs can execute multiple integer instructions at the same time, which makes me wonder.

Do CPUs nowadays have multiple carry flags underneath? I would think so... otherwise how can they possibly execute multiple integer instructions?

So my question is: can the instructions above be executed in parallel, at the same time?

Bye,
Skybuck.
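For readers less familiar with bt and adc, the effect of the sequence above can be modelled in C. This is only a sketch of the semantics described in the post, not code from the thread: the function name `add_bits_to_counters` and its parameters are hypothetical, and it assumes the memory operands at [edx], [edx + 4], ... are 32-bit counters.

```c
#include <stdint.h>

/* Hypothetical helper modelling the bt/adc pairs from the post.
 * `value` plays the role of EAX; `counters[i]` plays the role of the
 * dword at [edx + 4*i] (the 32-bit operand size is an assumption). */
static void add_bits_to_counters(uint32_t value, uint32_t *counters,
                                 unsigned nbits)
{
    for (unsigned i = 0; i < nbits && i < 32; i++) {
        uint32_t carry = (value >> i) & 1u; /* bt eax, i: CF = bit i of EAX */
        counters[i] += carry;               /* adc [edx + 4*i], 0: add CF   */
    }
}
```

Each iteration touches a different counter and a different bit, so the only thing tying one bt/adc pair to the next in the assembly version is the shared carry flag, which is what the question is about.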
From: MitchAlsup on 4 May 2008 13:45

On May 4, 8:54 am, "Skybuck Flying" <BloodySh...(a)hotmail.com> wrote:
> I read about how cpu's can execute multiple integer instructions at the same
> time, which makes me wonder.
>
> Do cpu's nowadays have multiple carry flags underneath ? I would think so....
> otherwise how can they possible execute multiple integer instructions ?

Today's CPUs obey restricted data-flow semantics. Thus, the ADC instruction is dependent upon the BT instruction, and is scheduled after that instruction executes. EFLAGs fields are forwarded just like any other register.

Athlon and Opteron manage EFLAGs as 3 independent fields, C, O, and ZAPS, to give maximum flexibility to avoid condition code dependencies that are not actually necessary. I believe that Intel CPUs manage EFLAGs as 2 independent registers, but this is not a firm belief, and they have more implementations to consider.

Secondarily, Athlon and Opteron can have several EFLAGs manipulations in flight simultaneously, just like several manipulations of EAX can be in flight simultaneously. The write-back logic at the end of the pipe puts all this stuff back into the EFLAGs register we know and love.
From: Robert Redelmeier on 4 May 2008 14:53

In alt.lang.asm Skybuck Flying <BloodyShame(a)hotmail.com> wrote in part:
> Do cpu's nowadays have multiple carry flags underneath ? I
> would think so... otherwise how can they possible execute
> multiple integer instructions ?

Yes, AFAIK the modern CPUs do register renaming on flags. Otherwise, as you point out, parallelism stalls. The problems come with instructions that only update some of the flags (like INC), or where you create a dependency chain (like your BT/ADC) without independent filler.

Your BT/ADC X, BT/ADC Y, BT/ADC Z will be reordered and internally executed as:

    BT  X  [flag0]
    BT  Y  [flag1]
    BT  Z  [flag2]
    ADC X  [flag0]
    ADC Y  [flag1]
    ADC Z  [flag2]

to allow multiple instructions to run per clock. Actually it is more complex than this, because your ADCs are actually load, add, and store micro-ops.

-- Robert
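The reordering Robert describes can be pictured with a small C analogy. This is a sketch only, not a description of any particular CPU: the function name is hypothetical, the locals `flag0`..`flag2` stand in for renamed copies of the carry flag, and x, y and z are assumed to point at distinct counters, as in the original code.

```c
#include <stdint.h>

/* Hypothetical model of the reordered BT/ADC sequence.
 * Assumes x, y and z point at three distinct 32-bit counters. */
static void add_three_bits_reordered(uint32_t value,
                                     uint32_t *x, uint32_t *y, uint32_t *z)
{
    /* The three BTs: each result goes to its own renamed flag,
     * so none of them has to wait for another. */
    uint32_t flag0 = (value >> 0) & 1u;
    uint32_t flag1 = (value >> 1) & 1u;
    uint32_t flag2 = (value >> 2) & 1u;

    /* The three ADCs: each one depends only on its own flag copy. */
    *x += flag0;
    *y += flag1;
    *z += flag2;
}
```

The result is the same as running the three pairs strictly in program order; only the dependency structure is made explicit. As Robert notes, real hardware further splits each ADC with a memory operand into load, add, and store micro-ops.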
From: Rod Pemberton on 5 May 2008 03:45

"Skybuck Flying" <BloodyShame(a)hotmail.com> wrote in message news:4f5a4$481dccc4$541983fa$24136(a)cache3.tilbu1.nb.home.nl...
> Hello,
>
> Suppose I write x86 code like so:
>
> bt eax, 0
> adc [edx], 0
>
> bt eax, 1
> adc [edx + 4], 0
>
> bt eax, 2
> adc [edx + 8], 0
>

Uh oh, he found the BT instruction...

Yet, I'm not sure you found an assembly solution for your bit-planes problem. You're warmer. Did you? Do you see one? It's similar to what you just posted...

Skybuck Crashing, why haven't you learned assembly yet? Or C, for that matter? With all the Delphi code you (or mostly others, for free, for you) have converted to assembly, haven't you proven to yourself that Delphi is totally worthless? What _are_ you doing with all that code: writing new libraries for Delphi? Writing a Delphi-compatible compiler?!?

Anyway, think about repeated BT, RCR. It's likely to be slower (non-pairable) than a solution using shift, and, or, etc. But it should be easier to code, since it has far less complexity to extract and reorder bits. I was hoping someone like Terje would respond on that one, since I wanted to see solutions other than what I could come up with. But it seems your persistent insanity got a response to another one...

Rod Pemberton
From: Skybuck Flying on 5 May 2008 15:35

Hmm,

You did give me an interesting idea.

I ported my WriteLongwordBits SimInt64 Delphi 2007 version to Visual Studio C/C++ 2008, to compare assembler outputs.

See other thread about that ;) :)

Bye,
Skybuck.