From: Skybuck Flying on 5 May 2008 09:39

And then there is another nice bug.

The value which you are or-ing in the else branch needs to be "chopped off". Otherwise there could be excess bits in the value which are not supposed to be written. Which is a pretty logical requirement. There could be anything in the value ;)

So now you are starting to see why I posted another thread called: "getting rid of accessive bits"

HAHAHAHAHA LOL =D Jajajajajaj.

Though maybe your code has potential after all the bugs are fixed.

Though maybe later I will write a nice "ror" based assembler version, which could make a nice justification for adding "ror" and "rol" support to high level languages ;) :)

Bye, Skybuck.
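For readers following along, the "chopping off" requirement can be sketched in C roughly as follows. The code under discussion is not shown in this excerpt, so `chop_bits` and `or_field` are hypothetical names, and the sketch assumes a 32-bit destination whose target bits are already zero.

```c
#include <stdint.h>

/* Hypothetical helper: keep only the low bit_count bits of value.
   bit_count is assumed to be in 0..32; a count of 32 keeps everything
   and avoids shifting a 32-bit constant by 32. */
static inline uint32_t chop_bits(uint32_t value, unsigned bit_count)
{
    if (bit_count >= 32)
        return value;
    return value & ((UINT32_C(1) << bit_count) - 1u);
}

/* Hypothetical helper: OR a bit_count-wide field into *dest at bit_pos.
   Assumes bit_pos < 32, bit_pos + bit_count <= 32, and that the target
   bits in *dest are currently zero. */
static inline void or_field(uint32_t *dest, uint32_t value,
                            unsigned bit_pos, unsigned bit_count)
{
    /* Mask first, so stray high bits in value cannot leak into
       neighbouring fields. */
    *dest |= chop_bits(value, bit_count) << bit_pos;
}
```

Without the mask, a value such as 0xFFFFFFAB written as an 8-bit field would also set every bit above the field.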
From: Skybuck Flying on 5 May 2008 09:46

Ok, I give up fixing your code. You'll have to fix it yourself.

I will now focus on an assembler version with a different algorithm, maybe my own, slightly modified; I don't know yet ;)

Bye, Skybuck.
From: piotr.wyderski on 5 May 2008 17:49

Terje Mathisen wrote:
> Your example code is simply horrible

To say the least :-) Anyway, he seems to be unable to think the SIMDish way, which should be a real help here. Just grab the entire vector of data, PAND it with a proper mask (corresponding to the chosen bit), non-linearly scale the result by low (or high) multiplying it with 0x000100020004... etc. and combine the whole stuff with PSADBW. Unroll everything in order to hide the latencies and voila. Unfortunately, there is no PMULLB, so the implementation becomes a bit tricky... :-)

Best regards
Piotr Wyderski
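A rough C sketch of the mask, scale and PSADBW idea described above, under several assumptions that the post does not confirm: the goal is taken to be collecting one chosen bit from each of eight 16-bit elements into a packed byte, 16-bit lanes are used so that PMULLW (`_mm_mullo_epi16`) can do the scaling, and the chosen bit is first shifted down to position 0 (an extra step not mentioned in the post) so that fixed weights 1, 2, 4, ... 128 work for any bit. The names `gather_bit` and `gather_bit_scalar` are hypothetical. A scalar fallback is used when SSE2 is not available.

```c
#include <stdint.h>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

/* Scalar reference: bit i of the result is bit `bit` of v[i]. */
static inline unsigned gather_bit_scalar(const uint16_t v[8], unsigned bit)
{
    unsigned result = 0;
    for (unsigned i = 0; i < 8; ++i)
        result |= ((v[i] >> bit) & 1u) << i;
    return result;
}

/* Same result via SSE2 when available. bit is assumed to be in 0..15. */
static inline unsigned gather_bit(const uint16_t v[8], unsigned bit)
{
#if defined(__SSE2__)
    __m128i x = _mm_loadu_si128((const __m128i *)v);

    /* Move the chosen bit of every lane down to position 0 (PSRLW),
       then isolate it (PAND). Each lane is now 0 or 1. */
    x = _mm_srl_epi16(x, _mm_cvtsi32_si128((int)bit));
    x = _mm_and_si128(x, _mm_set1_epi16(1));

    /* Scale lane i by 2^i (PMULLW). _mm_set_epi16 lists lanes from
       highest to lowest, so lane 0 gets weight 1. */
    const __m128i weights = _mm_set_epi16(128, 64, 32, 16, 8, 4, 2, 1);
    x = _mm_mullo_epi16(x, weights);

    /* PSADBW against zero sums the bytes of each 64-bit half. Every
       lane is at most 128, so only the low bytes contribute. */
    const __m128i sums = _mm_sad_epu8(x, _mm_setzero_si128());

    /* Add the two partial sums (low half and high half). */
    return (unsigned)_mm_cvtsi128_si32(sums)
         + (unsigned)_mm_extract_epi16(sums, 4);
#else
    return gather_bit_scalar(v, bit);
#endif
}
```

Because the weights are distinct powers of two, the horizontal sum behaves like an OR and no carries can occur. The unrolling for latency hiding that the post recommends is left out of this sketch.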
From: Terje Mathisen on 6 May 2008 02:31

Skybuck Flying wrote:
> I'll shall help you understand your flaw and the problem in general:
>
> Value (shr 32 - DestBitIndex) won't work if DestBitIndex is zero.
>
> The shr instruction is limited to a range of 0 to 31.
>
> You are trying to shift with 32 which means the shr won't happen.
>
> Good luck with fixing it.

Good luck with rereading my code and figuring out why this cannot ever happen. :-)

Terje

--
- <Terje.Mathisen(a)hda.hydro.com>
"almost all programming can be viewed as an exercise in caching"
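Some background on the disputed shift: for 32-bit operands the x86 shift instructions use only the low 5 bits of the count, so a count of 32 behaves like a count of 0, and in C a shift of a 32-bit value by 32 is undefined. Terje's code is not shown in this excerpt, so the following C sketch is only a hypothetical illustration of how a guard can make the `32 - DestBitIndex` shift safe. The name `write_bits` and the little-endian bit stream layout are assumptions.

```c
#include <stdint.h>

/* Hypothetical sketch: write the low bit_count bits (1..32) of value
   into a stream of 32-bit words at absolute bit position pos.
   Assumes the target bits are currently zero and that words[] has
   room for a spill into the following word. */
static inline void write_bits(uint32_t *words, unsigned pos,
                              uint32_t value, unsigned bit_count)
{
    unsigned word = pos / 32;
    unsigned dest_bit_index = pos % 32;   /* always 0..31 */

    /* Drop any bits above the field width. */
    if (bit_count < 32)
        value &= (UINT32_C(1) << bit_count) - 1u;

    /* Low part: the shift count is 0..31, always valid. */
    words[word] |= value << dest_bit_index;

    /* High part: only when the field crosses into the next word.
       Since bit_count <= 32, crossing implies dest_bit_index >= 1,
       so the count 32 - dest_bit_index is in 1..31 and never 32. */
    if (dest_bit_index + bit_count > 32)
        words[word + 1] |= value >> (32 - dest_bit_index);
}
```

With `dest_bit_index == 0` the field always fits in a single word, so the branch containing the right shift is never taken.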
From: Skybuck Flying on 6 May 2008 04:41

"Terje Mathisen" <terje.mathisen(a)hda.hydro.com> wrote in message news:3YadnRguDqj9ZoLV4p2dnAA(a)giganews.com...
> Skybuck Flying wrote:
>> I'll shall help you understand your flaw and the problem in general:
>>
>> Value (shr 32 - DestBitIndex) won't work if DestBitIndex is zero.
>>
>> The shr instruction is limited to a range of 0 to 31.
>>
>> You are trying to shift with 32 which means the shr won't happen.
>>
>> Good luck with fixing it.
>
> Good luck with rereading my code and figuring out why this cannot ever
> happen. :-)

Even if it never happens, there are many other bugs in your code which do happen, so I am done with your code ;)

Bye, Skybuck.