From: Skybuck Flying on 6 May 2008 15:50
Ok, I wasn't satisfied with all the slow routines. I probably need something faster.

Maybe if I clear the buffer up front and then simply place everything sequentially, it will go faster and be fast enough for my purposes. Actually this is all I really need at the moment. If I do need something special, such as overwriting something later on without disturbing any bits, I can always use the previous routines.

So here is the lightning fast version. It simply adds bits to the end, at least that's how it's supposed to be used. DestBitIndex should "point" to the last bit in the buffer.

The Bits parameter is removed, since it is not relevant anymore.

This routine is dirty... it will overwrite the next longword in the buffer as well... but that shouldn't be a problem if the buffer is large enough, which is within specifications :)

Only 19 instructions, nice and fast ! ;) :)

If it's inlined it might be even faster/fewer instructions =D HAHA ;)

// Skybuck's Lightning Fast WriteLongwordBits version.
// Classification A1B1.
// A1 means the routine assumes the buffer is cleared.
// B1 means the routine assumes trailing bits are not a problem.
// The Bits parameter is no longer relevant.
// just 19 instructions
// ok I like this routine much better... I can get away with it...
procedure WriteLongwordBitsA1B1( Value : longword; DestAddress : pointer; DestBitIndex : longword );
begin
  // advance to the byte that contains the target bit
  longword(DestAddress) := longword(DestAddress) + (DestBitIndex shr 3); // div 8

  // DestBitIndex will now function as the shift.
  DestBitIndex := DestBitIndex and 7; // mod 8

  // low part goes into the current longword, spill-over into the next one
  Plongword(DestAddress)^ := Plongword(DestAddress)^ or (Value shl DestBitIndex);
  Plongword(longword(DestAddress) + 4)^ := Plongword(longword(DestAddress) + 4)^ or (Value shr (32-DestBitIndex));
end;

// Generated Assembler:
{
Project1.dpr.1962: begin
00409058 53           push ebx
00409059 56           push esi
0040905A 8BD9         mov ebx,ecx
Project1.dpr.1963: longword(DestAddress) := longword(DestAddress) + (DestBitIndex shr 3); // div 8
0040905C 8BCB         mov ecx,ebx
0040905E C1E903       shr ecx,$03
00409061 01CA         add edx,ecx
Project1.dpr.1966: DestBitIndex := DestBitIndex and 7; // mod 8
00409063 83E307       and ebx,$07
Project1.dpr.1968: Plongword(DestAddress)^ := Plongword(DestAddress)^ or (Value shl DestBitIndex);
00409066 8BCB         mov ecx,ebx
00409068 8BF0         mov esi,eax
0040906A D3E6         shl esi,cl
0040906C 0932         or [edx],esi
Project1.dpr.1969: Plongword(longword(DestAddress) + 4)^ := Plongword(longword(DestAddress) + 4)^ or (Value shr (32-DestBitIndex));
0040906E B920000000   mov ecx,$00000020
00409073 2BCB         sub ecx,ebx
00409075 D3E8         shr eax,cl
00409077 83C204       add edx,$04
0040907A 0902         or [edx],eax
Project1.dpr.1970: end;
0040907C 5E           pop esi
0040907D 5B           pop ebx
0040907E C3           ret
}

You see, if I scrap some requirements I can write lightning fast code which is even more correct than the competition at the moment ! ;) :)

Bye,
  Skybuck.
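One edge case in the routine above may be worth a sketch. When DestBitIndex is a multiple of 8, the shift becomes 0 and the second statement shifts right by 32. The generated `shr eax,cl` only looks at the low 5 bits of CL, so that acts as a shift by 0 and the whole Value gets OR'ed into the next longword. Below is a minimal guarded variant, not the routine as posted. The name WriteLongwordBitsA1B1Guarded and the PByte-based pointer stepping are assumptions made for illustration, and the byte check in the demo assumes a little-endian x86 target.

```pascal
program WriteBitsGuardedDemo;
{$APPTYPE CONSOLE}
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}

// Hypothetical guarded variant of WriteLongwordBitsA1B1 (the name is an assumption).
// Same A1B1 assumptions: the buffer is cleared up front and trailing bits may be touched.
// Up to 8 bytes starting at the target byte must be addressable.
procedure WriteLongwordBitsA1B1Guarded(Value: LongWord; DestAddress: Pointer; DestBitIndex: LongWord);
var
  P: PByte;
  Shift: LongWord;
begin
  P := DestAddress;
  Inc(P, DestBitIndex shr 3);   // byte that holds the target bit (div 8)
  Shift := DestBitIndex and 7;  // bit offset inside that byte (mod 8)

  // Low part of Value goes into the current longword.
  PLongWord(P)^ := PLongWord(P)^ or (Value shl Shift);

  // Spill-over only exists when Shift is 1..7; skipping it for Shift = 0
  // avoids the shift-by-32 case described above.
  if Shift <> 0 then
  begin
    Inc(P, 4);
    PLongWord(P)^ := PLongWord(P)^ or (Value shr (32 - Shift));
  end;
end;

var
  Buffer: array[0..15] of Byte;
begin
  FillChar(Buffer, SizeOf(Buffer), 0);                  // A1: cleared buffer
  WriteLongwordBitsA1B1Guarded($FFFFFFFF, @Buffer, 0);  // byte-aligned write
  // Bytes 0..3 are now $FF; byte 4 stays 0 because the second store was skipped.
  WriteLn(Buffer[3], ' ', Buffer[4]);                   // prints: 255 0
end.
```

The guard costs a compare and a branch, so it gives up a little of the speed the post is after. Whether that trade is worth it depends on whether byte-aligned bit indexes can occur in your data.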
From: Skybuck Flying on 6 May 2008 15:53
Yup, here is an inlined assembly example:

// Generated Assembler:
// only 14 instructions !
Project1.dpr.2073: WriteLongwordBitsA1B1( Value, @Buffer, BitPointer );
0040A207 8BD7         mov edx,edi
0040A209 8BC8         mov ecx,eax
0040A20B C1E903       shr ecx,$03
0040A20E 01CA         add edx,ecx
0040A210 83E007       and eax,$07
0040A213 8BC8         mov ecx,eax
0040A215 8BF3         mov esi,ebx
0040A217 D3E6         shl esi,cl
0040A219 0932         or [edx],esi
0040A21B B920000000   mov ecx,$00000020
0040A220 2BC8         sub ecx,eax
0040A222 D3EB         shr ebx,cl
0040A224 83C204       add edx,$04
0040A227 091A         or [edx],ebx

Bye, Bye,
  Skybuck ;) =D
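For context on how a call site like the one above might be driven, here is a rough sketch of appending fields one after another with a running bit pointer. Since the Bits parameter is gone, the caller has to mask each value to its width and advance the pointer itself. AppendBits and WriteBits are hypothetical names, not from the posts. WriteBits is a compact copy of the A1B1 write, included so the example compiles on its own, with an added skip of the second store when the shift is 0. The printed byte values assume a little-endian x86 target.

```pascal
program AppendBitsDemo;
{$APPTYPE CONSOLE}
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}

// Compact copy of the A1B1 write (hypothetical name). The "if" that skips
// the second store for Shift = 0 is an addition, not part of the original.
procedure WriteBits(Value: LongWord; Dest: Pointer; BitIndex: LongWord);
var
  P: PByte;
  Shift: LongWord;
begin
  P := Dest;
  Inc(P, BitIndex shr 3);
  Shift := BitIndex and 7;
  PLongWord(P)^ := PLongWord(P)^ or (Value shl Shift);
  if Shift <> 0 then
  begin
    Inc(P, 4);
    PLongWord(P)^ := PLongWord(P)^ or (Value shr (32 - Shift));
  end;
end;

// Hypothetical helper: keeps the low Width bits of Value (Width in 1..32),
// appends them at BitPointer and advances BitPointer past them.
procedure AppendBits(var Buffer; var BitPointer: LongWord; Value, Width: LongWord);
begin
  if Width < 32 then
    Value := Value and ((LongWord(1) shl Width) - 1);
  WriteBits(Value, @Buffer, BitPointer);
  Inc(BitPointer, Width);
end;

var
  Buffer: array[0..15] of Byte;  // spare bytes at the end absorb the trailing store
  BitPointer: LongWord;
begin
  FillChar(Buffer, SizeOf(Buffer), 0);     // A1: buffer cleared up front
  BitPointer := 0;
  AppendBits(Buffer, BitPointer, 5, 3);    // bits 0..2  = 101
  AppendBits(Buffer, BitPointer, 1, 1);    // bit  3     = 1
  AppendBits(Buffer, BitPointer, $AB, 8);  // bits 4..11 = $AB
  WriteLn(Buffer[0], ' ', Buffer[1], ' ', BitPointer);  // prints: 189 10 12
end.
```

The masking step matters because the write ORs all 32 bits of Value into a buffer that is assumed clear. Any stray high bits would land in positions that later fields are supposed to fill.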