From: Skybuck Flying on 4 May 2008 16:30

"Robert Redelmeier" <redelm(a)ev1.net.invalid> wrote in message
news:NsnTj.1323$To6.47(a)newssvr21.news.prodigy.net...
> In alt.lang.asm Wojciech Muła <wojciech_mula(a)poczta.null.onet.pl.invalid>
> wrote in part:
>> "Skybuck Flying" <BloodyShame(a)hotmail.com> wrote:
>>> function FastGetBit( Value : longword; BitPosition : longword ) :
>>> boolean;
>>> asm
>>> bt eax, edx // latency: 1
>>> mov eax, 0 // latency: 1
>>> adc eax, 0 // latency: 1
>>> end;
>>
>> You can replace mov/adc with single setc eax -- this
>> instruction has 1 cycle latency on modern CPUs.
>
> If you can tolerate additional bits set, try:
>
> SBB eax, eax
>
> This is likely over-optimizing -- unless inlined as part of
> a larger routine, the control transfer (and any prolog/epilog)
> will eat more than a few clocks.
>
> -- Robert
>

Yes, I was wondering what would happen if sbb was used, but I didn't dare try it. So I focused on an implementation I knew would be 100% correct.

Now it's time to look at alternative implementations.

And the sbb is actually quite cool ! Delphi has alternative boolean types like ByteBool and WordBool.

ByteBool and WordBool are very tolerant: if any bit is set, the value qualifies as true.

However, for the main boolean type, Boolean, only zero and one qualify, if I am not mistaken.

However, sbb has an interesting property: it always sets all bits to one if there is a carry, so bit 0 is always one when there is a carry.

I guess Delphi only checks bit zero for the Boolean type, so sbb will work for the Boolean type as well !

So that's quite funny, amazing and interesting ! ;)

Of course, if booleans are typecast to longwords or whatever, then one must be careful (the value is then all bits set, not 1), but I can handle that...

At least the branches should all work as normal ;)

So to make a long story short... sbb is a pretty good optimization ! LOL.

Now only 2 instructions left ! LOL. 33% fewer instructions ! LOL.

Lovely !
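To make the difference between the two variants concrete, here is a minimal sketch in portable C (not Delphi, and not the real instructions) that models the value each sequence leaves in eax. The helper names bt_carry, get_bit_adc and get_bit_sbb are made up for this illustration, and the model assumes bit positions 0 to 31 only.

```c
#include <assert.h>
#include <stdint.h>

/* Carry flag after `bt value, pos`: the selected bit, 0 or 1.
   Assumption: pos is within 0..31, the range this post is limited to. */
uint32_t bt_carry(uint32_t value, uint32_t pos) {
    assert(pos <= 31u);
    return (value >> pos) & 1u;
}

/* Model of: bt eax, edx / mov eax, 0 / adc eax, 0
   adc computes 0 + 0 + CF, so the result is exactly 0 or 1. */
uint32_t get_bit_adc(uint32_t value, uint32_t pos) {
    return bt_carry(value, pos);
}

/* Model of: bt eax, edx / sbb eax, eax
   sbb computes eax - eax - CF, so the result is 0 or all bits set. */
uint32_t get_bit_sbb(uint32_t value, uint32_t pos) {
    return (uint32_t)0 - bt_carry(value, pos);
}
```

With Value = 2147483648 and BitPosition = 31, get_bit_adc gives 1 while get_bit_sbb gives 4294967295, whose low byte is $FF. So a check of bit 0, or a check for non-zero, passes for both variants, but a comparison against exactly 1 only passes for the adc variant.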
;) =D

Here is an example:

// *** Begin of Updated Example ***

program Project1;

{$APPTYPE CONSOLE}

{

Skybuck presents FastGetBit

version 0.01 created on 4 may 2008 by Skybuck Flying.

This routine has just 3 latency on AMD X2 3800+.

Of course it also has some call and ret latency as usual.

It is limited to bit positions 0 to 31.

Could still be a nice and interesting function or just asm code
demonstration when having to deal with these kinds of limited
situations ! ;) =D

I wish Delphi supported something like this in high level language !

Would be nice ! =D and fast too ! ;) :)

Currently the Delphi compiler is limited to "test" and "set byte"
instructions for set of bit enum, which kinda sux anyway...
a special/fast bit operator/indexing would be nice.

Updated:

Now only 2 latency thanks to SBB wow ! Nice see below: :)

}

uses
  SysUtils;

// These routines rely on the 32 bit register calling convention:
// Value arrives in eax, BitPosition in edx, the result is returned in eax.

// returns true (1) if bit position set, or false (0) if bit position not set.
// 3 latency plus of course call + ret latency ;)
// bit position range: 0 to 31
function FastGetBit( Value : longword; BitPosition : longword ) : boolean;
asm
  bt eax, edx // latency: 1
  mov eax, 0 // latency: 1
  adc eax, 0 // latency: 1
end;

// when you just wanna get 0 and 1 and not a boolean.
function FastGetBitInt( Value : longword; BitPosition : longword ) : longword;
asm
  bt eax, edx // latency: 1
  mov eax, 0 // latency: 1
  adc eax, 0 // latency: 1
end;

// Boolean types only look at bit zero, for false or true.
// ByteBool, WordBool and such are more tolerant and will be true if any bit is set.
// These functions work for all boolean types since sbb always sets all bits to one
// if there is a carry.
// The int function always returns all bits set for true, so the biggest value.
function FastGetBitV2Boolean( Value : longword; BitPosition : longword ) : boolean;
asm
  bt eax, edx // latency: 1
  sbb eax, eax // latency: 1
end;

function FastGetBitV2ByteBool( Value : longword; BitPosition : longword ) : bytebool;
asm
  bt eax, edx // latency: 1
  sbb eax, eax // latency: 1
end;

function FastGetBitV2Int( Value : longword; BitPosition : longword ) : longword;
asm
  bt eax, edx // latency: 1
  sbb eax, eax // latency: 1
end;

procedure Main;
var
  B : boolean;
  B2 : ByteBool;
  L : longword;
begin
  B := false;
  writeln( longword( B ) ); // 0

  B := true;
  writeln( longword( B ) ); // 1

  L := 2147483648; // only bit 31 set

  if FastGetBit( L, 31 ) then
  begin
    writeln( 'FastGetBit: Bit Position is set');
  end;

  if FastGetBitV2Boolean( L, 31 ) then
  begin
    writeln( 'FastGetBitV2Boolean: Bit Position is set');
  end;

  if FastGetBitV2ByteBool( L, 31 ) then
  begin
    writeln( 'FastGetBitV2ByteBool: Bit Position is set');
  end;

  writeln( 'FastGetBitV2Int: ', FastGetBitV2Int( L, 31 ) ); // all bits set: 4294967295

  L := 12345; // odd, so bit 0 is set
  writeln( 'FastGetBitV2Int: ', FastGetBitV2Int( L, 0 ) ); // all bits set: 4294967295
end;

begin
  try
    Main;
  except
    on E:Exception do
      Writeln(E.Classname, ': ', E.Message);
  end;
  readln;
end.

// *** End of Updated Example ***

Thank you, and Bye,
  Skybuck ;) :)