From: Skybuck Flying on 4 May 2008 12:56

Hello,

Here is my fast (latency 3) get bit routine:

Enjoy ! ;) =D

// *** Begin of Code ***

program Project1;

{$APPTYPE CONSOLE}

{

Skybuck presents FastGetBit

version 0.01 created on 4 may 2008 by Skybuck Flying.

This routine has just 3 latency on AMD X2 3800+.

Of course it also has some call and ret latency as usual.

It is limited to bit positions 0 to 31.

Could still be a nice and interesting function or just asm code demonstration
when having to deal with these kinds of limited situations ! ;) =D

I wish Delphi supported something like this in high level language !

Would be nice ! =D and fast too ! ;) :)

Currently the Delphi compiler is limited to "test" and "set byte" instructions
for set of bit enum.

which kinda sux anyway... special/fast bit operator/indexing would be nice.

}

uses
  SysUtils;

// returns true (1) if bit position set, or false (0) if bit position not set.
// 3 latency plus of course call + ret latency ;)
// bit position range: 0 to 31
function FastGetBit( Value : longword; BitPosition : longword ) : boolean;
asm
  bt eax, edx   // latency: 1
  mov eax, 0    // latency: 1
  adc eax, 0    // latency: 1
end;

// when you just wanna get 0 and 1 and not a boolean.
function FastGetBitInt( Value : longword; BitPosition : longword ) : longword;
asm
  bt eax, edx   // latency: 1
  mov eax, 0    // latency: 1
  adc eax, 0    // latency: 1
end;

procedure Main;
var
  B : boolean;
  L : longword;
begin
  B := false;
  writeln( longword( B ) ); // 0

  B := true;
  writeln( longword( B ) ); // 1

  L := 2147483648;

  if FastGetBit( L, 31 ) then
  begin
    writeln( 'Bit Position is set');
  end;
end;

begin
  try
    Main;
  except
    on E:Exception do
      Writeln(E.Classname, ': ', E.Message);
  end;
  readln;
end.

// *** End of Code ***

Bye,
  Skybuck.
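For readers who do not use Delphi, the three-instruction sequence can be modelled in portable C. This is only a sketch of what the instructions compute, not a replacement for the routine: it assumes the register calling convention the code above relies on (Value in EAX, BitPosition in EDX), and the name `fast_get_bit_model` is made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical portable model of:
 *   bt  eax, edx   ; CF := bit number EDX of EAX
 *   mov eax, 0     ; clear EAX (mov leaves the flags alone)
 *   adc eax, 0     ; EAX := EAX + 0 + CF
 */
uint32_t fast_get_bit_model(uint32_t value, uint32_t bit_position)
{
    /* The original routine is documented for bit positions 0 to 31. */
    assert(bit_position < 32u);

    uint32_t carry = (value >> bit_position) & 1u; /* bt eax, edx */
    uint32_t eax = 0u;                             /* mov eax, 0  */
    eax = eax + 0u + carry;                        /* adc eax, 0  */
    return eax;                                    /* 0 or 1      */
}
```

Calling `fast_get_bit_model(2147483648u, 31)` mirrors the `FastGetBit( L, 31 )` call in `Main`. Note that the model uses `mov`-style clearing on purpose: `xor eax, eax` would also clear the register but would destroy the carry flag that `adc` needs.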
From: Wojciech Muła on 4 May 2008 13:09

"Skybuck Flying" <BloodyShame(a)hotmail.com> wrote:

> function FastGetBit( Value : longword; BitPosition : longword ) : boolean;
> asm
> bt eax, edx // latency: 1
> mov eax, 0 // latency: 1
> adc eax, 0 // latency: 1
> end;

You can replace mov/adc with a single setc eax -- this instruction
has 1 cycle latency on modern CPUs.

w.
From: Skybuck Flying on 4 May 2008 13:23

"Wojciech Mula" <wojciech_mula(a)poczta.null.onet.pl.invalid> wrote in message
news:20080504190913.938d1fff.wojciech_mula(a)poczta.null.onet.pl.invalid...
> "Skybuck Flying" <BloodyShame(a)hotmail.com> wrote:
>
>> function FastGetBit( Value : longword; BitPosition : longword ) :
>> boolean;
>> asm
>> bt eax, edx // latency: 1
>> mov eax, 0 // latency: 1
>> adc eax, 0 // latency: 1
>> end;
>
> You can replace mov/adc with single setc eax -- this instruction
> has 1 cycle latency on modern CPUs.

No, there is a little problem with that solution.

setxx only sets a single byte.

The Delphi 2007 compiler uses the setxx solution and for some reason it is
forced to output:

"and 127" as well.

Which is an extra instruction.

It's probably better to avoid working with bytes, because Delphi likes
adding:

"movzx eax, al" all over the place ;) :) <- which are extra instructions :(

Bye,
  Skybuck.
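The byte-only behaviour of setxx can be illustrated with a small C model. This is a sketch under stated assumptions, not compiler output: it assumes `setc al` is issued right after `bt eax, edx`, so EAX still holds Value at that point, and the function names are invented for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of:
 *   bt   eax, edx  ; CF := bit number EDX of EAX
 *   setc al        ; AL := CF, bits 8..31 of EAX are left unchanged
 */
uint32_t setc_al_model(uint32_t value, uint32_t bit_position)
{
    assert(bit_position < 32u);

    uint32_t carry = (value >> bit_position) & 1u; /* bt eax, edx */
    return (value & 0xFFFFFF00u) | carry;          /* setc al     */
}

/* Hypothetical model of a following "movzx eax, al":
 * keep the low byte and clear the upper 24 bits. */
uint32_t movzx_eax_al_model(uint32_t eax)
{
    return eax & 0xFFu;
}
```

With Value = 0x80000000 and bit 31 tested, the model leaves 0x80000001 in the full register; only after the zero-extension step does it become 1. That is the extra instruction a 32-bit result such as `FastGetBitInt` would need if it were built on `setc`.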
From: Robert Redelmeier on 4 May 2008 14:42

In alt.lang.asm Wojciech Muła <wojciech_mula(a)poczta.null.onet.pl.invalid> wrote in part:
> "Skybuck Flying" <BloodyShame(a)hotmail.com> wrote:
>> function FastGetBit( Value : longword; BitPosition : longword ) : boolean;
>> asm
>> bt eax, edx // latency: 1
>> mov eax, 0 // latency: 1
>> adc eax, 0 // latency: 1
>> end;
>
> You can replace mov/adc with single setc eax -- this
> instruction has 1 cycle latency on modern CPUs.

If you can tolerate additional bits set, try:

SBB eax, eax

This is likely over-optimizing -- unless inlined as part of
a larger routine, the control transfer (and any prolog/epilog)
will eat more than a few clocks.

-- Robert
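What "additional bits set" means for the SBB suggestion can be sketched in C. The model below assumes `sbb eax, eax` directly follows `bt eax, edx`; the name `sbb_mask_model` is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of:
 *   bt  eax, edx   ; CF := bit number EDX of EAX
 *   sbb eax, eax   ; EAX := EAX - EAX - CF = -CF
 */
uint32_t sbb_mask_model(uint32_t value, uint32_t bit_position)
{
    assert(bit_position < 32u);

    uint32_t carry = (value >> bit_position) & 1u; /* bt eax, edx */
    return 0u - carry; /* sbb eax, eax: 0 if clear, 0xFFFFFFFF if set */
}
```

The result is an all-zeros or all-ones mask rather than 0 or 1, so it saves an instruction only when the caller tests for zero versus nonzero, or wants a mask anyway.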
From: Wojciech Muła on 4 May 2008 15:08

"Skybuck Flying" <BloodyShame(a)hotmail.com> wrote:

> > You can replace mov/adc with single setc eax -- this instruction
> > has 1 cycle latency on modern CPUs.
>
> No, there is a little problem with that solution.
>
> setxx only sets a single byte.

Sorry, I was sure that setxx accepts 32-bit registers. However, if
BitPosition lies in the range 0..31 or even 0..255, you can use the setc
instruction.

w.