From: '\\o//'annabee on 7 Mar 2005 00:21

On Sun, 06 Mar 2005 23:39:02 GMT, Randall Hyde <randyhyde(a)earthlink.net> wrote:

> Hi All,

How does this compare? I hope it is correct, as I was a bit tired when writing it and haven't really tested it fully. On my machine it seems to run about twice as fast in these tests, under saturated circumstances. (From Menu) Also, running on medium-length normal strings, with an "outsticking" char here and there, it seems to run about twice as fast.

But anyway... I am far too tired to confirm whether it runs correctly. Later perhaps.

RandallCmp:
Align 4
L0:
    mov al B§esi
    mov ah B§edi
    add esi 1
    add edi 1
    cmp al 0 | je L7>
    cmp al ah | je L0<
    mov bx ax
    and bx 0_5f5f
    cmp bl 'A'
    jb L7>
    cmp bl 'Z'
    ja L7>
    cmp bl bh
    je L0<
L7:
    cmp al ah
    ret

FasterCmp:
Align 4
L0: mov dx w§eax | cmp dh 0 | je L8> | cmp dl 0 | je L9> | add ebx 2 | add eax 2 | cmp dx w§ebx-2 | je L0<
    mov cx w§ebx-2
    and dx 0_5f5f
    and cx 0_5f5f
    add dx 02020
    add cx 02020
    cmp dx cx | je L0<
    ret
L8:
    cmp dl b§ebx | jne L9>
    cmp b§ebx+1 0
L9:
    ret

;Wrote some macros to make the main code easier to read.

[CompareF |mov eax String#1
           mov ebx String#2
           call FasterCmp]

[CompareR |mov esi String#1
           mov edi String#2
           call RandallCmp]

[String1:B§ 0 #256]
[String2:B§ 0 #256]
[String3:B§ 0 #256]
[String4:B§ 0 #256]

ThisApplication.RunTinkerTest:

;Generate the strings.

    push esi edi
    mov eax String1
    xor cl cl
L0:mov B§eax cl | inc eax | inc cl | jnz L0<
    mov eax String2
    xor cx cx
L0:mov B§eax ch | inc eax | mov ch cl | cmp cl 'A' | jb L1> | cmp cl 'Z' | ja L1> | sub ch 020 | L1: | inc cl | jnz L0<
    mov eax String3
    xor cx cx
L0:mov B§eax ch | inc eax | mov ch cl | cmp cl 'a' | jb L1> | mov ch cl | cmp cl 'z' | ja L1> | sub ch 020 | L1: | inc cl | jnz L0<
    mov eax String4
    xor cl cl
L0:mov B§eax cl | inc eax | inc cl | cmp cl 'a' | jnz L0<

;Didn't understand your code here, so I skipped it. It seems you moved a
;register, overwriting a label???

    align 16
    nop
    nop
    nop
    nop

    rdtsc | push edx eax

    compareF 1 1
    compareF 1 2
    compareF 1 3
    compareF 2 3
    compareF 3 2
    compareF 1 4

    rdtsc | pop ebx ecx

    sub eax ebx | sbb edx ecx

    HexPrint eax


    rdtsc | push edx eax

    compareR 1 1
    compareR 1 2
    compareR 1 3
    compareR 2 3
    compareR 3 2
    compareR 1 4

    rdtsc | pop ebx ecx

    sub eax ebx | sbb edx ecx

    HexPrint eax
    pop edi esi
    ret

-- 
http://TheWannabee.org

From: Percival on 7 Mar 2005 00:35

On Mon, 07 Mar 2005 06:21:34 +0100, '\\o//'annabee wrote:

> [full quote of the preceding post snipped]

*bets Wannabee is kill filed*

---
Forwarding message.

Percival

From: arargh502NOSPAM on 7 Mar 2005 00:57

On Sun, 06 Mar 2005 21:13:49 -0600, arargh502NOSPAM(a)NOW.AT.arargh.com wrote:

>On Sun, 06 Mar 2005 23:39:02 GMT, "Randall Hyde"
><randyhyde(a)earthlink.net> wrote:
>
>>Hi All,
>>
>>I'm currently in the process of correcting a defect in the
>>HLA Standard Library case insensitive string comparison
>>routines. While I'm at it, I'm trying to speed it up a bit.
>>I've written the following code:
>>
>>procedure stricmp; @nodisplay; @noframe; align(4);
>>begin stricmp;
>>
>>    push( ebx );
>>
>>
>>    // Compare the two strings until we encounter a zero byte
>>    // or until the corresponding characters are different.
>>
>>    cmpLoop:
>Probably align to 4 or 16 here
>>
>>        mov( [esi], al );
>>        mov( [edi], ah );
>>        add( 1, esi );
>>        add( 1, edi );
>>        cmp( al, 0 );
>>        je notAlpha;

Also, is the second string [edi] always longer or equal in length to
the first? I guess you would get a not equal and exit the loop :-)

>>        cmp( al, ah );
>>        je cmpLoop;
>>        mov( ax, bx );
>
>Why bother to move it?
>No matter what, ax is not needed anymore that I can see.
>
>Also if use32 save a byte by "mov( eax, ebx );"
>
>
>>        and( $5f5f, bx );
>Could preload $5f5f into a reg
>
>>        cmp( bl, 'A' );
>>        jb notAlpha;
>>        cmp( bl, 'Z' );
>>        ja notAlpha;
>What about bh?
>
>Could combine these two jmps to one using setcc and some more regs
>If you process bh that would be 4 jmps to one.
>
>
>>        cmp( bl, bh );
>>        je cmpLoop;
>
>If they are not equal is it possible that the cmp below can get an
>incorrect result in some cases?
>
>>
>>    notAlpha:
>>
>>    // At this point, we've either encountered a zero byte in the source
>>    // string or we've encountered two bytes that are not the same in
>>    // the two strings. In either case, return the result of the
>>    // comparison in the flags.
>>
>>    pop( ebx );
>>    cmp( al, ah );
>>    ret();
>>
>>end stricmp;
>>
>>
>>Any suggestions on speeding it up would be appreciated.

Something like: (not tested - probably has errors)

    Align 4
cmpLoop:
    mov al,[esi]    ; get left char
    mov ah,[edi]    ; get right char
    add esi, 1      ; bump left pointer
    add edi, 1      ; bump right pointer
    test ah,al      ; either Zero?
    jz notAlpha     ; yes - all done
    cmp ah,al       ; same?
    je cmpLoop      ; yes - continue
; if either a lower case letter, make both upper
    cmp al, 97
    setge bl
    cmp al, 122
    setle bh
    and bl, bh
    cmp ah, 97
    setge cl
    cmp ah, 122
    setle ch
    and cl, ch
    or cl, bl
    jnz notAlpha
    and ax, 05F5FH
    cmp ah,al
    je cmpLoop
notAlpha:

-- 
Arargh502 at [drop the 'http://www.' from ->] http://www.arargh.com
BCET Basic Compiler Page: http://www.arargh.com/basic/index.html

To reply by email, remove the garbage from the reply address.

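Since several of the routines in this thread are posted untested, a plain C model of the intended behaviour may be useful as a reference to check them against. The following is only a sketch, assuming ASCII input and zero-terminated strings; the names fold_upper and stricmp_model are hypothetical and are not part of the HLA Standard Library. It applies the $5F mask only after a range check, because the mask alone would also alter non-letters (for example '{', $7B, would become '[', $5B).

```c
/* Hypothetical helper: upper-case an ASCII letter via the 0x5F mask.
   The range check comes first because the mask alone would also
   change non-letters such as '{' (0x7B -> 0x5B, which is '['). */
unsigned char fold_upper(unsigned char c)
{
    if (c >= 'a' && c <= 'z')
        return (unsigned char)(c & 0x5F);
    return c;
}

/* Hypothetical reference model (an assumption, not the HLA routine):
   returns <0, 0 or >0 from the folded characters at the first
   position where the two strings differ. */
int stricmp_model(const char *left, const char *right)
{
    const unsigned char *l = (const unsigned char *)left;
    const unsigned char *r = (const unsigned char *)right;

    for (;; l++, r++) {
        unsigned char a = fold_upper(*l);
        unsigned char b = fold_upper(*r);
        if (a != b)
            return (int)a - (int)b;   /* also covers one string ending early */
        if (a == 0)
            return 0;                 /* both strings ended together */
    }
}
```

Comparing the folded bytes first means a string that ends early is reported as "less" without a separate length test, because its terminating zero is smaller than any other byte.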
From: '\\o//'annabee on 7 Mar 2005 11:07

On Mon, 07 Mar 2005 06:21:34 +0100, '\\o//'annabee <'\\o//'annabee> wrote:

Also, my generation of the strings is wrong. Here's a new version of those. (But the speed is still a little more than twice as fast.)

    mov eax String1
    mov cl 1
L0:mov B§eax cl | inc eax | inc cl | jnz L0<

    mov eax String1
    mov ebx String2
L0:mov cl B§eax
    inc eax
    cmp cl 'a' | jb L1>
    cmp cl 'z' | ja L1>
    sub cl 020
L1: mov B§ebx cl
    inc ebx
    cmp B§eax 0 | jnz L0<

    mov eax String1
    mov ebx String3
L0:mov cl B§eax
    inc eax
    cmp cl 'A' | jb L1>
    cmp cl 'Z' | ja L1>
    or cl 020
L1: mov B§ebx cl
    inc ebx
    cmp B§eax 0 | jnz L0<

    mov eax String4
    mov cl 1
L0:mov B§eax cl | inc eax | inc cl | cmp cl 'a' | jnz L0<

-- 
http://TheWannabee.org

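For readers who do not follow the assembler syntax, the four test strings built by the corrected generation code can be sketched in C roughly as follows. This is a reading of the code above, not something posted in the thread; the lower-case names string1 to string4 and the function make_test_strings are hypothetical stand-ins for String1 to String4.

```c
#include <string.h>

enum { BUF_SIZE = 256 };

/* Hypothetical C counterparts of String1..String4 (256 zeroed bytes each). */
unsigned char string1[BUF_SIZE];
unsigned char string2[BUF_SIZE];
unsigned char string3[BUF_SIZE];
unsigned char string4[BUF_SIZE];

void make_test_strings(void)
{
    memset(string1, 0, sizeof string1);
    memset(string2, 0, sizeof string2);
    memset(string3, 0, sizeof string3);
    memset(string4, 0, sizeof string4);

    /* string1: the bytes 1..255; the last buffer byte stays zero. */
    for (int i = 0; i < 255; i++)
        string1[i] = (unsigned char)(i + 1);

    for (int i = 0; i < 255; i++) {
        unsigned char c = string1[i];
        /* string2: copy of string1 with 'a'..'z' upper-cased (sub 020). */
        string2[i] = (c >= 'a' && c <= 'z') ? (unsigned char)(c - 0x20) : c;
        /* string3: copy of string1 with 'A'..'Z' lower-cased (or 020). */
        string3[i] = (c >= 'A' && c <= 'Z') ? (unsigned char)(c | 0x20) : c;
    }

    /* string4: the bytes 1..'a'-1, so it is a shorter prefix of string1. */
    for (int c = 1; c < 'a'; c++)
        string4[c - 1] = (unsigned char)c;
}
```

With these buffers, a case-insensitive compare should report string1, string2 and string3 as equal to one another, and string4 as different from string1 because it ends earlier.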
From: Octavio on 7 Mar 2005 19:43

"Randall Hyde" <randyhyde(a)earthlink.net> wrote in message news:<qwMWd.2584$oO4.1429(a)newsread3.news.pas.earthlink.net>...
> I'm also wondering about the results to return when the strings
> are not equal. In particular, the algorithm I've provided compares
> the original characters and sets the flags if those characters do
> not match. Therefore, 'a' > 'B'. I'm wondering if this is reasonable,
> or if I should return the comparisons of the "converted" characters
> (if they were uppercase)? I don't really care what *other* languages
> (e.g., C/C++) do, I'm interested in a discussion of what is reasonable
> to do here.

I think it is reasonable to compare the converted characters, and a table lookup must also be used to support chars like 'ýý'.
As for the speedup, I think it must be done elsewhere; the table lookup cannot be optimized very much.