From: 'o//'annabee on
On Sun, 06 Mar 2005 23:39:02 GMT, Randall Hyde
<randyhyde(a)earthlink.net> wrote:

> Hi All,


How does this compare? I hope it is correct; I was a bit tired when
writing it and haven't really tested it fully. On my machine it seems to
run about twice as fast in these tests, under saturated circumstances
(run from the menu). Running on medium-length normal strings, with a
character that sticks out here and there, it also seems to run about
twice as fast.

But anyway... I am far too tired to confirm that it runs correctly.
Later, perhaps.

RandallCmp:
Align 4
L0:
mov al B$esi
mov ah B$edi
add esi 1
add edi 1
cmp al 0 | je L7>
cmp al ah | je L0<
mov bx ax
and bx 0_5f5f
cmp bl 'A'
jb L7>
cmp bl 'Z'
ja L7>
cmp bl bh
je L0<
L7:
cmp al ah
ret
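
To see what the "and bx 0_5f5f" step does to each byte, it helps to model it outside the assembler. The sketch below is a Python model only, not part of the posted routine; the names fold and randall_equal are made up for illustration, and the zero-byte check that precedes the mask in the real loop is left out. It suggests that the mask folds 'a'..'z' onto 'A'..'Z', and that because only bl is range-checked, a byte such as 0E1 in the second string also folds onto 'A'.

```python
# Model of the masking step in RandallCmp (illustrative names, not from the post).

def fold(byte):
    """Clear bits 5 and 7 of a byte, as `and bx 0_5f5f` does to bl and bh."""
    return byte & 0x5F

def randall_equal(al, ah):
    """Return True if the loop would treat this byte pair as a match and continue.

    The end-of-string test (cmp al 0) is deliberately not modelled here.
    """
    if al == ah:                      # cmp al ah | je L0<
        return True
    bl, bh = fold(al), fold(ah)
    if bl < ord('A') or bl > ord('Z'):
        return False                  # jb L7> / ja L7>
    return bl == bh                   # cmp bl bh | je L0<

# Every byte value that the mask folds onto 'A' (0x41).
folds_to_A = [b for b in range(256) if fold(b) == ord('A')]

if __name__ == "__main__":
    print([hex(b) for b in folds_to_A])
    print(randall_equal(ord('a'), ord('A')), randall_equal(ord('A'), 0xE1))
```

If bytes above 07F can occur in the input, range-checking bh as well as bl would be one way to rule out the extra matches.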

FasterCmp:
Align 4
L0: mov dx W$eax | cmp dh 0 | je L8> | cmp dl 0 | je L9>
add ebx 2 | add eax 2 | cmp dx W$ebx-2 | je L0<
mov cx W$ebx-2
and dx 0_5f5f
and cx 0_5f5f
add dx 02020
add cx 02020
cmp dx cx | je L0<
ret
L8:
cmp dl B$ebx | jne L9>
cmp B$ebx+1 0
L9:
ret

;Wrote some macros to make the main code easier to read.

[CompareF |mov eax String#1
mov ebx String#2
call FasterCmp]

[CompareR |mov esi String#1
mov edi String#2
call RandallCmp]

[String1:B$ 0 #256]
[String2:B$ 0 #256]
[String3:B$ 0 #256]
[String4:B$ 0 #256]

ThisApplication.RunTinkerTest:

;Generate the strings.

push esi edi
mov eax String1
xor cl cl
L0:mov B$eax cl | inc eax | inc cl | jnz L0<
mov eax String2
xor cx cx
L0:mov B$eax ch | inc eax | mov ch cl | cmp cl 'A' | jb L1> | cmp cl 'Z' | ja L1>
sub ch 020 | L1: | inc cl | jnz L0<
mov eax String3
xor cx cx
L0:mov B$eax ch | inc eax | mov ch cl | cmp cl 'a' | jb L1> | mov ch cl | cmp cl 'z' | ja L1>
sub ch 020 | L1: | inc cl | jnz L0<
mov eax String4
xor cl cl
L0:mov B$eax cl | inc eax | inc cl | cmp cl 'a' | jnz L0<

;didn't understand your code here, so I skipped it. It seems you moved a
;register, overwriting a label???

align 16
nop
nop
nop
nop

rdtsc | push edx eax

compareF 1 1
compareF 1 2
compareF 1 3
compareF 2 3
compareF 3 2
compareF 1 4

rdtsc | pop ebx ecx

sub eax ebx | sbb edx ecx

HexPrint eax


rdtsc | push edx eax

compareR 1 1
compareR 1 2
compareR 1 3
compareR 2 3
compareR 3 2
compareR 1 4

rdtsc | pop ebx ecx

sub eax ebx | sbb edx ecx

HexPrint eax
pop edi esi
ret
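
The timing code relies on "sub eax ebx | sbb edx ecx" to subtract two 64-bit counter readings held as 32-bit halves, although only the low half is printed. A small Python model of that subtract-with-borrow follows; the function name tsc_delta is hypothetical and the sample values are invented purely to exercise the borrow.

```python
MASK32 = 0xFFFFFFFF

def tsc_delta(start_hi, start_lo, end_hi, end_lo):
    """Model `sub eax ebx | sbb edx ecx` with edx:eax = end and ecx:ebx = start.

    Returns (high dword, low dword) of end - start.
    """
    lo = end_lo - start_lo
    borrow = 1 if lo < 0 else 0        # carry flag left by SUB
    hi = end_hi - start_hi - borrow    # SBB also subtracts the borrow
    return hi & MASK32, lo & MASK32

if __name__ == "__main__":
    # The low dword wraps between the two readings, so a borrow is needed.
    hi, lo = tsc_delta(0, 0xFFFFFFF0, 1, 0x00000010)
    print(hex(hi), hex(lo))
```

For runs this short the high dword of the difference should be zero, which is presumably why printing eax alone is enough here.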


--
http://TheWannabee.org


From: Percival on
On Mon, 07 Mar 2005 06:21:34 +0100, '\\o//'annabee wrote:

> [snip]

*bets Wannabee is kill filed* --- Forwarding message.

Percival
From: arargh502NOSPAM on
On Sun, 06 Mar 2005 21:13:49 -0600, arargh502NOSPAM(a)NOW.AT.arargh.com
wrote:

>On Sun, 06 Mar 2005 23:39:02 GMT, "Randall Hyde"
><randyhyde(a)earthlink.net> wrote:
>
>>Hi All,
>>
>>I'm currently in the process of correcting a defect in the
>>HLA Standard Library case insensitive string comparison
>>routines. While I'm at it, I'm trying to speed it up a bit.
>>I've written the following code:
>>
>>procedure stricmp; @nodisplay; @noframe; align(4);
>>begin stricmp;
>>
>> push( ebx );
>>
>>
>> // Compare the two strings until we encounter a zero byte
>> // or until the corresponding characters are different.
>>
>> cmpLoop:
>Probably align to 4 or 16 here
>>
>> mov( [esi], al );
>> mov( [edi], ah );
>> add( 1, esi );
>> add( 1, edi );
>> cmp( al, 0 );
>> je notAlpha;
Also, is the second string [edi] always longer or equal in length to
the first?

I guess you would get a not equal and exit the loop :-)

>> cmp( al, ah );
>> je cmpLoop;
>> mov( ax, bx );
>
>Why bother to move it?
>No matter what, ax is not needed anymore that I can see.
>
>Also if use32 save a byte by "mov( eax, ebx );"
>
>
>> and( $5f5f, bx );
>Could preload $5f5f into a reg
>
>> cmp( bl, 'A' );
>> jb notAlpha;
>> cmp( bl, 'Z' );
>> ja notAlpha;
>What about bh?
>
>Could combine these two jmps to one using setcc and some more regs
>If you process bh that would be 4 jmps to one.
>
>
>> cmp( bl, bh );
>> je cmpLoop;
>
>If they are not equal is it possible that the cmp below can get an
>incorrect result in some cases?
>
>>
>> notAlpha:
>>
>> // At this point, we've either encountered a zero byte in the source
>> // string or we've encountered two bytes that are not the same in
>> // the two strings. In either case, return the result of the
>> // comparison in the flags.
>>
>> pop( ebx );
>> cmp( al, ah );
>> ret();
>>
>>end stricmp;
>>
>>
>>Any suggestions on speeding it up would be appreciated.

Something like: (not tested - probably has errors)

Align 4

cmpLoop:
mov al,[esi] ; get left char
mov ah,[edi] ; get right char

add esi, 1 ; bump left pointer
add edi, 1 ; bump right pointer

test ah,al ; either Zero?
jz notAlpha ; yes - all done

cmp ah,al ; same?
je cmpLoop ; yes - continue

; if either a lower case letter, make both upper

cmp al, 97
setge bl
cmp al, 122
setle bh
and bl, bh

cmp ah, 97
setge cl
cmp ah, 122
setle ch
and cl, ch
or cl, bl

jz notAlpha ; neither is lower case - a real mismatch

and ax, 05F5FH

cmp ah,al
je cmpLoop

notAlpha:

--
Arargh502 at [drop the 'http://www.' from ->] http://www.arargh.com
BCET Basic Compiler Page: http://www.arargh.com/basic/index.html

To reply by email, remove the garbage from the reply address.
From: 'o//'annabee on
On Mon, 07 Mar 2005 06:21:34 +0100, '\\o//'annabee <'\\o//'annabee> wrote:

Also, my generation of the strings is wrong.
Here's a new version of it.

(But it is still a little more than twice as fast.)

mov eax String1
mov cl 1
L0:mov B$eax cl | inc eax | inc cl | jnz L0<
mov eax String1
mov ebx String2

L0:mov cl B$eax
inc eax
cmp cl 'a' | jb L1>
cmp cl 'z' | ja L1>
sub cl 020
L1:
mov B$ebx cl
inc ebx
cmp B$eax 0 | jnz L0<


mov eax String1
mov ebx String3
L0:mov cl B$eax
inc eax
cmp cl 'A' | jb L1>
cmp cl 'Z' | ja L1>
or cl 020
L1:
mov B$ebx cl
inc ebx
cmp B$eax 0 | jnz L0<


mov eax String4
mov cl 1
L0:mov B$eax cl | inc eax | inc cl | cmp cl 'a' | jnz L0<
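
For reference, the four strings this corrected generator is meant to produce can be written down in a few lines. This is a Python model under the assumption that each 256-byte buffer starts zero-filled, so the byte after the last one written acts as the terminator; the function name make_test_strings is made up here.

```python
def make_test_strings():
    """Return the contents of String1..String4 without their zero terminators."""
    # String1: every byte value from 1 to 255 (the loop stops when cl wraps to 0).
    s1 = bytes(range(1, 256))
    # String2: copy of String1 with 'a'..'z' moved to upper case (sub cl 020).
    s2 = bytes(c - 0x20 if ord('a') <= c <= ord('z') else c for c in s1)
    # String3: copy of String1 with 'A'..'Z' moved to lower case (or cl 020).
    s3 = bytes(c | 0x20 if ord('A') <= c <= ord('Z') else c for c in s1)
    # String4: bytes 1..060, since the loop stops once cl reaches 'a'.
    s4 = bytes(range(1, ord('a')))
    return s1, s2, s3, s4

if __name__ == "__main__":
    for name, s in zip(("String1", "String2", "String3", "String4"),
                       make_test_strings()):
        print(name, len(s))
```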

--
http://TheWannabee.org


From: Octavio on
"Randall Hyde" <randyhyde(a)earthlink.net> wrote in message news:<qwMWd.2584$oO4.1429(a)newsread3.news.pas.earthlink.net>...
> I'm also wondering about the results to return when the strings
> are not equal. In particular, the algorithm I've provided compares
> the original characters and sets the flags if those characters do
> not match. Therefore, 'a' > 'B'. I'm wondering if this is reasonable,
> or if I should return the comparisons of the "converted" characters
> (if they were uppercase)? I don't really care what *other* languages
> (e.g., C/C++) do, I'm interested in a discussion of what is reasonable
> to do here.
I think it is reasonable to compare the converted characters, and a
table lookup must be used to support characters outside the ASCII range.
As for the speedup, I think it must be found elsewhere; the table lookup
cannot be optimized very much.
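
The table-lookup approach can be sketched as follows. This is a Python model, not HLA Standard Library code: the names build_fold_table, FOLD and stricmp_table are hypothetical, and the ISO 8859-1 folding of 0xE0..0xFE is an assumption made only to show how letters outside ASCII could be handled. It compares the converted characters, so 'a' sorts before 'B'.

```python
def build_fold_table():
    """Build a 256-entry table mapping each byte to its upper-case form."""
    table = list(range(256))
    for c in range(ord('a'), ord('z') + 1):
        table[c] = c - 0x20
    # Assumption: the text is ISO 8859-1. There, 0xE0..0xFE are lower-case
    # letters except 0xF7 (the division sign), each 0x20 above its capital.
    for c in range(0xE0, 0xFF):
        if c != 0xF7:
            table[c] = c - 0x20
    return bytes(table)

FOLD = build_fold_table()

def stricmp_table(left, right):
    """Compare two byte strings (without terminators) by their folded bytes.

    Returns a negative number, zero, or a positive number.
    """
    for a, b in zip(left, right):
        fa, fb = FOLD[a], FOLD[b]
        if fa != fb:
            return fa - fb
    return len(left) - len(right)      # the shorter string sorts first

if __name__ == "__main__":
    print(stricmp_table(b"Hello", b"hELLO"))
    print(stricmp_table(b"a", b"B"))
```

In assembly the per-character work would be two table loads and one compare, which fits the remark that there is little left to optimize in the lookup itself.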