|
From: Frank Kotler on 7 Mar 2005 19:41 arargh502NOSPAM(a)NOW.AT.arargh.com wrote: .... > > mov( ax, bx ); > > Why bother to move it? > No matter what, ax is not needed anymore that I can see. I thought that was a really good idea, but I ran into subtle problems - if "c1" and "c2" are non-alpha, but differ by 20h, the final cmp can give a "false positive" if the "converted" comparison is returned. If we delay the "conversion" until after the "alpha" comparison, it works, but we have to test for both "a - z" and "A - Z", or check both al and ah. Haven't tested this, but I doubt if it's faster. > Also if use32 save a byte by "mov( eax, ebx );" This definitely helps (and HLA is *always* use32). > > and( $5f5f, bx ); > Could preload $5f5f into a reg To my surprise, this seems slower... Using the 32-bit reg helps, though. > > > cmp( bl, 'A' ); > > jb notAlpha; > > cmp( bl, 'Z' ); > > ja notAlpha; > What about bh? I don't think we need to care about bh - if it isn't the same "alphaness" as bl, we haven't got a match anyway. > Could combine these two jmps to one using setcc and some more regs > If you process bh that would be 4 jmps to one. I haven't tried the code you posted using this method... > > cmp( bl, bh ); > > je cmpLoop; > > If they are not equal is it possible that the cmp below can get an > incorrect result in some cases? I think it's okay... Returns the "unconverted" flags - that is, 'a' > 'B'. Since we're doing insensitive, and considering 'a' to be equivalent to 'A', it might make sense to return the "converted" comparison, so 'B' > 'a'... Just "converting" eax at this point should do that safely... I think... OTOH, the caller usually cares about match or not. If the caller cares which is greater, it's probably for a "binary search", and the return ought to match the way the data is sorted... which might favor the "unconverted" comparison... > > notAlpha: > > > > // At this point, we've either encountered a zero byte in the source > > // string or we've encountered two bytes that are not the same in > > // the two strings. In either case, return the result of the > > // comparison in the flags. > > > > pop( ebx ); > > cmp( al, ah ); > > ret(); > > > >end stricmp; Best, Frank
From: Randall Hyde on 8 Mar 2005 00:08 "Frank Kotler" <fbkotler(a)comcast.net> wrote in message news:422CFD3F.5AA1F13D(a)comcast.net... > > I'm not surprised that it's faster, since you're processing > two bytes at once. I'm not sure it's really "safe" to do. > What if the terminating zero butts right up against the end > of memory you "own"? Attempting to read beyond the end of > the string *might* cause a segfault or access violation. Actually, HLA strings are guaranteed to be a multiple of four bytes long. They almost always begin on a dword boundary, too (not guaranteed, but for all HLA-compiler-generated strings this is always true). Of course, the stricmp routine I'm trying to write doesn't have to be fed HLA strings, but the truth is that end-users aren't going to call stricmp, they call routines like str.ieq or str.igt which, in turn, call stricmp, e.g., procedure str.igt( src1:string; src2:string ); @nodisplay; @noalignstack; begin igt; push( esi ); push( edi ); mov( src1, esi ); mov( src2, edi ); stricmp(); mov( 0, eax ); seta( al ); pop( edi ); pop( esi ); end igt; One of the main reasons I've not gone crazy trying out more than one byte at a time is because research (about 20 years old, now, so it may be slightly out of date) suggests that the average string found in a program is 10-15 characters long. With such small strings, tricks that process strings in bulk often wind up costing more than they save in the long run. That's one reason why, for example, using MMX or SSE for the operations is a complete waste of time (of course, cleaning up after an MMX operation is *expensive* too, not something you want in a general-purpose string comparison. I really should modify my tests to process a series of short strings, to provide a better cost comparison against strings you'd find in a typical program. It's rare that you get to run through 256 bytes, as my examples have done. Cheers, Randy Hyde
From: Randall Hyde on 8 Mar 2005 23:31 "Percival" <dragontamer5788(a)yahoo.com> wrote in message news:pan.2005.03.07.05.35.17.948947(a)yahoo.com... > On Mon, 07 Mar 2005 06:21:34 +0100, '\\o//'annabee wrote: > > > Pý Sun, 06 Mar 2005 23:39:02 GMT, skrev Randall Hyde > > <randyhyde(a)earthlink.net>: > > > >> Hi All, > > > > > > How does this compare ? Hope it is correct, as I was a bit tired when > > writing it, and havent really tested it fully. On my machine, it seems to > > run about twice as fast, in theese test, under saturated circumstances. > > (From Menu) Also, running on mediumlength normal strings, with a > > "outsticking" char here and there, its seems to run about twice as fast. > > > > But anyway.... I am far to tired to confirm if its correctly running. > > Later perhaps. ..... > > FasterCmp: > > Align 4 > > L0: mov dx wýeax | cmp dh 0 | je L8> | cmp dl 0 | je L9> | add ebx 2 | > > add eax 2 | cmp dx wýebx-2 | je L0< > > mov cx wýebx-2 > > and dx 0_5f5f > > and cx 0_5f5f > > add dx 02020 > > add cx 02020 > > cmp dx cx | je L0< > > ret > > L8: > > cmp dl býebx | jne L9> > > cmp býebx+1 0 > > L9: > > ret > > > > *bets Wannabee is kill filed* --- Forwarding message. Yes, indeed. Unfortunately, AFAICT, there is a minor problem with the "FasterCmp" routine. It doesn't explicitly check the characters to see if they are in the range "a".."z" or "A".."Z". Therefore, it thinks that characters like ` and @ are equivalent, '[' and '{' are equivalent, and ctrl-A maps to '!' (and so on). Nice try, though. I've tried to process two or four characters at a time, thus far I've not been successful. Particularly for shorter strings the overhead eats you alive. Cheers, Randy Hyde
From: Randall Hyde on 8 Mar 2005 23:35 "Octavio" <933191278(a)terra.es> wrote in message news:a3b1e869.0503071643.3653e1a4(a)posting.google.com... > "Randall Hyde" <randyhyde(a)earthlink.net> wrote in message news:<qwMWd.2584$oO4.1429(a)newsread3.news.pas.earthlink.net>... > > I'm also wondering about the results to return when the strings > > are not equal. In particular, the algorithm I've provided compares > > the original characters and sets the flags if those characters do > > not match. Therefore, 'a' > 'B'. I'm wondering if this is reasonable, > > or if I should return the comparisions of the "converted" characters > > (if they were uppercase)? I don't really care what *other* languages > > (e.g., C/C++) do, I'm interested in a discussion of what is reasonable > > to do here. > I think is reasonable to compare the converted characters, and also a > table lookup must be used to support chars like 'ýý'. This is just an ASCII string comparison routine, it doesn't need to support characters outside the range 0..$7F. > About the speedup, i think it must be done elsewere, the table lookup > can not be optimized very much. No doubt about that. If the table is not in cache, accessing the table kills you (and pollutes your cache, too). Granted, if you need to compare eight-bit character codes and they're not just ASCII values, a lookup table is probably the best way to go. Your comment did point out a defect, though. The AND instruction should be "and( $BFBF, ebx);" rather than $5F5F. No sense matching characters with their H.O. bits set if a non-ASCII character happens to come along. Cheers, Randy Hyde
From: Randall Hyde on 8 Mar 2005 23:37
"Frank Kotler" <fbkotler(a)comcast.net> wrote in message news:422CFD3F.5AA1F13D(a)comcast.net... > > You seem to be forcing characters to uppercase whether > they're alpha or not - this could result in a "false > positive" match. Exactly. You *must* compare the characters to see if they are in the range 'a'..'z' or 'A'..'Z'. I'm thinking there *might* be a way to remove a branch by subtracting 'A' and comparing against 26, but we'll see how that turns out. Cheers, Randy Hyde |