From: Branimir Maksimovic on
On Fri, 28 May 2010 11:33:05 -0700 (PDT)
Nathan <nathancbaker(a)gmail.com> wrote:

> On May 28, 3:58 am, Branimir Maksimovic <bm...(a)hotmail.com> wrote:
> >
> > It goes in reverse order?
> >
>
> The 'std' is there to cause 'stosb' to decrement EDI because we want
> to fill the right-most part of that string first -- that is where the
> lower-order digits go.
>
> Nathan.

Thanks!

Branimir.

--
http://maxa.homedns.org/

Sometimes online sometimes not

Everyone is "allowed" to be an idiot and
in the dark, but only some choose it,


From: Frank Kotler on
Rod Pemberton wrote:
> "Frank Kotler" <fbkotler(a)myfairpoint.net> wrote in message
> news:hto6nn$soc$1(a)speranza.aioe.org...
>> Naw, we're just printing the bytes in "reversed order" - high nybble of
>> edx first, low nybble of eax last. Since we "get" them al-first,
>> Nathan's setting the direction flag to start at the "end" of the buffer,
>> and work toward the "front".
>>
>
> Sorry, I didn't look at the routine... Can't he just change the starting
> position and direction of the print on the screen, instead of the order of
> the byte processing via DF? I.e., print byte to the right, move left a
> char, print next byte, etc.

I guess you could do that. How would you do it? Print backspace twice?
Or move the cursor with a bios int? (FWIW, printing 8 doesn't seem to
backspace in Linux - use VT100 sequences, I guess...)

Another way to deal with the "bytes in wrong order" problem is to store
'em in the buffer "as they come" and do a "revstring" on it before
printing. Or push 'em on the stack and pop 'em off in the "right" order.
I thought Nathan's approach of using std/cld worked out pretty well...
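
A minimal sketch of the push/pop idea, assuming 32-bit Linux and fasm as in the rest of this thread. The names u64tohex_stack, hexdigits and outbuf are made up for illustration and are not from anyone's posted code. The digits are produced low nybble first, pushed, and then popped so they land in the buffer high digit first, with no std needed.

```asm
; fasm stack.asm   (32-bit Linux, illustrative names throughout)
format ELF executable
segment readable writeable executable
entry start

start:
        mov     edx, 0x01234567         ; sample high dword
        mov     eax, 0x89ABCDEF         ; sample low dword
        mov     edi, outbuf
        call    u64tohex_stack

        mov     eax, 4                  ; sys_write
        mov     ebx, 1                  ; stdout
        mov     ecx, outbuf
        mov     edx, 17                 ; 16 digits + newline
        int     80h

        mov     eax, 1                  ; sys_exit
        xor     ebx, ebx
        int     80h

; in:  edx:eax = value, edi -> 16-byte buffer
; clobbers eax, ebx, ecx, edx; edi ends up just past the last digit
u64tohex_stack:
        mov     ecx, 16
.push_digit:
        mov     ebx, eax
        and     ebx, 0Fh                ; isolate the low nybble
        movzx   ebx, byte [hexdigits + ebx]
        push    ebx                     ; lowest digit goes on the stack first
        shrd    eax, edx, 4             ; edx:eax >>= 4
        shr     edx, 4
        dec     ecx
        jnz     .push_digit

        cld                             ; stosb must count upward here
        mov     ecx, 16
.pop_digit:
        pop     eax                     ; highest digit comes off first
        stosb
        dec     ecx
        jnz     .pop_digit
        ret

hexdigits db '0123456789ABCDEF'
outbuf    db 17 dup (0xa)               ; pre-filled so byte 17 is the newline
```

The cost is 16 dword pushes (64 bytes of stack) per call, which is probably why the std/cld version feels tidier.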

Pity we can't "rold eax, edx, 4"... which would rotate the high nybble
of edx into the low nybble of eax... if pigs had wings...
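
For what it's worth, shld seems to come close to that wished-for instruction: "shld eax, edx, 4" shifts eax left by four and fills its low nybble from the high nybble of edx (edx itself is left unchanged). Two of them plus a saved copy give a 64-bit rotate left by 4. The sketch below assumes 32-bit Linux and fasm; u64tohex_rot, hexdigits and outbuf are hypothetical names. It emits the high digit first, so the buffer is filled front to back.

```asm
; fasm rot.asm   (32-bit Linux, illustrative names throughout)
format ELF executable
segment readable writeable executable
entry start

start:
        mov     edx, 0x01234567         ; sample high dword
        mov     eax, 0x89ABCDEF         ; sample low dword
        mov     edi, outbuf
        call    u64tohex_rot

        mov     eax, 4                  ; sys_write
        mov     ebx, 1                  ; stdout
        mov     ecx, outbuf
        mov     edx, 17                 ; 16 digits + newline
        int     80h

        mov     eax, 1                  ; sys_exit
        xor     ebx, ebx
        int     80h

; in:  edx:eax = value, edi -> 16-byte buffer
; out: edx:eax unchanged (16 rotates by 4 bring it back around)
; clobbers ebx, ecx; edi ends up just past the last digit
u64tohex_rot:
        mov     ecx, 16
.next:
        mov     ebx, edx                ; keep the old high dword
        shld    edx, eax, 4             ; edx = edx<<4 | top nybble of eax
        shld    eax, ebx, 4             ; eax = eax<<4 | top nybble of old edx
        mov     ebx, eax
        and     ebx, 0Fh                ; that nybble now sits at the bottom of eax
        mov     bl, [hexdigits + ebx]
        mov     [edi], bl
        inc     edi
        dec     ecx
        jnz     .next
        ret

hexdigits db '0123456789ABCDEF'
outbuf    db 17 dup (0xa)               ; pre-filled so byte 17 is the newline
```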

Best,
Frank

From: Rod Pemberton on
"Frank Kotler" <fbkotler(a)myfairpoint.net> wrote in message
news:htt678$m3i$1(a)speranza.aioe.org...
> Rod Pemberton wrote:
> > "Frank Kotler" <fbkotler(a)myfairpoint.net> wrote in message
> > news:hto6nn$soc$1(a)speranza.aioe.org...
> >> Naw, we're just printing the bytes in "reversed order" - high nybble of
> >> edx first, low nybble of eax last. Since we "get" them al-first,
> >> Nathan's setting the direction flag to start at the "end" of the buffer,
> >> and work toward the "front".
> >>
> >
> > Sorry, I didn't look at the routine... Can't he just change the starting
> > position and direction of the print on the screen, instead of the order of
> > the byte processing via DF? I.e., print byte to the right, move left a
> > char, print next byte, etc.
>
> I guess you could do that. How would you do it?

Me? Direct write to the PC's text screen (mode 3), unless an OS is in my
way.

For 16-bit, ES set to b800h, BX set to screen location, using ES as override
on mov byte to BX. Bust word into nybbles in AL and AH.

For 32-bit, EBX set to screen location, lea to add in B8000h to ebx, mov
byte to EBX. Bust dword into nybbles in AL, AH, CL, CH.

Inc/dec BX or EBX as needed. Since these are byte-sized moves, you can also
write the color (attribute) bytes in between.
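
A rough 16-bit sketch of that direct-write idea, assuming plain DOS (or an emulator), text mode 3 already active, and fasm's default flat binary output built as a .COM file. The sample value, the screen position and the label names are arbitrary choices for illustration, and the word is peeled one nybble at a time from DX rather than split across AL and AH. It writes the right-most digit first and steps left one cell (two bytes) per digit, leaving the attribute bytes alone.

```asm
; fasm hexscr.asm hexscr.com   (16-bit DOS, text mode 3 assumed)
org 100h

        mov     ax, 0B800h
        mov     es, ax                  ; es -> colour text buffer
        mov     bx, 2*3                 ; cell 3 on the top row: right-most digit
        mov     dx, 1A2Fh               ; sample word to show
        mov     cx, 4                   ; four nybbles in a word
next_digit:
        mov     al, dl
        and     al, 0Fh                 ; low nybble
        cmp     al, 10
        jb      is_decimal
        add     al, 7                   ; bridge the gap between '9' and 'A'
is_decimal:
        add     al, '0'
        mov     [es:bx], al             ; character byte only, attribute untouched
        sub     bx, 2                   ; one cell to the left
        shr     dx, 4                   ; next nybble (needs a 186 or later)
        dec     cx
        jnz     next_digit

        mov     ax, 4C00h               ; DOS: terminate
        int     21h
```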

> Another way to deal with the "bytes in wrong order" problem is to store
> 'em in the buffer "as they come" and do a "revstring" on it before
> printing.

That works.

> Or push 'em on the stack and pop 'em off in the "right" order.

That works too. And you don't need a string routine. DAS and DAA are not
available in 64-bit mode. Did you guys already consider BSWAP or XCHG to
reverse the byte order?

> I thought Nathan's approach of using std/cld worked out pretty well...
>

I vaguely recall some sort of issue with using rep movs in reverse order...
RBIL bugslist?


Rod Pemberton


From: Frank Kotler on
Branimir Maksimovic wrote:

....
>> While fiddling with this thing, I thought I'd put "xlatb" in the
>> loop, just to see "How slow *is* it?".
>
> It's just a fancy mov operation limited to a 256-byte table (probably
> because of speed).
> You can do the same with mov eax,dword[eax]

More like "mov al, [table + eax]", no?
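
To make the comparison concrete: xlatb loads al from [ebx + al], with al zero-extended, so the plain mov form only matches when the upper 24 bits of eax are clear. A small check, assuming 32-bit Linux and fasm; the program exits with status 0 if both lookups agree (the table name xtbl is borrowed from the listing in this thread, everything else is illustrative).

```asm
; fasm xlat.asm   (32-bit Linux; check the result with "echo $?")
format ELF executable
segment readable executable
entry start

start:
        mov     ebx, xtbl
        mov     al, 0Bh
        xlatb                           ; al = byte [ebx + al], al zero-extended
        mov     dl, al                  ; keep the xlatb result

        mov     eax, 0Bh                ; index must fill the whole register here
        mov     al, [xtbl + eax]        ; same lookup with an ordinary mov

        xor     ebx, ebx
        cmp     al, dl
        setne   bl                      ; exit status 0 if both agree, 1 otherwise
        mov     eax, 1                  ; sys_exit
        int     80h

xtbl db '0123456789ABCDEF'
```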

>> Well, of course that segfaults
>> without ebx being set to valid memory.
>
> All memory is valid.

Matter of opinion. My OS says no. :)

> segfault is there to warn on addressing error
> (easier debugging). I remember days without segfaults, that was fun!

Agreed! Of course we can still hit "valid" memory, but not the memory we
had in mind. Then we're back to "interesting" debugging. :)

>> Duh. So I put "mov ebx, xtbl"
>> first - right after "entry $", well before the first rdtsc.
>
> Why?

So it wouldn't segfault! Didn't work, of course...

>> Still
>> segfaulted, of course - cpuid trashes ebx. Duh.
>
> ;)

Apparently "genuine intel" is not a "valid" address. :)

>> But I noticed that
>> having that instruction there changes the timing of an empty loop!
>> Alignment issue? I dunno. Looks like it - 5 nops changes the timing
>> (of an empty loop), 4 or 6 do not...
>
> ?
> Show me code. I don't know what you are talking about.

Okay... It's *your* code (with Nathan's changes)...

; fasm myprog.asm
;
; from Branimir Maksimovic
; bugfixes from Nathan Baker
; cruft from fbk :)

format ELF executable

segment writeable executable

entry $

; five bytes here changes the timing
;mov ebx, xtbl

;nop
;nop
;nop
;nop
;nop
;nop ; six bytes changes it back

mov ecx,16
l1:
push ecx

; serialize CPU and get start time
cpuid
rdtsc
push edx
push eax

; code to be timed
;--------------
;das
;push eax
;pop eax
;push eax
;pop eax
;--------------

; serialize cpu and get end time
cpuid
rdtsc

; calculate difference (64-bit: end time minus start time)
pop ebx
sub eax, ebx
pop ecx
sbb edx, ecx ; sbb, not sub, so a borrow from the low dword carries over

; convert number to text
mov edi, ascbuf
call u64toha

; print it
mov ecx, ascbuf
mov edx, 17
mov ebx, 1
mov eax, 4
int 80h

; do more
pop ecx
loop l1

exit:
mov eax, 1
mov ebx,0
int 80h

xtbl db 30h,31h,32h,33h,34h,35h,36h,37h,38h,39h,41h,42h, \
43h,44h,45h,46h

; I changed the name of this - 'd' implied "decimal"...
u64toha:
add edi, 15
mov ebx,xtbl
mov cl, 16
std
l2:
mov ch,al
and al,0xf
xlatb
stosb
mov al,ch
; shrd edx,eax,4
shrd eax,edx,4
shr edx, 4
dec cl
jz e1
; mov byte[edi], ','
; inc edi
jmp l2

e1:
cld
ret

ascbuf db 17 dup (0xa)
;---------------------------

My output from this is "21C" (with a bunch of zeros in front). With the
"five byte padding" uncommented, it goes to "220". All we're "timing" is
push edx/push eax/cpuid... is cpuid sensitive to alignment??? I would
expect that if five bytes changes it, one byte would, too - but it
doesn't (your mileage may vary)...

That first output you posted - varying between "1" and "A" - was that
for an empty loop, or was that with "das" in there? I'm getting
consistent results (if sometimes puzzling) for all 16 iterations... with
anything but "das"...

Best,
Frank

From: Frank Kotler on
Rod Pemberton wrote:

....
>>> Sorry, I didn't look at the routine... Can't he just change the starting
>>> position and direction of the print on the screen, instead of the order of
>>> the byte processing via DF? I.e., print byte to the right, move left a
>>> char, print next byte, etc.
>> I guess you could do that. How would you do it?
>
> Me? Direct write to the PC's text screen (mode 3), unless an OS is in my
> way.

You guys with no OS in the way have got all the luck! :)

In Linux, we could open /dev/vcsa0 (?) and write char, attribute to it.
Kinda similar to "direct to screen". That would probably work.

> Did you guys already consider BSWAP or XCHG to reverse the byte
> order?

I didn't. That's an interesting idea. I've used xchg eax, edx, call a
32-bit-to-hex routine, xchg again, call again. (this assumes the routine
doesn't trash edx... not safe for C!)
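
Roughly what that looks like, as a sketch assuming 32-bit Linux and fasm. The routine u32tohex and the names hexdigits and outbuf are hypothetical. The routine rotates eax left a nybble at a time, so it hands eax back unchanged and never touches edx, which is what lets the two calls share one buffer pointer.

```asm
; fasm xchg.asm   (32-bit Linux, illustrative names throughout)
format ELF executable
segment readable writeable executable
entry start

start:
        mov     edx, 0x01234567         ; sample high dword
        mov     eax, 0x89ABCDEF         ; sample low dword
        mov     edi, outbuf

        xchg    eax, edx                ; high dword first
        call    u32tohex
        xchg    eax, edx                ; now the low dword
        call    u32tohex

        mov     eax, 4                  ; sys_write
        mov     ebx, 1                  ; stdout
        mov     ecx, outbuf
        mov     edx, 17                 ; 16 digits + newline
        int     80h

        mov     eax, 1                  ; sys_exit
        xor     ebx, ebx
        int     80h

; in:  eax = value, edi -> buffer; writes 8 digits, high digit first
; out: eax unchanged, edi advanced by 8; clobbers ebx, ecx; edx untouched
u32tohex:
        mov     ecx, 8
.next:
        rol     eax, 4                  ; bring the top nybble down to the bottom
        mov     ebx, eax
        and     ebx, 0Fh
        mov     bl, [hexdigits + ebx]
        mov     [edi], bl
        inc     edi
        dec     ecx
        jnz     .next
        ret

hexdigits db '0123456789ABCDEF'
outbuf    db 17 dup (0xa)               ; pre-filled so byte 17 is the newline
```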

>> I thought Nathan's approach of using std/cld worked out pretty well...
>
> I vaguely recall some sort of issue with using rep movs in reverse order...
> RBIL bugslist?

Yeah... too lazy to look it up right now... if an interrupt occurs, the
prefix can be "skipped" on return from the interrupt, or something?
Shouldn't bother us here, I don't think...

Best,
Frank