From: Branimir Maksimovic on
On Thu, 27 May 2010 22:50:22 -0400
Frank Kotler <fbkotler(a)myfairpoint.net> wrote:

>
> Groovy! What does it represent?
Bug. I reversed the order of the destination and source
operands (shrd) and forgot that the source operand is unchanged.
Nathan suggested that rdtsc returns its result in reverse order.
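
For anyone else tripped up by shrd, here is a rough C model of the 32-bit form. It is only a sketch: the function name shrd32 is made up for illustration, and it ignores flags and the out-of-range count cases. The point is that only the destination receives a result; the source just supplies the bits that are shifted in from the left.

```c
#include <stdint.h>

/* Hypothetical C model of 32-bit `shrd dst, src, count`.
 * dst is shifted right by count; the vacated high bits are filled
 * with the low bits of src. src is passed by value, mirroring the
 * fact that the instruction leaves the source register unchanged. */
uint32_t shrd32(uint32_t dst, uint32_t src, unsigned count)
{
    count &= 31;                 /* the CPU masks the count to 5 bits */
    if (count == 0)
        return dst;              /* a zero count changes nothing */
    return (dst >> count) | (src << (32 - count));
}
```

With dst = 0x12345678, src = 0xABCDEF01 and a count of 8, the low byte of src (01) lands in the top byte of the result.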


>
> Best,
> Frank
>

Greets!


--
http://maxa.homedns.org/

Sometimes online sometimes not

Everyone is "allowed" to be an idiot and
a dullard, but only some choose it,


From: Frank Kotler on
Branimir Maksimovic wrote:
> On Thu, 27 May 2010 22:50:22 -0400
> Frank Kotler <fbkotler(a)myfairpoint.net> wrote:
>
>> Groovy! What does it represent?
> Bug. I reversed order of destination,source
> operand (shrd) and forgot that source operand is unchanged.

I had to look it up - I thought it shifted both operands, too.

> Nathan suggested that rdtsc returns result in reverse order.

Naw, we're just printing the bytes in "reversed order" - high nybble of
edx first, low nybble of eax last. Since we "get" them al-first,
Nathan's setting the direction flag to start at the "end" of the buffer,
and work toward the "front".

Foolish of me to have printed all of edx:eax anyway. There's nothing in
edx unless we're timing an insanely slow piece of code! So much simpler
to do 32 bits!

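eax2hex:
        mov ebx, xtbl       ; ebx -> translate table of hex digits
        mov ecx, 8          ; 8 nybbles in eax
.top:
        rol eax, 4          ; rotate the next-highest nybble into the low 4 bits
        push eax            ; save the rotated value
        and al, 0Fh         ; isolate the nybble
        xlatb               ; al = [ebx + al]
        stosb               ; store al at [edi], step edi (forward if DF is clear)
        pop eax             ; restore the rotated value
        loop .top
        ret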

or so...
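
The same idea can be sketched in C for checking the digit order by hand. This is a model of the routine above, not a replacement for it: the contents of xtbl are an assumption (the table is never shown in the thread), and the parameter names simply mirror the registers.

```c
#include <stdint.h>

/* Assumed contents of xtbl; the thread never shows the table. */
static const char xtbl[] = "0123456789ABCDEF";

/* C model of eax2hex: writes 8 hex digits, most significant first.
 * edi plays the role of the destination pointer used by stosb. */
void eax2hex(uint32_t eax, char *edi)
{
    for (int ecx = 8; ecx != 0; ecx--) {
        eax = (eax << 4) | (eax >> 28);   /* rol eax, 4 */
        *edi++ = xtbl[eax & 0x0F];        /* and al, 0Fh / xlatb / stosb */
    }
}
```

Because each rotate brings the highest remaining nybble down to the low four bits, the digits come out in reading order with the direction flag clear, so no std is needed for the 32-bit version.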

While fiddling with this thing, I thought I'd put "xlatb" in the loop,
just to see "How slow *is* it?". Well, of course that segfaults without
ebx being set to valid memory. Duh. So I put "mov ebx, xtbl" first -
right after "entry $", well before the first rdtsc. Still segfaulted, of
course - cpuid trashes ebx. Duh. But I noticed that having that
instruction there changes the timing of an empty loop! Alignment issue?
I dunno. Looks like it - 5 nops changes the timing (of an empty loop), 4
or 6 do not...

Andy, are you still getting 340 cycles for any block of code?

I still feel there's something I don't "get" here!

(did I mention how impressed I am by the size of Fasm's executables?)

Best,
Frank


From: Rod Pemberton on
"Frank Kotler" <fbkotler(a)myfairpoint.net> wrote in message
news:hto6nn$soc$1(a)speranza.aioe.org...
> Naw, we're just printing the bytes in "reversed order" - high nybble of
> edx first, low nybble of eax last. Since we "get" them al-first,
> Nathan's setting the direction flag to start at the "end" of the buffer,
> and work toward the "front".
>

Sorry, I didn't look at the routine... Can't he just change the starting
position and direction of the print on the screen, instead of the order of
the byte processing via DF? I.e., print byte to the right, move left a
char, print next byte, etc.


Rod Pemberton


From: Branimir Maksimovic on
On Fri, 28 May 2010 06:42:20 -0400
Frank Kotler <fbkotler(a)myfairpoint.net> wrote:

> Branimir Maksimovic wrote:
> > On Thu, 27 May 2010 22:50:22 -0400
> > Frank Kotler <fbkotler(a)myfairpoint.net> wrote:
> >
> >> Groovy! What does it represent?
> > Bug. I reversed order of destination,source
> > operand (shrd) and forgot that source operand is unchanged.
>
> I had to look it up - I thought it shifted both operands, too.
>
> > Nathan suggested that rdtsc returns result in reverse order.
>
> Naw, we're just printing the bytes in "reversed order" - high nybble
> of edx first, low nybble of eax last. Since we "get" them al-first,
> Nathan's setting the direction flag to start at the "end" of the
> buffer, and work toward the "front".

I see now.

>
> Foolish of me to have printed all of edx:eax anyway. There's nothing
> in edx unless we're timing an insanely slow piece of code!

Well, you have to watch for correctness. You never know when someone
will time a slow piece of code!
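
A small C sketch of the arithmetic, assuming the value being printed is the difference between two rdtsc readings. The helper names are invented for illustration. It joins edx:eax into one 64-bit count, subtracts, and checks whether the high half of the result is empty.

```c
#include <stdint.h>

/* Join the two halves that rdtsc returns (edx = high, eax = low). */
uint64_t tsc_join(uint32_t edx, uint32_t eax)
{
    return ((uint64_t)edx << 32) | eax;
}

/* Elapsed cycles between two joined readings. */
uint64_t tsc_delta(uint64_t start, uint64_t end)
{
    return end - start;
}

/* Nonzero if the delta can be printed from the low 32 bits alone. */
int fits_in_32_bits(uint64_t delta)
{
    return (delta >> 32) == 0;
}
```

Doing the subtraction in 64 bits also covers the case where the low half wraps between the two readings, even though the resulting difference is small.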


> So much simpler to do 32 bits!
>
> eax2hex:
> mov ebx, xtbl
> mov ecx, 8
> .top:
> rol eax, 4
> push eax
> and al, 0Fh
> xlatb
> stosb
> pop eax
> loop .top
> ret
>
> or so...
>
> While fiddling with this thing, I thought I'd put "xlatb" in the
> loop, just to see "How slow *is* it?".

It's just a fancy mov operation limited to a 256-byte table (probably
because of speed).
You can do the same with an indexed mov: movzx eax, al and then
mov al, byte [ebx+eax]
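
In C terms, xlatb is nothing more than a byte-indexed table read. A tiny sketch, with the function named after the instruction purely for illustration:

```c
#include <stdint.h>

/* C model of xlatb: al = [ebx + al].
 * The index is an 8-bit value, which is where the 256-entry limit
 * on the table comes from. */
uint8_t xlatb(const uint8_t *ebx, uint8_t al)
{
    return ebx[al];
}
```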

> Well, of course that segfaults without ebx being set to valid
> memory.

All memory is valid. A segfault is there to warn about an addressing
error (easier debugging). I remember the days without segfaults, that
was fun!


> Duh. So I put "mov ebx, xtbl" first - right after "entry $", well
> before the first rdtsc.

Why?

> Still segfaulted, of course - cpuid trashes ebx. Duh.

;)

> But I noticed that having that instruction there changes the timing
> of an empty loop! Alignment issue? I dunno. Looks like it - 5 nops
> changes the timing (of an empty loop), 4 or 6 do not...

?
Show me the code. I don't know what you are talking about.

>
> Andy, are you still getting 340 cycles for any block of code?
?
>
> I still feel there's something I don't "get" here!
>
> (did I mention how impressed I am by the size of Fasm's executables?)

fasm is made for hackers.

>
> Best,
> Frank

Greets!



From: Nathan on
On May 28, 3:58 am, Branimir Maksimovic <bm...(a)hotmail.com> wrote:
>
> It goes in reverse order?
>

The 'std' is there to cause 'stosb' to decrement EDI because we want
to fill the right-most part of that string first -- that is where the
lower-order digits go.
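
A rough C picture of that right-to-left fill, for readers who have not seen the routine. The function name and the digit table are assumptions for illustration, not the posted code. The index counting down stands in for stosb decrementing EDI while the direction flag is set.

```c
#include <stdint.h>

/* Hypothetical model of filling a 16-character hex field from the
 * right: the lowest nybble is produced first and stored last in
 * the string, and the write position moves toward the front. */
void hex64_backward(uint64_t value, char buf[16])
{
    static const char digits[] = "0123456789ABCDEF";
    for (int i = 15; i >= 0; i--) {       /* like stosb with DF set */
        buf[i] = digits[value & 0x0F];    /* low nybble first */
        value >>= 4;                      /* expose the next nybble */
    }
}
```

Passing 0x0123456789ABCDEF fills the buffer with the digits in normal reading order, even though the F is written first.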

Nathan.