NASM 0.98.39 vs. NASM 2.03.01 disassembly [ASM]

From: Rod Pemberton on 6 Sep 2008 13:11

"Rod Pemberton" <do_not_have(a)nohavenot.cmm> wrote in message
news:g9udeh$482$1(a)aioe.org...
>
> "Chuck Crayne" <ccrayne(a)crayne.org> wrote in message
> news:20080905152743.0cc0e2e3(a)thor.crayne.org...
> On Fri, 5 Sep 2008 13:32:42 -0400
> "Rod Pemberton" <do_not_have(a)nohavenot.cmm> wrote:
>
> > > > Nor does the concept of "default register size" apply to the source
> > > > register.
> > >
> > > > Why not? It should apply to all registers for the mode.
> >
> > What you call the "default register size" is nothing more than a bit in
> > the current cs descriptor, which must be tested by the cpu logic for
> > those parts of an instruction which have both 16 and 32 bit forms. In
> > the case of the source register for lsl, there is no need to test this
> > bit, because (as the manual makes quite clear) the cpu always uses 16
> > bits.
>
> While there is no need to test the bit, not doing so requires extra logic
> be implemented for just a few instructions. Why would they use more
> gates/logic just to force 16-bits onto the bus for a few instructions,
> instead of just placing the size selected by the descriptor (and
> overrides) onto the bus? I.e., they'd have to implement logic to hardcode
> an implicit and non-overridable size override for 32-bit mode. I can't
> see why they'd waste gates on this when a simpler more generic solution
> works.
>
> > > A
> > > selector is just a numerical index, i.e., n*8, n=0,1,2,3..., into the
> > > descriptor table.
> >
> > I quote (i.e. cut and paste) from the manual:
> >
>
> Yes, I said sorry to you in the reply to HK.

WK... sigh. :)

> I haven't used the lower bits
> in my OS. And, it seems that while I read the descriptor section
> thoroughly some time ago, I ignored the selector section for some unknown
> reason...
>
>

RP

From: Chuck Crayne on 6 Sep 2008 18:49

On Sat, 6 Sep 2008 13:07:57 -0400
"Rod Pemberton" <do_not_have(a)nohavenot.cmm> wrote:

> I can't see why they'd
> waste gates on this when a simpler more generic solution works.

In the case of the source register, the only reasonable design is to
ignore the default mode bit and overrides, and always gate the
full-width output of the physical register onto the data bus.

Consider, for example, mov ax,bx on a 32-bit machine. There is
no benefit to gate only the low order 16 bits of ebx, because every
data line on the bus will always have some value, and the only way to
preserve the high order bits of eax is to open its input gates only for
the lower half.
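
For illustration only, the idea can be modeled in a few lines of Python. This is a software sketch of the architectural effect, not a description of the actual 80386 circuitry, and the function name write_low16 is invented here: the full 32-bit source value is presented, and only the destination's write mask decides which bits are kept.

```python
def write_low16(dest32: int, src32: int) -> int:
    """Model of mov ax,bx: the whole of ebx is 'on the bus', but only the
    low 16 bits of the destination are opened for writing."""
    high_half_kept = dest32 & 0xFFFF0000   # upper half of eax is untouched
    low_half_taken = src32 & 0x0000FFFF    # lower half comes from ebx
    return high_half_kept | low_half_taken


eax = 0x12345678
ebx = 0xAABBCCDD
eax = write_low16(eax, ebx)
print(hex(eax))  # 0x1234ccdd
```

The upper half of ebx never needs to be masked at the source; it is simply ignored at the destination.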

So, the "simpler more generic solution" is to do the register
length checking only for the destination register. However, in the case
of lsl, there is no single destination to which the source register is
moved. Each of the sub-fields in the source register has a different
destination and a different fixed size -- none of which are either 16
or 32 bits.

Nor are these sub-fields all used at the same time. If you look at the
timing charts for the 80386, you will see that lsl requires a minimum
of twenty machine cycles -- which is a full order of magnitude greater
than the two cycles required for a move.

This means that 90% of lsl's execution time is spent in logic which is
not shared with the general data transfer or arithmetic instructions.
Only when it is time to store the segment limit value in the
destination register does it enter the common logic, and only then is
the default length checked.

--
Chuck
http://www.pacificsites.com/~ccrayne/charles.html

From: Rod Pemberton on 6 Sep 2008 20:57

"Chuck Crayne" <ccrayne(a)crayne.org> wrote in message
news:20080906154928.41ac443f(a)thor.crayne.org...
> On Sat, 6 Sep 2008 13:07:57 -0400
> "Rod Pemberton" <do_not_have(a)nohavenot.cmm> wrote:
>
> > I can't see why they'd
> > waste gates on this when a simpler more generic solution works.
>
> In the case of the source register, the only reasonable design is to
> ignore the default mode bit and overrides, and always gate the
> full-width output of the physical register onto the data bus.
>

If only 16 bits of the source register are needed for a selector, the reasonable
design is to let the mode size and override logic work normally:

"It" will have 16-bits in 16-bit mode on the data bus from the source
register without an override.
"It" will have at least 16-bits in 16-bit mode on the data bus from the
source register with an override.
"It" will have at least 16-bits in 32-bit mode on the data bus from the
source register without an override.
"It" will have 16-bits in 32-bit mode on the data bus from the source
register with an override.

"It" being whatever logic receives the selector. "It" will only need to
latch 16-bits.
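
The four cases above can be tabulated with a small Python sketch. It assumes the usual x86 rule that an operand-size override toggles between the 16-bit and 32-bit operand size selected by the code segment descriptor. The function name is invented for illustration, and the sketch models only the architectural operand size, not what is physically driven onto any internal bus.

```python
def effective_operand_size(default_bits: int, override: bool) -> int:
    """Return the operand size in bits for a given default mode size and
    the presence or absence of an operand-size override."""
    if default_bits not in (16, 32):
        raise ValueError("default_bits must be 16 or 32")
    if override:
        # The override selects the other size: 16 becomes 32, 32 becomes 16.
        return 32 if default_bits == 16 else 16
    return default_bits


for mode in (16, 32):
    for override in (False, True):
        size = effective_operand_size(mode, override)
        print(f"{mode}-bit mode, override={override}: {size} bits")
```

In every case the result is at least 16 bits, which is all that a selector needs.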

> Consider, for example, mov ax,bx on a 32-bit machine. There is
> no benefit to gate only the low order 16 bits of ebx, because every
> data line on the bus will always have some value, and the only way to
> preserve the high order bits of eax is to open its input gates only for
> the lower half.

While this example may be true for "mov ax,bx", as you indicate below, the
logic for lsl is more complicated and not simply a direct transfer as in the
example. Therefore, I think using this example as justification for the
next paragraph is invalid.

Since this example you presented may be true, I'm glad you finally agree
that the "effective" source and destination sizes should agree... Even
though 32-bits of ebx is presented to the bus by the logic, effectively only
16-bits of ebx, i.e., bx, are transferred to ax. The same would hold for
32 bits, i.e., source size = destination size.

> So, the "simpler more generic solution"

Debatable...

> is to do the register
> length checking only for the destination register.

Yes, the result of the lsl operation must match the destination register
size.

> However, in the case
> of lsl, there is no single destination to which the source register is
> moved. Each of the sub-fields in the source register has a different
> destination and a different fixed size -- none of which are either 16
> or 32 bits.
>
> Nor are these sub-fields all used at the same time. If you look at the
> timing charts for the 80386, you will see that lsl requires a minimum
> of twenty machine cycles -- which is a full order of magnitude greater
> than the two cycles required for a move.
>
> This means that 90% of lsl's execution time is spent in logic which is
> not shared with the general data transfer or arithmetic instructions.

Plausible explanation.

> Only when it is time to store the segment limit value in the
> destination register does it enter the common logic, and only then is
> the default length checked.

I thought you were insisting on Ew for the destination register... Or, are
we ignoring the default mode bits and overrides because it's the only
reasonable design?

Have you noticed that some instructions historically listed as Ew are now
Rv/Mw in newer manuals? They must've missed a couple... ;-)

Rod Pemberton

From: NathanCBaker on 6 Sep 2008 23:33

On Aug 30, 9:25 pm, "JC" <jcarl...(a)127.0.0.1> wrote:
> "Frank Kotler" <fbkot...(a)verizon.net> wrote...
>
> Just to let everyone know, Agner updated the objconv utility today,
> Saturday, 2008-08-30.
>

Just to let everyone know, there is a new [ well, actually it says
"updated 21-Aug-2008" but I was distracted by all the elaborate
advertisement / movie / cultural event / news / docu-drama designed to
convince me which brand ( Democrat or Republican ) of jeans I should
purchase ] version of OllyDbg:

http://www.ollydbg.de/

Nathan.

From: Chuck Crayne on 7 Sep 2008 00:21

On Sat, 6 Sep 2008 20:57:10 -0400
"Rod Pemberton" <do_not_have(a)nohavenot.cmm> wrote:

> "It" being whatever logic receives the selector. "It" will only need
> to latch 16-bits.

But, in fact, there is no "It" which receives 16 bits. "It.index"
receives 13 bits; "It.ti" receives 1 bit; and "It.rpl" receives 2 bits.
Furthermore, these "receives" do not happen in the same clock cycle.
"It.rpl" must first be compared to cpl, before anything else is allowed
to happen. Next, "It.ti" must be used to fetch the address of the
selected table; and only then can "It.index" be gated into one side of
the adder. Nor is there any obvious reason for any of these "It"s to latch
any bits at all, since they are already latched in the source register.
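
As a rough illustration of the three sub-fields, here is a Python sketch that splits a 16-bit selector into its 13-bit index, 1-bit table indicator, and 2-bit requested privilege level. The function name and the sample value 0x002B are made up for this example; only the field widths come from the discussion above.

```python
def split_selector(selector: int) -> dict:
    """Split a 16-bit segment selector into index, TI, and RPL fields."""
    selector &= 0xFFFF                      # a selector is 16 bits wide
    return {
        "index": selector >> 3,             # bits 15..3: descriptor index
        "ti": (selector >> 2) & 0x1,        # bit 2: 0 = GDT, 1 = LDT
        "rpl": selector & 0x3,              # bits 1..0: requested privilege
    }


fields = split_selector(0x002B)
print(fields)                 # {'index': 5, 'ti': 0, 'rpl': 3}
print(fields["index"] * 8)    # 40, the byte offset of the descriptor
```

Multiplying the index by 8 gives the byte offset into the selected table, which is the "n*8" view of a selector once the low three bits are set aside.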

> I thought you were insisting on Ew for the destination register...

Not at all. The manual makes it quite clear that the destination
register can be 16, 32, or 64 bits.

Taking a step backwards from my logic design tutorial, the point I am
trying to make is that your attempt to build a compelling logical chain
from how you think the hardware works to what syntax the NASM disassembler
should use is unconvincing.

--
Chuck
http://www.pacificsites.com/~ccrayne/charles.html
