NASM 0.98.39 vs. NASM 2.03.01 disassembly [ASM]

From: Rod Pemberton on 7 Sep 2008 15:39

"Chuck Crayne" <ccrayne(a)crayne.org> wrote in message
news:20080906212136.47e9f787(a)thor.crayne.org...
> On Sat, 6 Sep 2008 20:57:10 -0400
> "Rod Pemberton" <do_not_have(a)nohavenot.cmm> wrote:
>
> > "It" being whatever logic receives the selector. "It" will only need
> > to latch 16-bits.
>
> But, in fact, there is no "It" which receives 16 bits.

"In fact", how do you know this is fact? Is one of Intel's cpu schematics
available online or by book? Or, are you basing this on alternate cpu
designs, perhaps discrete 74LSxx or bit-slice logic or a Byte magazine
article, that you're familiar with?

> "It.index"
> receives 13 bits; "It.ti" receives 1 bit; and "It.rpl" receives 2 bits.

The selector data as bits in the register must be connected to logic
somewhere which, in all likelihood or at least IMO, are data lines coming
off of some internal latch... not directly connected to the source register.

What you're stating and/or implying is that:
1) the register itself is placed onto the alu, shifter etc. data bus
2) the control lines for "It.ti" and "It.rpl" etc. are then directly
connected to the bus
3) the register remains on the bus for the, what'd you say, 20 machine
cycles, while "It.ti" and "It.rpl" get around to accessing the "held
forever" data on the bus

Poor design?... You effectively block all use of the cpu's central data bus
that the register is connected to. Isn't this the central bus connected to
the accumulator, shifter, and other computational logic? My guess is that
it is - at least in earlier cpu designs which didn't have additional data
busses.

If you can believe that they implemented additional logic to bypass the data
size and override functionality for just these two instructions, I don't
understand why you can't believe that they also used additional logic to
hold the entire selector for various Group 6/7, and MOVSX/ZX instructions in
order to "free" the main data bus for reuse.

> Furthermore, these "receives" do not happen in the same clock cycle.

How do you know this? How do you know there isn't a single primary
"receive" of the selector from the register which then is "broken down" into
multiple secondary "receives"? E.g., latched selector from the register,
data lines for sub-elements of selector coming off of latch.

> Furthermore, these "receives" do not happen in the same clock cycle.

To me, it's likely a single internal latch/register saved the selector for
use when "It.ti" and "It.rpl" need them. Did they design it that way? I
have no idea. I'd hope they did since the data bus off of the registers,
alu, shifters, etc. is likely to be needed for intermediate data used in the
computation of the segment limit. I.e., how can they hold the selector on
the data bus for 20 cycles and still use that bus to perform the operations
needed by those instructions? Without multiple data busses, I don't see how
they can.

> Taking a step backwards from my logic design tutorial, the point I am
> trying to make is that your attempt to build a compelling logical chain
> from how you think the hardware works to what syntax the NASM debugger
> should use is unconvincing.

The syntax needs to fit the mode for a number of reasons:

1) LAR/LSL manual descriptions have consistently (twenty+ years) indicated
there should be mode appropriate syntax unlike a small number of other
instructions: Group 6/7, MOVZX/SX...
2) it seems likely that the Ew isn't correct for a register source, likely a
'v', - as now indicated on other similar instructions
3) preservation of mode correct register size on conversion of disassembly
into assembly, e.g., for feedback into NASM or other assemblers
4) use of 32-bit and 8-bit registers in 32-bit code when NASM has no
function/macro to convert one register to sub-registers or "parent" register
(e.g., eax to al, or al to ax) for use with instructions such as MOVZX, LAR,
etc.

Rod Pemberton

From: Chuck Crayne on 7 Sep 2008 17:34

On Sun, 7 Sep 2008 15:39:20 -0400
"Rod Pemberton" <do_not_have(a)nohavenot.cmm> wrote:

> Or, are you basing this on alternate cpu
> designs, perhaps discrete 74LSxx or bit-slice logic or a Byte magazine
> article, that you're familiar with?

Actually, I'm basing it on the fact that I used to make my living
optimizing cpu gates. I do not know for certain what my counterparts at
Intel did, but I do know what is reasonable -- and what is not.

> The selector data as bits in the register must be connected to logic
> somewhere which, in all likelihood or at least IMO, are data lines
> coming off of some internal latch... not directly connected to the
> source register.

Why invent some redundant register, when the data is already in a
register?

> What you're stating and/or implying is that:
> 1) the register itself is placed onto the alu, shifter etc. data bus
The output of the register is gated to a data bus. The bus, in turn, can
be gated to the inputs of other logic elements.

> 2) the control lines for "It.ti" and "It.rpl" etc. are then directly
> connected to the bus
No. It is a data bus, not a control line bus. One could stretch a
point, and call the ti bit a control signal, but the rpl bits are data,
and must be gated into an adder along with the cpl bits.
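
For reference, the field layout being argued about can be written out in a
few lines. This is only a sketch of the documented x86 selector format (RPL
in bits 0-1, TI in bit 2, index in bits 3-15, 8-byte descriptors); it says
nothing about how Intel's gates are actually wired. The function names and
the example selector 0x002B are made up for illustration.

```python
def split_selector(selector):
    """Split a 16-bit selector into (index, ti, rpl)."""
    selector &= 0xFFFF
    rpl = selector & 0x3           # bits 0-1: requested privilege level
    ti = (selector >> 2) & 0x1     # bit 2: 0 = GDT, 1 = LDT
    index = selector >> 3          # bits 3-15: descriptor index
    return index, ti, rpl


def descriptor_address(table_base, selector):
    """Index feeds an add: table base plus index times 8 bytes."""
    index, _ti, _rpl = split_selector(selector)
    return table_base + index * 8


def effective_privilege(cpl, selector):
    """RPL is combined with CPL: the numerically larger one is used."""
    _index, _ti, rpl = split_selector(selector)
    return max(cpl, rpl)


if __name__ == "__main__":
    print(split_selector(0x002B))
    print(hex(descriptor_address(0x1000, 0x002B)))
    print(effective_privilege(0, 0x002B))
```

Note that the index and the RPL are consumed by different operations (an
address calculation and a privilege comparison), which is the sense in which
the sub-fields are not all needed at the same moment.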

> 3) the register remains on the bus for the, what'd you say, 20
> machine cycles, while "It.ti" and "It.rpl" get around to accessing
> the "held forever" data on the bus

No. The output of the register is gated to the data bus only during
those machine cycles when one or more of its bits are needed. The bus
itself is just a set of "wires" and does not "hold" anything.

> If you can believe that they implemented additional logic to bypass
> the data size and override functionality for just these two
> instructions,

I do not, and have never, believed this, and you have never provided
any explanation of why you believe it would be necessary to add logic
to not check a bit which doesn't apply to a large number of
instructions.

For example all 8-bit register transfers would also require this
additional logic, if your theory is correct.

> How do you know there isn't a single primary
> "receive" of the selector from the register which then is "broken
> down" into multiple secondary "receives"? E.g., latched selector
> from the register, data lines for sub-elements of selector coming off
> of latch.

Are you seriously suggesting the use of separate data buses for each of
the sub-fields? Have you thought about where these "data lines" have to
go? Both the index and the rpl sub-fields have to be gated into an
adder, but not at the same time. Why would one not do this directly from
the main data bus?

> I.e., how can they hold the selector on
> the data bus for 20 cycles and still use that bus to perform the
> operations needed by those instructions? Without multiple data
> busses, I don't see how they can.

Since the source register is not modified during the execution of this
instruction, it can be gated to the data bus whenever it is needed, and
the bus can be used for other purposes the rest of the time.

--
Chuck
http://www.pacificsites.com/~ccrayne/charles.html

From: Chuck Crayne on 7 Sep 2008 20:11

On Sun, 7 Sep 2008 15:39:20 -0400
"Rod Pemberton" <do_not_have(a)nohavenot.cmm> wrote:

> The syntax needs to fit the mode for a number of reasons:

This is already the case for NASM itself, which supports 12 different
combinations of source and destination types for lsl. The discussion is
solely about which of these dozen forms NDISASM should report.

Personally, I don't really care, because lsl and lar are protected mode
instructions, and few, if any, programmers who write protected mode
code use NDISASM. However, even if I did care, I still wouldn't find
your arguments convincing, for the following reasons.

> 1) LAR/LSL manual descriptions have consistently (twenty+ years)
> indicated there should be mode appropriate syntax unlike a small
> number of other instructions: Group 6/7, MOVZX/SX...

NASM has been enhanced to support the latest x86 and x86_64
architectural enhancements, including 256-bit AVX, so any syntax
decisions need to be based upon current manual descriptions, and not
ancient history.

> 2) it seems likely that the Ew isn't correct for a register source,
> likely a 'v', - as now indicated on other similar instructions

It seems likely to you, but you have not yet presented any convincing
arguments for this position. In the meantime, it seems reasonable to
assume that the manual is correct.

> 3) preservation of mode correct register size on conversion of
> disassembly into assembly, e.g., for feedback into NASM or other
> assemblers

When fed back into NASM, all twelve forms will assemble correctly.
There is no way for NDISASM to tell what source register size was
originally specified, because they all produce identical op codes.
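
The point about the source register can be checked by hand-encoding the
register-to-register forms. The following is a rough sketch, not NASM's own
encoder: it assumes the documented LSL encoding (0F 03 /r, with a 66 prefix
when the destination size differs from the mode's default) and covers only
the 16-bit and 32-bit register forms, not memory or 64-bit operands. The
names `reg_info` and `encode_lsl` are hypothetical.

```python
REGS16 = ["ax", "cx", "dx", "bx", "sp", "bp", "si", "di"]
REGS32 = ["e" + name for name in REGS16]


def reg_info(name):
    """Return (register number, width in bits) for a 16/32-bit register."""
    name = name.lower()
    if name in REGS16:
        return REGS16.index(name), 16
    if name in REGS32:
        return REGS32.index(name), 32
    raise ValueError("unsupported register: " + name)


def encode_lsl(dst, src, bits=32):
    """Encode 'lsl dst, src' for register operands in 16- or 32-bit mode."""
    dst_num, dst_size = reg_info(dst)
    src_num, _src_size = reg_info(src)  # source width never reaches the bytes
    prefix = b"" if dst_size == bits else b"\x66"
    modrm = 0xC0 | (dst_num << 3) | src_num  # mod=11, reg=dst, rm=src
    return prefix + bytes([0x0F, 0x03, modrm])


if __name__ == "__main__":
    for src in ("bx", "ebx"):
        print("lsl eax,", src, "->", encode_lsl("eax", src).hex())
```

Only the register number of the source ends up in the ModRM byte, so
`lsl eax, bx` and `lsl eax, ebx` yield the same three bytes; the destination
size, by contrast, does change the output through the prefix.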

> 4) use of 32-bit and 8-bit registers in 32-bit code when
> NASM has no function/macro to convert one register to sub-registers
> or "parent" register (e.g., eax to al, or al to ax) for use with
> instructions such as MOVZX, LAR, etc.

I don't understand what you are trying to say here. Perhaps an example
would be in order.

--
Chuck
http://www.pacificsites.com/~ccrayne/charles.html

From: Rod Pemberton on 7 Sep 2008 21:37

"Chuck Crayne" <ccrayne(a)crayne.org> wrote in message
news:20080907171142.43785c65(a)thor.crayne.org...
> On Sun, 7 Sep 2008 15:39:20 -0400
> "Rod Pemberton" <do_not_have(a)nohavenot.cmm> wrote:
>
> > 2) it seems likely that the Ew isn't correct for a register source,
> > likely a 'v', - as now indicated on other similar instructions
>
> It seems likely to you, but you have not yet presented any convincing
> arguments for this position.

Whatever...

> In the meantime, it seems reasonable to
> assume that the manual is correct.

Yes, you keep repeating that the manual is correct, but keep ignoring this:
_BOTH_ the current Intel and AMD manuals clearly indicate that syntax with
32-bit registers is appropriate for LAR/LSL.

> > 3) preservation of mode correct register size on conversion of
> > disassembly into assembly, e.g., for feedback into NASM or other
> > assemblers
>
> When fed back into NASM, all twelve forms will assemble correctly.

"As a courtesy..." N64D.

> There is no way for NDISASM to tell what source register size was
> originally specified,

Specified by user...

> > 4) use of 32-bit and 8-bit registers in 32-bit code when
> > NASM has no function/macro to convert one register to sub-registers
> > or "parent" register (e.g., eax to al, or al to ax) for use with
> > instructions such as MOVZX, LAR, etc.
>
> I don't understand what you are trying to say here. Perhaps an example
> would be in order.

Let's say you want "MOVZX EAX,AX" in 32-bit code.

BITS 32
%macro crud 3
;...
movzx %1,%1
;...
%endmacro

crud eax, 0, 2

If NASM doesn't accept EAX for the second %1 (as in LAR/LSL) how do you
easily, without numerous %ifidn %assign etc., convert EAX to AX?

Rod Pemberton

From: Rod Pemberton on 7 Sep 2008 22:41

"Chuck Crayne" <ccrayne(a)crayne.org> wrote in message
news:20080907143458.0e79361e(a)thor.crayne.org...
> On Sun, 7 Sep 2008 15:39:20 -0400
> "Rod Pemberton" <do_not_have(a)nohavenot.cmm> wrote:
>
> > Or, are you basing this on alternate cpu
> > designs, perhaps discrete 74LSxx or bit-slice logic or a Byte magazine
> > article, that you're familiar with?
>
> Actually, I'm basing it on the fact that I used to make my living
> optimizing cpu gates. I do not know for certain what my counterparts at
> Intel did, but I do know what is reasonable -- and what is not.

Mostly by software?

> I do know what is reasonable -- and what is not.
....

> > The selector data as bits in the register must be connected to logic
> > somewhere which, in all likelihood or at least IMO, are data lines
> > coming off of some internal latch... not directly connected to the
> > source register.
>
> Why invent some redundant register, when the data is already in a
> register?

Why invent some redundant gated inputs, when the data is already in a
register? You could use regular non-gated inputs...

Why gate at all? Much logic is not locked to a clock except by design.
Overclockers have been asking this one too... asynchronous.

> > 3) the register remains on the bus for the, what'd you say, 20
> > machine cycles, while "It.ti" and "It.rpl" get around to accessing
> > the "held forever" data on the bus
>
> No. The output of the register is gated to the data bus only during
> those machine cycles when one or more of its bits are needed. The bus
> itself is just a set of "wires" and does not "hold" anything.

The "wires" are driven by the gates/latches and "hold" one of two potential
voltages representative of "0" or "1" for binary logic - as long as they are
being driven by the gates/latches - indirectly from the power source, of
course. Do you want to tell me a capacitor doesn't "hold" potential energy
in an electrostatic field? Do you want to tell me an inductor doesn't
"hold" potential energy in a magnetic field?

> > If you can believe that they implemented additional logic to bypass
> > the data size and override functionality for just these two
> > instructions,
>
> I do not, and have never, believed this, and
....

> you have never provided
> any explanation of why you believe it would be necessary to add logic
> to not check a bit which doesn't apply to a large number of
> instructions.

You've never provided any explanation why you believe it would not be
necessary to add logic to modify the "checking of a bit" which does apply to
a large number of instructions.

Here is my answer to your question:

Inaction requires no logic circuitry.
Action requires logic circuitry.
Ignoring or modifying the action of an in use action requires more
logic circuitry.

I.e., the circuitry is already checking a bit which applies to a larger
number of instructions. Modifying that primary action requires more logic.

> For example all 8-bit register transfers would also require this
> additional logic, if your theory is correct.

Logic is required to select between 8-bits and non-8-bits. More logic is
required to decide whether the non-8-bits is 16-bits or 32-bits. Even more
logic is required to turn off the primary non-8-bit selection action for
certain instructions.
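
At the instruction-set level, the first two decisions listed here correspond
to the w bit of the opcode, the mode's default operand size, and the 66
prefix. A small sketch of that documented behaviour follows, assuming
16-bit or 32-bit mode and an instruction that has a w bit; the function name
is made up, and the sketch makes no claim about how many gates any of this
costs in silicon.

```python
def operand_size(w_bit, default_bits, has_66_prefix):
    """Effective operand size in bits for a w-bit instruction (no 64-bit mode)."""
    if default_bits not in (16, 32):
        raise ValueError("default_bits must be 16 or 32")
    if w_bit == 0:
        return 8  # byte form: the prefix and the mode default do not matter
    if has_66_prefix:
        return 16 if default_bits == 32 else 32  # prefix flips the default
    return default_bits


if __name__ == "__main__":
    print(operand_size(0, 32, False))
    print(operand_size(1, 32, False))
    print(operand_size(1, 32, True))
```

The third decision, an instruction such as LAR or LSL reading only 16 bits
of its source regardless of this result, is not modelled here; whether that
takes extra logic or simply unused wires is the open question in this
exchange.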

> > How do you know there isn't a single primary
> > "receive" of the selector from the register which then is "broken
> > down" into multiple secondary "receives"? E.g., latched selector
> > from the register, data lines for sub-elements of selector coming off
> > of latch.
>
> Are you seriously suggesting the use of separate data buses for each of
> the sub-fields?

No, that was what you seemed to be stating. I was suggesting that the
sub-fields may be connected to a latch instead of the main bus since the
main bus may be needed for other tasks.

> Have you thought about where these "data lines" have to
> go? Both the index and the rpl sub-fields have to be gated into an
> adder, but not at the same time. Why would one not do this directly from
> the main data bus?

1) Increases total instruction time by intermittently preventing bus use.
2) Data bus contention with other operations needed by the instruction.
3) The sub-field logic may need data faster than the register can be
re-gated onto the bus.
4) The sub-field logic may need data between clocks.
5) To guarantee proper operation through design for special markets.

> Why would one not do this directly from
> the main data bus?

If you were the logic optimizer, you should be answering this, shouldn't
you?

> > I.e., how can they hold the selector on
> > the data bus for 20 cycles and still use that bus to perform the
> > operations needed by those instructions? Without multiple data
> > busses, I don't see how they can.
>
> Since the source register is not modified during the execution of this
> instruction, it can be gated to the data bus whenever it is needed, and
> the bus can be used for other purposes the rest of the time.

That's a trade-off: you slow down the instruction in exchange for not using
a latch.

Rod Pemberton
