NASM 0.98.39 vs. NASM 2.03.01 disassembly [ASM]

From: Rod Pemberton on 19 Aug 2008 06:25

"Frank Kotler" <fbkotler(a)verizon.net> wrote in message
news:g8d9fm$pfq$1(a)aioe.org...
> Rod Pemberton wrote:
> > "Frank Kotler" <fbkotler(a)verizon.net> wrote in message
> > news:g8an22$v6v$1(a)aioe.org...
> >> Interesting. What would be the "meaning" of a (16-bit!) selector in a
> >> 32-bit reg?
> >
> > Is this a trick question?
>
> Not intended to be.

OK. The meaning of a 16-bit selector in a 32-bit reg is the same as the
meaning of a 16-bit selector in a 16-bit reg... (How was that not a trick
question again?)

> >> If the upper bits are "garbage", it won't work?
> >
> > Are you implying that 16-bit selectors in 32-bit mode as implemented
> > currently don't actually work?
>
> I certainly didn't intend to imply that!

Then, it must've been a trick question... since there is no difference in
meaning, usage, or implementation of a 16-bit selector in a 32-bit reg,
AFAICT, and you didn't intend to imply that.

> Let me try this again... Assume 32-bit code...
>
> You write:
>
> lsl eax, ebx
>
> I write:
>
> lsl eax, bx
>
> What's the difference?

One is a valid 32-bit instruction. One isn't AFAICT (even with
overrides...).

> Same machine code,

Where do you get this?

I posted the encoding information from _four_ respectable manuals. LSL
r32,r16 isn't present anywhere. (I granted you that interpretation for one
manual, remember?) At a minimum, "lsl eax,bx" would require an override for
32-bit code to generate "bx" instead of "ebx" (but that would also require a
change to "eax"), i.e., the code must be different.

(If you don't agree to this interpretation, then Ndisasm's decoding of
override prefixes is broken for all instructions other than lsl and lar.
Because, that's how they all work.)

> >> This looks
> >> like a change from m32 to m16 somewhere between 2003 and 2006.
> >
> > Little endian with 16-bit selectors. Basically, irrelevant if the cpu
> > ignores retrieving the upper 16-bits of a 32-bit value from memory...
>
> ... Ah, but here it *does* make a difference if the instruction reads 32
> bits and discards 16!

Why? It doesn't make a difference under normal circumstances. It only
exhibits a problem under special circumstances: misaligned memory read split
across a boundary of unmapped memory.

> ... Ah, but here it *does* make a difference if the instruction reads 32
> bits and discards 16! Only if the 16 bits in question is the last 16
> bits of readable memory, to be sure...

What does "little endian" mean to you?

In memory, little endian 16-bit value 0x1100 is stored (with 0x00 byte at
the lower address):

00 11

In memory, little endian 32-bit value 0x33221100 is stored:

00 11 22 33

I.e., a valid read (no faults...) from the stored address of 16-bits and of
32-bits which discards the upper 16-bits will both have the lower 16-bits be
0x1100.
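
The byte-order point can be checked with a short Python sketch using the standard `struct` module. The buffer holds just the example bytes from above, and the variable names are illustrative. It covers only the valid-read case and does not model a page fault at an unmapped boundary.

```python
import struct

# The four example bytes as they sit in memory, lowest address first.
memory = bytes([0x00, 0x11, 0x22, 0x33])

# 16-bit little-endian read from the start of the buffer.
(word,) = struct.unpack_from("<H", memory, 0)

# 32-bit little-endian read from the same address,
# followed by discarding the upper 16 bits.
(dword,) = struct.unpack_from("<I", memory, 0)
truncated = dword & 0xFFFF

print(hex(word), hex(dword), hex(truncated))
```

Both paths yield the same low 16 bits, so the two read sizes only become distinguishable when the extra two bytes are not readable at all.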

> This is the experiment that Phil
> proposed. On the only processor I've tested it on - P4 - it definitely
> reads only 16-bits.

How does one align a 32-bit general purpose register on an unmapped memory
boundary to generate an alignment trap to determine if half of or the whole
of a 32-bit register is used? (This is a trick question.)

> >> or did they fix an error in the manual?
> >
> > No, I don't believe so.
>
> If I see results of an *experiment* that shows lar/lsl reading 32 bits
> with some CPU, I'll agree with you.

That's an invalid conclusion. The conclusion only applies to the memory
operand as source, not either of the register operands (destination or
source). The size of the memory read has no effect on the size of the
register used. You can see that LSL can use 32-bit registers for _both_,
while accessing memory as 16-bits for the second, according to one of the
manuals. Look:

0F 03 /r LSL r32, r32/m16

> >> I would "expect" the source operand to be 16-bits,
> >
> > ABSOLUTELY NOT! If the cpu is in 32-bit mode and the source operand is a
> > register, the register size is either 32-bits or 8-bits.
> >
> >> regardless of the
> >> processor mode or size of the destination register, a selector being 16
> >> bits.
> >
> > In 16-bit mode, you have two register sizes: 8-bit and 16-bit. In 32-bit
> > mode, you have two register sizes: 8-bit and 32-bit.
>
> Are you implying that:
>
> lsl ax, bx
>
> won't work in 32-bit code?

No, I'm not. IMO, you seem to be confusing/merging the encoding/decoding
and instruction operation.

The question is either:

"Is lsl ax,bx encoded as 16-bits, then executed as 32-bits?"

*OR*

"Is lsl ax,bx encoded in 32-bit mixed mode and executed as 32-bits?"

If "lsl ax, bx" was encoded as 16-bits, and then executed in 32-bit mode,
it'll be executed as "lsl eax, ebx". It'll work. But, it won't work as
coded. The fact that this instruction may have the identical operational
result is irrelevant. If decoded as 32-bits, it should decode as "lsl eax,
ebx" not "lsl eax, bx".

If "lsl ax, bx" was encoded as 32-bits in mixed mode (i.e., with override),
and then executed in 32-bit mixed mode, it'll be executed as the equivalent
16-bit code due to the override.
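
To make the two readings concrete, here is a minimal Python sketch of decoding the register form of LSL (0F 03 /r) with and without the 66h operand-size prefix. `decode_lsl` is a hypothetical helper written for illustration, not Ndisasm's actual logic. Since the name of the source register is exactly what is in dispute, it returns both candidates: the one that follows the operand size and the one fixed at 16 bits.

```python
# Register names indexed by the 3-bit ModRM register number.
REG32 = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]
REG16 = ["ax", "cx", "dx", "bx", "sp", "bp", "si", "di"]


def decode_lsl(code, default_bits=32):
    """Decode a register-to-register LSL, optionally prefixed by 66h.

    Returns (dest, src_following_operand_size, src_fixed_16_bit).
    Hypothetical helper for illustration only.
    """
    code = bytes(code)
    operand_bits = default_bits
    pos = 0
    if code[pos] == 0x66:
        # The operand-size prefix toggles between 16 and 32 bits.
        operand_bits = 16 if default_bits == 32 else 32
        pos += 1
    if code[pos:pos + 2] != b"\x0f\x03":
        raise ValueError("not an LSL opcode")
    modrm = code[pos + 2]
    mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
    if mod != 3:
        raise ValueError("this sketch handles register operands only")
    names = REG32 if operand_bits == 32 else REG16
    return names[reg], names[rm], REG16[rm]


for raw in ("0f 03 c3", "66 0f 03 c3"):
    dest, src_sized, src_fixed16 = decode_lsl(bytes.fromhex(raw))
    print(f"{raw:12} lsl {dest},{src_sized}   or   lsl {dest},{src_fixed16}")
```

In 32-bit mode the unprefixed bytes give either "lsl eax,ebx" or "lsl eax,bx" depending on the convention chosen, while the prefixed bytes give "lsl ax,bx" under both.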

> >> As Phil suggests, we could conduct the experiment. I have done so,
> >> both loading upper bits of ebx with garbage, and putting a 16-bit
> >> variable in the last two bytes of valid memory. I conclude that the
> >> source operand is 16 bits...
> >
> > I think that's a wrong conclusion. You can conclude that only 16-bits of
> > the source operand are used as a selector. But, you can't determine what
> > size 32-bits or 16-bits was read in order to obtain the 16-bit selector.
>
> I certainly can! If it were reading 32 bits, it'd segfault!

That needs to be qualified: "If it were reading 32 bits" [of memory, under
special conditions] "it'd segfault!" So, no, you can't legitimately make
that claim since we're also dealing with register reads not just memory
reads. You're willfully ignoring a register source operand and assuming
that what's true for memory source operand size is also true for the
register source operand size.

> > My whole point is how do you get "bx", a 16-bit register, instead of "ebx"
> > for an instruction decode which can only return an 8-bit or 32-bit register?
>
> You got me!

Why are you surprised? That is what Ndisasm is doing to lsl, and lar
decodes... It's emitting a 16-bit register for a 32-bit one, erroneously.

> What CPU would that be???

AFAIK, we've been talking about Ndisasm instruction decoding, not CPUs in
general nor a specific one.

Rod Pemberton

From: Rod Pemberton on 19 Aug 2008 06:25

"Frank Kotler" <fbkotler(a)verizon.net> wrote in message
news:g8ddve$oqf$1(a)aioe.org...
> > "Rod Pemberton" <do_not_have(a)nohavenot.cmm> wrote in message
> > I don't believe the decode for lsl, and
> > lar should be treated the same, i.e., using a fixed 16-bit register, as
> > those for arpl, lldt, lmsw, ltr, str, verr, and verw.
> >
> > AFAICT, although there are some instructions fixed to 16-bit registers only,
> > there currently is no 32-bit instruction that uses _both_ a 32-bit register
> > and a 16-bit register as Ndisasm is "doing to" lar and lsl...
>
> mov eax, ds ?

That's not against a GP register. It's against a segment register which is
16-bits.

> mov ax, ds
> mov eax, ds ; two different instructions!

Please post the manual that has an encoding for "mov r32,Sreg".
(I.e., AFAICT r16 is the norm and other sizes are only listed as valid for
AMD64's...)

> I really think lar/lsl are the "same idea". From 8E D8 I would
> disassemble (in 32-bit mode) ax, not eax. And from 0F 03 C3 I would
> disassemble bx, not ebx. Since it doesn't make the slightest difference,

But, it does make a difference. Think about someone who is using NASM's
macros, e.g., for a register allocator. For 32-bit mode code, the macros
are going to use either an 8-bit or a 32-bit register. The macros aren't
going to use 16-bit registers. By implementing the instruction
decode/encode as 16-bits for a GP register, because it's "irrelevant", you
just broke the use of macros for that instruction.

> How would you disassemble EC? "in al, edx"???

IN AL,DX

DX is hardcoded. DX is part of the instruction operation, but it's not part
of the instruction encoding/decoding.

Rod Pemberton

From: Wolfgang Kern on 19 Aug 2008 11:10

Rod Pemberton wrote:
....
> That needs to be qualified: "If it were reading 32 bits" [of memory, under
> special conditions] "it'd segfault!" So, no, you can't legitimately make
> that claim since we're also dealing with register reads not just memory
> reads. You're willfully ignoring a register source operand and assuming
> that what's true for memory source operand size is also true for the
> register source operand size.

Arguing about Caesar's beard, Rod? :)
We proved it for memory operands. And I really won't care if the CPU
internally reads and truncates, because we won't see any effect from it.

But when I look at the CPU's decoder there is never an operand-size
difference between mod3 and mod0..2 as long as it's not another function.
The only exception is PUSH/POP-seg with the stack pointer advance.

I checked it on AMD K7, K8 and DX486:

pm32:
              mov ebx,0x98760038   ;0038 is unlimited data
 0f 03 c3     lsl eax,bx           ;I get eax = 0xffffffff
              mov eax,0xabcd0038
 66 0f 03 c3  lsl ax,bx            ;eax = 0xabcdffff after this

pm16:                              ;DS=0008
              mov word [0xfffe],0x0038  ;0008 is 16 bit data (limit: 0xffff)
 66 0f 03 06 fe ff
              lsl eax,[0xfffe]     ;eax = again 0xffffffff
                                   ;! no segfault !
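
The pm32 register results above can be reproduced with a small Python model. This is only a sketch under stated assumptions: `write_dest` is a hypothetical helper, the limit value is taken from the test itself, and the model assumes that a 16-bit operand size stores the truncated limit in the low half of the destination and leaves the upper half alone, which is what the measurement shows.

```python
def write_dest(old_value, result, operand_bits):
    """Model storing an LSL result into a 32-bit general-purpose register.

    With a 32-bit operand size the whole register is replaced; with a
    16-bit operand size only the low 16 bits change (assumed behaviour,
    matching the values observed in the test above).
    """
    if operand_bits == 32:
        return result & 0xFFFFFFFF
    return (old_value & 0xFFFF0000) | (result & 0xFFFF)


LIMIT = 0xFFFFFFFF  # limit of the "unlimited data" selector 0038 in the test

# Source register with garbage in the upper half: only 0x0038 acts as selector.
ebx = 0x98760038
selector = ebx & 0xFFFF

eax_after_32 = write_dest(0x00000000, LIMIT, 32)   # 0f 03 c3
eax_after_16 = write_dest(0xABCD0038, LIMIT, 16)   # 66 0f 03 c3

print(hex(selector), hex(eax_after_32), hex(eax_after_16))
```

The model says nothing about how many bits the CPU reads internally from the source register; it only shows that the observed results are consistent with a 16-bit selector and an operand-size-dependent destination.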

> Why are you surprised? That is what Ndisasm is doing to lsl, and lar
> decodes... It's emitting a 16-bit register for a 32-bit one, erroneously.

Yeah, for PM32 LAR/LSL there shouldn't be a 66h in the assembler.
But I see no error in LSL eax,bx,
perhaps just because my disassembler also shows it this way? :)

__
wolfgang

From: Wolfgang Kern on 20 Aug 2008 17:48

Rod Pemberton wrote:

> I'm still not sure what the justification for that tenacious belief is...

Yes Rod,
you have to say what you found in conflict with the manuals...

> http://sourceforge.net/tracker/index.php?func=detail&atid=106208&aid=2063064&group_id=6208

but please just mention where you found a bug,
or shall we guess it?
_
wolfgang

From: Chuck Crayne on 20 Aug 2008 22:42

On Wed, 20 Aug 2008 17:25:27 -0400
"Rod Pemberton" <do_not_have(a)nohavenot.cmm> wrote:

> Four guys who develop NASM reading a.l.a. and I have to file a bug
> report?

Only because it would appear that you have not yet convinced any of us
that you have a valid case, and the bug report gives you access to a
larger audience.

As I understand your case, you have stipulated that the Intel manuals
use the Ew notation for the operand type, which means that the mode bit
is ignored, and the operand is always 16-bits. However, you then seem
to argue that Intel was correct only in respect to memory operands,
and should have had a separate entry for register operands,
with an Ec notation, along with a footnote that the mode bit is also
ignored in this instance.

You have also stated that "only 8-bit or 32-bit registers should
be displayed for 32-bit mode, and 8-bit or 32-bit registers for 16-bit
mode.", which (ignoring the obvious typo) seems to imply that one
cannot use 16-bit registers in 32-bit mode.

As for myself, I can't get particularly excited about either side of
the discussion. Since there is such a difference of opinion on what
assembly language syntax should look like, all I expect of the NASM
disassembler is that it generate code which NASM will accept. If you
can show a test case in which this is not true, then I will be
motivated to do something about it.

--
Chuck
http://www.pacificsites.com/~ccrayne/charles.html
