From: Rod Pemberton on 19 Aug 2008 06:25

"Frank Kotler" <fbkotler(a)verizon.net> wrote in message
news:g8d9fm$pfq$1(a)aioe.org...
> Rod Pemberton wrote:
> > "Frank Kotler" <fbkotler(a)verizon.net> wrote in message
> > news:g8an22$v6v$1(a)aioe.org...
> >> Interesting. What would be the "meaning" of a (16-bit!) selector in a
> >> 32-bit reg?
> >
> > Is this a trick question?
>
> Not intended to be.

OK. The meaning of a 16-bit selector in a 32-bit reg is the same as the
meaning of a 16-bit selector in a 16-bit reg... (How was that not a trick
question again?)

> >> If the upper bits are "garbage", it won't work?
> >
> > Are you implying that 16-bit selectors in 32-bit mode as implemented
> > currently don't actually work?
>
> I certainly didn't intend to imply that!

Then, it must've been a trick question... since there is no difference in
meaning, usage, or implementation of a 16-bit selector in a 32-bit reg,
AFAICT, and you didn't intend to imply that.

> Let me try this again... Assume 32-bit code...
>
> You write:
>
> lsl eax, ebx
>
> I write:
>
> lsl eax, bx
>
> What's the difference?

One is a valid 32-bit instruction. One isn't AFAICT (even with
overrides...).

> Same machine code,

Where do you get this? I posted the encoding information from _four_
respectable manuals. LSL r32,r16 isn't present anywhere. (I granted you
that interpretation for one manual, remember?) At a minimum, "lsl eax,bx"
would require an override for 32-bit code to generate "bx" instead of "ebx"
(but that would also require a change to "eax"), i.e., the code must be
different. (If you don't agree to this interpretation, then Ndisasm's
decoding of override prefixes is broken for all instructions other than lsl
and lar. Because, that's how they all work.)

> >> This looks
> >> like a change from m32 to m16 somewhere between 2003 and 2006.
> >
> > Little endian with 16-bit selectors. Basically, irrelevant if the cpu
> > ignores retrieving the upper 16-bits of a 32-bit value from memory...
>
> ... Ah, but here it *does* make a difference if the instruction reads 32
> bits and discards 16!

Why? It doesn't make a difference under normal circumstances. It only
exhibits a problem under special circumstances: misaligned memory read
split across a boundary of unmapped memory.

> ... Ah, but here it *does* make a difference if the instruction reads 32
> bits and discards 16! Only if the 16 bits in question is the last 16
> bits of readable memory, to be sure...

What does "little endian" mean to you?

In memory, little endian 16-bit value 0x1100 is stored (with 0x00 byte at
the lower address):

    00 11

In memory, little endian 32-bit value 0x33221100 is stored:

    00 11 22 33

I.e., a valid read (no faults...) from the stored address of 16-bits and of
32-bits which discards the upper 16-bits will both have the lower 16-bits be
0x1100.

> This is the experiment that Phil
> proposed. On the only processor I've tested it on - P4 - it definitely
> reads only 16-bits.

How does one align a 32-bit general purpose register on an unmapped memory
boundary to generate an alignment trap to determine if half of or the whole
of a 32-bit register is used? (This is a trick question.)

> >> or did they fix an error in the manual?
> >
> > No, I don't believe so.
>
> If I see results of an *experiment* that shows lar/lsl reading 32 bits
> with some CPU, I'll agree with you.

That's an invalid conclusion. The conclusion only applies to the memory
operand as source, not either of the register operands (destination or
source). The size of the memory read has no effect on the size of the
register used. You can see that LSL can use 32-bit registers for _both_,
while accessing memory as 16-bits for the second, according to one of the
manuals. Look:

    0F 03 /r LSL r32, r32/m16

> >> I would "expect" the source operand to be 16-bits,
> >
> > ABSOLUTELY NOT! If the cpu is in 32-bit mode and the source operand is a
> > register, the register size is either 32-bits or 8-bits.
> >
> >> regardless of the
> >> processor mode or size of the destination register, a selector being 16
> >> bits.
> >
> > In 16-bit mode, you have two register sizes: 8-bit and 16-bit. In 32-bit
> > mode, you have two register sizes: 8-bit and 32-bit.
>
> Are you implying that:
>
> lsl ax, bx
>
> won't work in 32-bit code?

No, I'm not. IMO, you seem to be confusing/merging the encoding/decoding
and instruction operation. The question is either:

"Is lsl ax,bx encoded as 16-bits, then executed as 32-bits?"
*OR*
"Is lsl ax,bx encoded in 32-bit mixed mode and executed as 32-bits?"

If "lsl ax, bx" was encoded as 16-bits, and then executed in 32-bit mode,
it'll be executed as "lsl eax, ebx". It'll work. But, it won't work as
coded. The fact that this instruction may have the identical operational
result is irrelevant. If decoded as 32-bits, it should decode as "lsl eax,
ebx" not "lsl eax, bx".

If "lsl ax, bx" was encoded as 32-bits in mixed mode (i.e., with override),
and then executed in 32-bit mixed mode, it'll be executed as the equivalent
16-bit code due to the override.

> >> As Phil suggests, we could conduct the experiment. I have done so,
> >> both loading upper bits of ebx with garbage, and putting a 16-bit
> >> variable in the last two bytes of valid memory. I conclude that the
> >> source operand is 16 bits...
> >
> > I think that's a wrong conclusion. You can conclude that only 16-bits of
> > the source operand are used as a selector. But, you can't determine what
> > size 32-bits or 16-bits was read in order to obtain the 16-bit selector.
>
> I certainly can! If it were reading 32 bits, it'd segfault!

That needs to be qualified: "If it were reading 32 bits" [of memory, under
special conditions] "it'd segfault!" So, no, you can't legitimately make
that claim since we're also dealing with register reads not just memory
reads. You're willfully ignoring a register source operand and assuming
that what's true for memory source operand size is also true for the
register source operand size.

> > My whole point is how do you get "bx", a 16-bit register, instead of "ebx"
> > for an instruction decode which can only return an 8-bit or 32-bit register?
>
> You got me!

Why are you surprised? That is what Ndisasm is doing to lsl, and lar
decodes... It's emitting a 16-bit register for a 32-bit one, erroneously.

> What CPU would that be???

AFAIK, we've been talking about Ndisasm instruction decoding, not CPU's in
general nor a specific one.

Rod Pemberton
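
[Illustration, not part of the original post.] The little-endian argument above, and Frank's "last two bytes of valid memory" experiment, can be modelled in a few lines of Python. This is only a sketch under loud assumptions: a `bytes` object stands in for mapped memory, `struct.error` stands in for the fault a CPU would raise on touching unmapped memory, and the helper names `read16` and `read32_low16` are made up for this note. It says nothing about what any real CPU does with a register operand.

```python
import struct

# Four bytes of "mapped memory", lowest address first (values from the post).
MEMORY = bytes([0x00, 0x11, 0x22, 0x33])


def read16(mem, addr):
    """Read a 16-bit little-endian value at addr."""
    return struct.unpack_from("<H", mem, addr)[0]


def read32_low16(mem, addr):
    """Read 32 bits little-endian at addr, then discard the upper 16 bits."""
    return struct.unpack_from("<I", mem, addr)[0] & 0xFFFF


# Away from the end of memory, both strategies yield the same selector.
print(hex(read16(MEMORY, 0)), hex(read32_low16(MEMORY, 0)))

# At the last two bytes, only the 16-bit read succeeds; the 32-bit read
# runs past the end of the buffer (hypothetical stand-in for a page fault).
print(hex(read16(MEMORY, 2)))
try:
    read32_low16(MEMORY, 2)
except struct.error as exc:
    print("32-bit read failed:", exc)
```

Both posters agree on the first half of this (the low 16 bits are the same either way); the disagreement is whether the outcome of the second half, observed for a memory operand, says anything about how a register operand should be named.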
From: Rod Pemberton on 19 Aug 2008 06:25

"Frank Kotler" <fbkotler(a)verizon.net> wrote in message
news:g8ddve$oqf$1(a)aioe.org...
> > "Rod Pemberton" <do_not_have(a)nohavenot.cmm> wrote in message
> > I don't believe the decode for lsl, and
> > lar should be treated the same, i.e., using a fixed 16-bit register, as
> > those for arpl, lldt, lmsw, ltr, str, verr, and verw.
> >
> > AFAICT, although there are some instructions fixed to 16-bit registers only,
> > there currently is no 32-bit instruction that uses _both_ a 32-bit register
> > and a 16-bit register as Ndisasm is "doing to" lar and lsl...
>
> mov eax, ds ?

That's not against a GP register. It's against a segment register which is
16-bits.

> mov ax, ds
> mov eax, ds ; two different instructions!

Please post the manual that has an encoding for "mov r32,Sreg". (I.e.,
AFAICT r16 is the norm and other sizes are only listed as valid for
AMD64's...)

> I really think lar/lsl are the "same idea". From 8E D8 I would
> disassemble (in 32-bit mode) ax, not eax. And from 0F 03 C3 I would
> disassemble bx, not ebx. Since it doesn't make the slightest difference,

But, it does make a difference. Think about someone who is using NASM's
macros, e.g., for a register allocator. For 32-bit mode code, the macros
are going to use either an 8-bit or a 32-bit register. The macros aren't
going to use 16-bit registers. By implementing the instruction
decode/encode as 16-bits for a GP register, because it's "irrelevant", you
just broke the use of macros for that instruction.

> How would you disassemble EC? "in al, edx"???

IN AL,DX

DX is hardcoded. DX is part of the instruction operation, but it's not part
of the instruction encoding/decoding.

Rod Pemberton
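
[Illustration, not part of the original post.] The two byte sequences Frank mentions, 8E D8 and 0F 03 C3, both end in a ModRM byte. A minimal Python sketch of splitting that byte into its fields may help show what the encoding does and does not contain. The function name `split_modrm` and the lookup tables are made up for this note; the register numbering is the standard x86 one.

```python
# Register names indexed by the 3-bit register number (standard x86 order).
REG32 = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]
REG16 = ["ax", "cx", "dx", "bx", "sp", "bp", "si", "di"]
SREG = ["es", "cs", "ss", "ds", "fs", "gs"]


def split_modrm(byte):
    """Split a ModRM byte into its (mod, reg, rm) bit fields."""
    return (byte >> 6) & 0b11, (byte >> 3) & 0b111, byte & 0b111


# 0F 03 C3: LSL with ModRM = C3. mod = 3 means rm names a register.
mod, reg, rm = split_modrm(0xC3)
print("0F 03 C3: mod", mod, "reg", reg, "rm", rm)
print("  destination:", REG32[reg])
print("  source register number", rm, "->", REG32[rm], "or", REG16[rm])

# 8E D8: MOV to a segment register with ModRM = D8.
mod, reg, rm = split_modrm(0xD8)
print("8E D8: mod", mod, "reg", reg, "rm", rm)
print("  segment register:", SREG[reg])
print("  source register number", rm, "->", REG32[rm], "or", REG16[rm])
```

The ModRM byte carries only a register number. Whether register 3 is printed as `bx` or `ebx` is decided by the operand-size rules the disassembler applies to that opcode, which is the point the two posters are arguing over.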
From: Wolfgang Kern on 19 Aug 2008 11:10

Rod Pemberton wrote:
....
> That needs to be qualified: "If it were reading 32 bits" [of memory, under
> special conditions] "it'd segfault!" So, no, you can't legitimately make
> that claim since we're also dealing with register reads not just memory
> reads. You're willfully ignoring a register source operand and assuming
> that what's true for memory source operand size is also true for the
> register source operand size.

about Caesars Beard Rod ? :)

We proved it for memory operands. And I really won't care if the CPU
internally reads and truncates, because we won't see any effect of this.
But when I look at the CPU's decoder there is never an operand size
difference between mod3 and mod0..2 as long as it's not another function.
The only exception is PUSH/POP-seg with the stackpointer advance.

I checked it on AMD K7 K8 and DX486:

pm32:
                    mov ebx,0x98760038  ;0038 is unlimited data
    0f 03 c3        lsl eax,bx          ;I get eax = 0xffffffff
                    mov eax,0xabcd0038
    66 0f 03 c3     lsl ax,bx           ;eax = 0xabcdffff after this

pm16:                                   ;DS=0008
                    mov [0xfffe],0x0038 ;0008 is 16 bit data (limit: 0xffff)
    66 0f 03 06 fe ff  lsl eax,[0xfffe] ;eax = again 0xffffffff
                                        ;! no segfault !

> Why are you surprised? That is what Ndisasm is doing to lsl, and lar
> decodes... It's emitting a 16-bit register for a 32-bit one, erroneously.

Yeah, for PM32 LAR/LSL there shouldn't be a 66h in the assembler.
But I see no error in LSL eax,bx
perhaps just because my disass also shows it this way ? :)
__
wolfgang
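
[Illustration, not part of the original post.] The pm32 results reported above can be reproduced arithmetically. The Python sketch below is a model of the reported numbers only, not a CPU emulation: the limit value 0xFFFFFFFF for selector 0038 is taken from the post, and the helper names `selector_of` and `write_reg` are made up for this note. It assumes the usual x86 behaviour that a 16-bit register write in 32-bit mode leaves the upper 16 bits of the full register unchanged.

```python
LIMIT_OF_0038 = 0xFFFFFFFF  # limit reported in the post for selector 0038


def selector_of(source_reg):
    """Only the low 16 bits of the source register act as the selector."""
    return source_reg & 0xFFFF


def write_reg(old_value, result, operand_size):
    """Model writing a result into a 32-bit register.

    operand_size 32 replaces the whole register; operand_size 16 (the
    66h-prefixed form in 32-bit code) replaces only the low 16 bits.
    """
    if operand_size == 32:
        return result & 0xFFFFFFFF
    if operand_size == 16:
        return (old_value & 0xFFFF0000) | (result & 0xFFFF)
    raise ValueError("operand_size must be 16 or 32")


ebx = 0x98760038
print(hex(selector_of(ebx)))  # upper "garbage" bits are ignored

# 0f 03 c3: 32-bit destination
print(hex(write_reg(0x00000000, LIMIT_OF_0038, 32)))

# 66 0f 03 c3: 16-bit destination, starting from eax = 0xabcd0038
print(hex(write_reg(0xABCD0038, LIMIT_OF_0038, 16)))
```

The three printed values are 0x38, 0xffffffff and 0xabcdffff, matching the figures in the post. The model shows that the 66h prefix changes the destination width, while the selector is the low 16 bits of the source in both forms.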
From: Wolfgang Kern on 20 Aug 2008 17:48

Rod Pemberton wrote:
> I'm still not sure what the justification for that tenacious belief is...

Yes Rod, you have to say what you found in conflict with the manuals...

> http://sourceforge.net/tracker/index.php?func=detail&atid=106208&aid=2063064&group_id=6208

but please just mention where you found a bug, or shall we guess it ?
_
wolfgang
From: Chuck Crayne on 20 Aug 2008 22:42

On Wed, 20 Aug 2008 17:25:27 -0400
"Rod Pemberton" <do_not_have(a)nohavenot.cmm> wrote:

> Four guys who develop NASM reading a.l.a. and I have to file a bug
> report?

Only because it would appear that you have not yet convinced any of us that
you have a valid case, and the bug report gives you access to a larger
audience.

As I understand your case, you have stipulated that the Intel manuals use
the Ew notation for the operand type, which means that the mode bit is
ignored, and the operand is always 16-bits. However, you then seem to argue
that Intel was correct only in respect to memory operands, and should have
had a separate entry for register operands, with an Ec notation, along with
a footnote that the mode bit is also ignored in this instance.

You have also stated that "only 8-bit or 32-bit registers should be
displayed for 32-bit mode, and 8-bit or 32-bit registers for 16-bit mode.",
which (ignoring the obvious typo) seems to imply that one cannot use 16-bit
registers in 32-bit mode.

As for myself, I can't get particularly excited about either side of the
discussion. Since there is such a difference of opinion on what assembly
language syntax should look like, all I expect of the NASM disassembler is
that it generate code which NASM will accept. If you can show a test case in
which this is not true, then I will be motivated to do something about it.

--
Chuck
http://www.pacificsites.com/~ccrayne/charles.html