From: H. Peter Anvin on 2 Sep 2008 01:44

Frank Kotler wrote:
>
> Okay... Now I see the int 80h... So if I've got the "second simplest" C
> program - instead of returning from "main", it does "exit(42);"... this
> goes through "__syscall_common" (arch/i386/syscall.S - in the library)?
> This copies our parameter into ebx, and the sys_call number into eax,
> and does int 80h, I guess.
>
> Maybe I'm confused about this, too, but I think the int 80h vector takes
> us into entry.S (arch/i386/entry.S - in the kernel code)? This looks
> like it eventually does "call [sys_call_table + eax * 4]" (after putting
> ebx back on the stack...). In this case, that's "sys_exit" - in
> kernel/exit.c (?). That shifts our parameter up by 8, and calls
> "do_exit". I'm guessing that it's the call to "schedule()" after marking
> our task "dead" that actually makes us go away...(?).
>
> Whew! I was under the impression that the C library called "sys_exit"
> (say) directly, without going through the int 80h rigamarole. I was
> mistaken!
>

Indeed. sys_exit() is a function in kernel space... the *ONLY* way you can call a kernel space function from user space is by doing a system call. On i386, you can do that via int $0x80, or by calling into the vdso (a piece of memory managed by the kernel, but available read-only to userspace), where the kernel will have placed either "int $0x80", "sysenter" or "syscall", depending on what the fastest way to do a system call is on that particular CPU.

On x86-64, the vdso will typically contain the syscall instruction, but some system calls (like gettimeofday) are actually implemented fully in user space using kernel-managed tables under certain circumstances, which are controlled by the kernel.

> So there *is* an "advantage" to using the int 80h interface ourselves?

Sort of - kind of.

First of all, you lose the advantage of the vdso (sysenter is usually a lot faster than int $0x80 on the CPUs that have it, but because it loses the return address, it *must* be called from the vdso and not directly.)

Second, you lose the functionality that is provided by the C library, some of which is more of a system call layer. There are some preliminary discussions about splitting glibc into libc, which contains the C runtime, and libkernel, which would be maintained by the kernel team and contain the code immediately surrounding the system calls.

> This brings us back to Herbert's question (sorry, Herbert... and Rod...
> you were right), where do library developers find documentation on the
> int 80h interface? Where did you "first" learn this stuff, Peter?

Well, I have done Linux kernel development for 16 years and so I'm quite familiar with the code; I have been involved in architecting some of it and so forth. That being said, the three main sources of documentation are the kernel source itself, man pages section 2, and the ELF psABI for the specific architecture.

When I wrote klibc I specifically wanted to minimize the effort to port it between Linux architectures, and so I ended up abstracting out as much as possible. This makes the klibc code hopefully useful as a secondary source.

-hpa
From: Rod Pemberton on 3 Sep 2008 09:47

"Herbert Kleebauer" <klee(a)unibwm.de> wrote in message news:48B4540F.25C76D0C(a)unibwm.de...
> Rod claimed, that "lsl eax, ebx" should be used instead of
> "lsl eax, bx" because in 32 bit mode 32 bit registers are used
> by default.

As have all of you! Do you remember saying this?

Herbert Kleebauer:
"I don't see any difference in ldsl and for example the add instruction."

Frank Kotler:
"... in the reg, reg forms [of LSL], the size of the registers must match."

Chuck Crayne:
"...disassembler might be show case 1 as lsl eax,ebx [instead of lsl eax,bx]."

Randall Hyde:
"...you specify the 16-bit version in a 16-bit segment and the 32-bit version in a 32-bit segment."

Those were from 2006 on clax:
http://groups.google.com/group/comp.lang.asm.x86/browse_thread/thread/15837b30374c5746/a13111f0554abe81?#a13111f0554abe81

> Frank proofed[sic], that when "lsl" is used with a memory
> operand only a 16 bit memory access occurs

Frank/Chuck/et al. proved on newer processors that the instruction uses 16 bits for the memory read. This is what the current Intel/AMD manuals state. They didn't prove that older processors don't read/write 32 bits from memory as their older manuals describe. Nor did they prove anything about the size of the source register, which was the entire issue. But, even if one could prove that all CPUs only accessed a 16-bit register, it still has nothing to do with the fact that the assembly syntax should be 32-bit for 32-bit mode for these two instructions, as all manuals indicate.

> and it doesn't make
> much sense to claim that in the register case the full 32 bit
> register is accessed and the higher halve then is discarded.

What claim? This is _exactly_ what is stated in the Intel manuals from at least 2003 to 2008 (and likely longer). Under LAR/LSL:

"1. For all loads (regardless of source or destination sizing) only bits 16-0 are used. Other bits are ignored."

So, you're saying that the Intel manuals have been continuously wrong for at least five years?... They've rewritten the manuals over and over and produced new manuals for 64-bit, but haven't noticed this specific error in half a decade? To me, "it doesn't make much sense to claim that in the register case the full 32 bit register" isn't accessed since the 386 manual explicitly indicates this is so...

Rod Pemberton

PS. Who is Nasm64developer?
From: Frank Kotler on 3 Sep 2008 13:47

Rod Pemberton wrote:
....
> Frank Kotler:
> "... in the reg, reg forms [of LSL], the size of the registers
> must match."

Immediately preceded by "Nasm does it differently"... If you continue down this thread, you'll see that the Nasm development team concluded that this was a "bug":

--------------------
Nasm64developer has submitted a patch to "correct" Nasm's behavior. With this patch applied, "lsl eax, bx" assembles correctly, but "lsl eax, ebx" is an error... breaking existing code... if any...
---------------------

(current versions accept either, disassemble to "bx")

....
> Frank/Chuck/et al. proved on newer processors that the instruction uses
> 16 bits for the memory read. This is what the current Intel/AMD manuals
> state. They didn't prove that older processors don't read/write 32 bits from
> memory as their older manuals describe. Nor did they prove anything about
> the size of the source register which was the entire issue.

This is true. (Merely out of curiosity, I'd like to see it done on an older processor. Doesn't "make sense" to me that it would read 32 bits, but maybe... I don't think "must make sense to Frank" is in Intel's design criteria... I'd bet on "error in manual"... but I wouldn't bet more than I could afford to lose...)

....
> PS. Who is Nasm64developer?

I wish I could tell ya, but he wishes to remain anonymous. You asked earlier in this thread if I believed this "just because Nasm64developer said so". Yes, pretty much...

For those who don't follow the Nasm bugtracker, here's Nasm64developer's latest entry in response to your bug report:

-------------------------
While the dst register follows opsize (i.e. is affected by mode, 66h, and REX.W), only the lowest 16 bits of the src operand are accessed. (In case of memory, one can prove it by means of #DB, #GP, or #PF. In case of a register, it's impossible to see the difference, unless the processor has a partial register access stall.)

That is, the src operand is not affected by opsize.

The assembler permits 16/32/64-bit src registers -- purely for courtesy reasons.

What the disassembler should emit is a matter of debate. Currently it picks the first of the 3 insns.dat entries -- that is, the one with the 16-bit src register. This is an intentional choice on my part; it reflects the Ew notation which is used as the most concise form of representation.

Various manuals do suggest that src reg follow the size of dst reg. If you prefer that choice, despite the fact that it doesn't reflect the instruction's actual operation, you can easily shuffle the ND flag of the reg,reg entries. If they work as advertised, they will suppress disassembly of an entry.
--------------------------------------

(doesn't quite say what will happen if the ND flag *doesn't* work as advertised... probably won't make smoke... :)

I've been arguing the other side of this, but I should say that I think your viewpoint has a good deal of merit as well. That *is* how we would normally disassemble "those bits", and there's something to be said for doing it the same way all the time. I just think it's "even better" to treat lsl/lar as an exception to the general rule. Since only 16 bits are *used* (for src), we should only "say" 16 bits...

Wouldn't break much (any) of *my* code if Nasm left out lar/lsl entirely, how 'bout you? If it *is* a bug, seems like a pretty harmless one.

Best,
Frank
From: Herbert Kleebauer on 3 Sep 2008 15:24

Rod Pemberton wrote:
> "Herbert Kleebauer" <klee(a)unibwm.de> wrote in message
> > Rod claimed, that "lsl eax, ebx" should be used instead of
> > "lsl eax, bx" because in 32 bit mode 32 bit registers are used
> > by default.

> As have all of you!

What have we all claimed? That in 32 bit mode 32 bit registers are used by default (which surely is true) or that therefore "lsl eax, ebx" should be used instead of "lsl eax, bx" (which surely is not true).

> Do you remember saying this?

> Herbert Kleebauer:
> "I don't see any difference in ldsl and for example the
> add instruction."

I don't understand what this quote has to do with it. Here is the full quote (with the corrected opcodes):

====================================================================
I don't understand what you want to say. I don't see any difference in ldsl and for example the add instruction. If you want a 32 bit operation in a 16 bit segment or a 16 bit operation in a 32 bit segment, then you have to prefix the instruction by 66.

          seg16
00000000: 0f 03 d0          ldsl.w r0,r1
00000003: 66 0f 03 d0       ldsl.l r0,r1
00000007: 01 c2             add.w r0,r1
00000009: 66 01 c2          add.l r0,r1

          seg32
0000000c: 66 0f 03 d0       ldsl.w r0,r1
00000010: 0f 03 d0          ldsl.l r0,r1
00000013: 66 01 c2          add.w r0,r1
00000016: 01 c2             add.l r0,r1

And for an address size switch you have to use 67:

          seg16
00000000: 0f 03 07          ldsl.w (r3.w),r0
00000003: 67 0f 03 03       ldsl.w (r3.l),r0
00000007: 66 0f 03 07       ldsl.l (r3.w),r0
0000000b: 67 66 0f 03 03    ldsl.l (r3.l),r0

          seg32
00000010: 67 66 0f 03 07    ldsl.w (r3.w),r0
00000015: 66 0f 03 03       ldsl.w (r3.l),r0
00000019: 67 0f 03 07       ldsl.l (r3.w),r0
0000001d: 0f 03 03          ldsl.l (r3.l),r0
====================================================================

This says that with ldsl you can use the data and address size prefix the same way as with the add instruction. In a 32 bit segment the data size prefix makes the operation a 16 bit operation which only modifies the lower half of the result register. But in any case, the source operand is a 16 bit operand, independent of any size prefix.

> > Frank proofed[sic], that when "lsl" is used with a memory
> > operand only a 16 bit memory access occurs
>
> Frank/Chuck/et al. proved on newer processors that the instruction uses
> 16 bits for the memory read. This is what the current Intel/AMD manuals
> state. They didn't prove that older processors don't read/write 32 bits from
> memory as their older manuals describe.

I checked it on an 845 MHz Celeron which surely isn't a newer processor.

> Nor did they prove anything about
> the size of the source register which was the entire issue. But, even if
> one could prove that all CPUs only accessed a 16-bit register, it still has

You can't prove this because there is absolutely no difference between only accessing the lower half of the register and accessing the full register and then discarding the upper half.

> nothing to do with the fact that the assembly syntax should be 32-bit for
> 32-bit mode for these two instructions as all manuals indicate.

The manual describes what happens: a 16 bit selector stored in the lower half of a general purpose register is used to access a segment descriptor, extract the 32 bit size of the segment from this descriptor and store this 32 bit value in the target register. Now the writer of the assembler has to find a symbolic representation for this instruction. And if I'm asked which one of these two lines

lsl eax, bx
lsl eax, ebx

better describes the instruction, then I surely would select the first one.

> > and it doesn't make
> > much sense to claim that in the register case the full 32 bit
> > register is accessed and the higher halve then is discarded.
>
> What claim? This is _exactly_ what is stated in the Intel manuals from at
> least 2003 to 2008 (and likely longer). Under LAR/LSL:
>
> "1. For all loads (regardless of source or destination sizing) only bits
> 16-0 are used. Other bits are ignored."
>
> So, you're saying that the Intel manuals have been continuously wrong for at
> least five years?... They've rewritten the manuals over and over and
> produced new manuals for 64-bit, but haven't noticed this specific error in
> half a decade? To me, "it doesn't make much sense to claim that in
> the register case the full 32 bit register" isn't accessed since the 386
> manual explicitly indicates this is so...

The manual describes the effect of the instruction (and both versions are a correct description) but we are talking about a proper symbolic representation of the instruction.

There is also an instruction which zero extends a 16 bit register (or if you like, the lower half of a 32 bit register) to a 32 bit value and stores it in a register. I use:

0f b7 c3    movu.wl r3,r0

Now, which one do you prefer in your syntax:

movzx eax,bx
movzx eax,ebx

I suppose the second one because this is the same as with the lsl instruction. But we can also zero extend a byte value:

0f b6 c3    movu.bl r3,r0

movzx eax,bl
movzx eax,ebx

With the same argument you also have to choose the second one here, but you can't use the same symbolic representation for different instructions.
From: Wolfgang Kern on 3 Sep 2008 15:42

Frank Kotler answered Rod Pemberton:

....[still about Caesar's Beard Rod ?]

> (current versions accept either, disassemble to "bx")

Yeah, and my solution on this may confirm that NASM is right on:

LSL eax,bx

because LSL/LAR eax,ebx won't make any sense anyway. And I'd see everything different to the

LSL eax,bx ;load eax with the segment limit of the segment
           ;given in BX (and nothing else)

interpretation as a bug.

What do you expect with:
LSL eax,ebx
to be different from
LSL eax,bx

The first above may lead to a (totally wrong) belief that the high word in ebx may change any functionality of LSL/LAR or friends.

Ok if it's just a matter of the targeted assembler dialect, this may be the job of the disassembler option flags, and nothing more...

Ok Rod, you're known and I appreciate your abilities to detect misinterpretations/inconsistent or contradictory wording much faster than we oldies may see it, but here: LSL/LAR and any other seg-mov/modify, you are just plain wrong.

SEG-reg-OPCODES behave really different to GP/MEM instructions by their (vectored instead of pipelined) nature and please stop telling us about REG-Address-Size defaults, because they aren't valid on "specials".

__
wolfgang