From: H. Peter Anvin on 20 Aug 2008 20:43

Alexei A. Frounze wrote:
>>
>> Plain 90h, which would normally be XCHG EAX,EAX (and therefore
>> zero-extend EAX into RAX) is actually NOP.
>
> I might have misspoken, but the appropriate REX followed by 0x90 isn't
> a NOP, it's XCHG rAX, r8.
>

90    = nop        = nop [would have been xchg eax,eax]
40 90 = rex nop    = nop [would have been xchg eax,eax]
41 90 = rex.b nop  = xchg eax,r8d
48 90 = rex.w nop  = xchg rax,rax
49 90 = rex.wb nop = xchg rax,r8

-hpa
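
For anyone writing a decoder, the table in hpa's post can be sketched as a small lookup on the REX.W and REX.B bits. This is only an illustration of that table, not code from crudasm or any other tool; the function name decode_90 is made up for this example, and it only handles a single optional REX byte in front of opcode 90h in 64-bit mode.

```python
from typing import Optional


def decode_90(rex: Optional[int] = None) -> str:
    """Decode opcode 0x90 in 64-bit mode, given an optional REX prefix byte.

    Hypothetical helper: mirrors the five rows of the table above.
    """
    if rex is None:
        return "nop"  # plain 90h is special-cased as NOP
    if not 0x40 <= rex <= 0x4F:
        raise ValueError("not a REX prefix: %#x" % rex)

    w = bool(rex & 0x08)  # REX.W selects 64-bit operand size
    b = bool(rex & 0x01)  # REX.B extends the register field, so rAX's partner becomes r8

    if not b:
        # The partner register is still rAX itself.
        return "xchg rax,rax" if w else "nop"
    return "xchg rax,r8" if w else "xchg eax,r8d"


if __name__ == "__main__":
    for prefix in (None, 0x40, 0x41, 0x48, 0x49):
        shown = "   " if prefix is None else "%02x " % prefix
        print(shown + "90 = " + decode_90(prefix))
```

The REX.R and REX.X bits are ignored here because opcode 90h has no ModRM or SIB byte for them to extend.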
From: Willow on 21 Aug 2008 00:39

I have xchg eax,eax and xchg rax,rax as nop. Is this wrong?

Below is the output from my new intelligent disassembler. It's called crudcom (Crude Decompiler -- that's what it aspires to be, not what it is!) and it's based on crudasm, only the output actually assembles this time! You must specify the entrypoint. It will flag with a star (*) comment anything it was unable to follow -- far control transfer instructions, indirect control transfer instructions, etc. You have to manually help it by providing a script file (currently unimplemented) when it flags something.

crudcom uses fn_<address> and loc_<address> labels. I disassembled a real program and reassembled it -- nasm shrank it a little bit but there were no assembler errors! (Unfortunately the resulting binary is worthless because the offsets aren't exactly the same, owing to the shrinkage.)

You can find crudcom1.exe and associated source code bundled (as well as the latest version of crudasm) in the latest version of VmDec here:

http://code.google.com/p/vm64dec/downloads/list

I also added semantics to the script file so data flow analysis can be added to later versions of crudcom (the intelligent disassembler / decompiler, successor to crudasm). For example, this is the semantics for test:

asgn(x86_of, 0);
asgn(x86_af, undefined);
asgn(x86_cf, 0);
asgn(tmp(result), bitand(arg(0), arg(1)));
asgn(x86_sf, sign(tmp(result)));
asgn(x86_zf, is_zero(tmp(result)));
asgn(x86_pf, _x86_parity(trunc$byte(tmp(result))));

See x86s/in_script.txt and x86s/x86s_semantics.h for details.

I have developed a program (not included currently) that converts 32-bit DLLs/EXEs into binary image files and provides a list of entrypoints, so crudasm/crudcom can be used on non-raw binary files. I want to incorporate a good loader into a future version that takes into account relocations and external invocations, e.g. so we get some 'off_<address>' labels, and so that e.g. a call to RtlSomethingOrAnother will be understood.

I plan to add data flow analysis to crudcom2, following in the footsteps of dcc. It will still disassemble, not decompile, but the comments will now report what registers and flags are input/output by any given function. So-called decompilable functions will be listed in depth-first-search order in the output assembly listing, so you can tell what's going on. That is what's planned for the future.

crudcom3 will (supposedly) be the beginnings of a decompiler. It will actually generate C-style code, and will support loading of non-raw binary executable/shared libraries, and will support a helper script file for global data (you can't decompile a static linked list very easily) and allow the user to specify other entrypoints (such as an array of function pointers). It will recognize indirect jumps (switch statements) automatically, however.

The output of crudcom3 will be generated from the semantics part of the script file. It will still make use of registers and flags, but crudcom4 will remove these and replace them with identifiers and expressions respectively. At least that's how I have it planned :-)

I understand there are already some decompilers out there, but I wanted to make my own.

Willow

--- begin code ---
org 0x100

; Calls: fn_119
fn_100:
    jmp word loc_10d
loc_103:
    db 0x48, 0x69, 0x21, 0xd, 0xa, 0x24
loc_109:
    int 0x21
loc_10b:
    jmp short loc_10b
loc_10d:
    mov dx,0x0103
    call word fn_119
    mov ah,0x4c
    jmp short loc_109
loc_117:
    db 0xeb, 0xfe

; Calls:
fn_119:
    mov ah,0x09
    int 0x21
    ret
loc_11e:
    db 0xb8, 0xc3, 0x90, 0x90, 0x90, 0x90, 0xd5, 0x8, 0xd4, 0xa, 0xd9, 0xf4, 0xe8, 0xfd, 0xff, 0xff
loc_12e:
    db 0x10, 0xeb, 0xfe, 0xe9, 0xfd, 0xff, 0xff, 0x20
From: Alexei A. Frounze on 21 Aug 2008 01:38

On Aug 20, 5:43 pm, "H. Peter Anvin" <h...(a)zytor.com> wrote:
> Alexei A. Frounze wrote:
>
> >> Plain 90h, which would normally be XCHG EAX,EAX (and therefore
> >> zero-extend EAX into RAX) is actually NOP.
>
> > I might have misspoken, but the appropriate REX followed by 0x90 isn't
> > a NOP, it's XCHG rAX, r8.
>
> 90 = nop = nop [would have been xchg eax,eax]
> 40 90 = rex nop = nop [would have been xchg eax,eax]

Yep.

> 41 90 = rex.b nop = xchg eax,r8d

Yep.

> 48 90 = rex.w nop = xchg rax,rax

Should be a NOP effectively.

> 49 90 = rex.wb nop = xchg rax,r8

Same here.

So, did you do the last two under a debugger? If you did, on what CPU brand? Intel, AMD or both?

Alex
From: H. Peter Anvin on 21 Aug 2008 01:59

Alexei A. Frounze wrote:
>
>> 48 90 = rex.w nop = xchg rax,rax
>
> Should be a NOP effectively.
>
>> 49 90 = rex.wb nop = xchg rax,r8
>
> Same here.

Not a NOP, certainly...

>
> So, did you do the last two under a debugger? If you did, on what CPU
> brand? Intel, AMD or both?
>

Not under a debugger, but yes, I executed them (inside a small C program). AMD Athlon X2 4200+ (socket 939).

-hpa
From: Willow on 21 Aug 2008 02:05

On Aug 20, 9:38 pm, "Alexei A. Frounze" <alexfrun...(a)gmail.com> wrote:
> On Aug 20, 5:43 pm, "H. Peter Anvin" <h...(a)zytor.com> wrote:
>
> > Alexei A. Frounze wrote:
>
> > >> Plain 90h, which would normally be XCHG EAX,EAX (and therefore
> > >> zero-extend EAX into RAX) is actually NOP.
>
> > > I might have misspoken, but the appropriate REX followed by 0x90 isn't
> > > a NOP, it's XCHG rAX, r8.
>
> > 90 = nop = nop [would have been xchg eax,eax]
> > 40 90 = rex nop = nop [would have been xchg eax,eax]
>
> Yep.
>
> > 41 90 = rex.b nop = xchg eax,r8d
>
> Yep.
>
> > 48 90 = rex.w nop = xchg rax,rax
>
> Should be a NOP effectively.
>
> > 49 90 = rex.wb nop = xchg rax,r8
>
> Same here.
>

This is where you're wrong. r8 is not rax, so xchg rax,r8 is not a no-op instruction!

> So, did you do the last two under a debugger? If you did, on what CPU
> brand? Intel, AMD or both?
>
> Alex

Willow
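
The disagreement above comes down to what each encoding does to the register file. A toy model may help, assuming the standard 64-bit-mode rule that a 32-bit register write zero-extends into the full 64-bit register. The xchg function and the two-register dict are invented for this sketch; nothing here was run on real hardware.

```python
MASK32 = 0xFFFFFFFF


def xchg(regs: dict, a: str, b: str, bits: int) -> dict:
    """Return a new register file after exchanging registers a and b.

    Hypothetical model: registers are stored as 64-bit values under their
    64-bit names; bits is the operand size (32 or 64).
    """
    out = dict(regs)
    if bits == 64:
        out[a], out[b] = regs[b], regs[a]
    else:
        # 32-bit writes clear the upper 32 bits of the destination.
        out[a], out[b] = regs[b] & MASK32, regs[a] & MASK32
    return out


if __name__ == "__main__":
    regs = {"rax": 0x1111111122222222, "r8": 0x3333333344444444}

    # 48 90: xchg rax,rax leaves everything unchanged.
    print(xchg(regs, "rax", "rax", 64) == regs)

    # 49 90: xchg rax,r8 swaps two different registers, so it is not a no-op.
    print(xchg(regs, "rax", "r8", 64) == regs)

    # A literal 32-bit xchg eax,eax would clear the top half of rax,
    # which is why plain 90h has to be special-cased as NOP.
    print(hex(xchg(regs, "rax", "rax", 32)["rax"]))
```

In this model only the rax,rax exchange at 64-bit operand size is state-preserving, which matches Willow's objection about r8.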