From: NathanCBaker on 12 Aug 2008 01:14

On Aug 12, 12:56 am, Willow <wrschlan...(a)gmail.com> wrote:
> On Aug 11, 7:41 pm, NathanCBa...(a)gmail.com wrote:
> > On Aug 11, 11:08 pm, Willow <wrschlan...(a)gmail.com> wrote:
>
> > > Now that you can actually run it... what do you think?
>
> > I like it. But I'd like the columns to be closer together. Here is
> > what I tested it with:
>
> Which columns? From the script file?

In the output. I suggest shortening the space between the "00000100"
and the "mov" and the "ax,0x0013" so that it is easier on the eyes in a
monospaced font.

> I modified the script file so it now produces this output for the same
> input (notice most of the redundant sizes are no longer there):
>
> 00000100 mov ax,0x0013
> 00000103 int 0x10
> 00000105 mov di,0xa000
> 00000108 mov es,di
> 0000010a mov di,0x7c5f
> 0000010d mov byte [es:di],0x0f
> 00000111 mov ah,0x00
> 00000113 int 0x16
> 00000115 cmp ah,0x01
> 00000118 je short 0x0144
> 0000011a cmp ah,0x1f
> 0000011d jne short 0x0124
> 0000011f mov ax,0x0013
> 00000122 int 0x10
> 00000124 cmp ah,0x1e
> 00000127 jne short 0x012d
> 00000129 sub di,0x0140
> 0000012d cmp ah,0x2c
> 00000130 jne short 0x0136
> 00000132 add di,0x0140
> 00000136 cmp ah,0x33
> 00000139 jne short 0x013c
> 0000013b dec di
> 0000013c cmp ah,0x34
> 0000013f jne short 0x0142
> 00000141 inc di
> 00000142 jmp short 0x010d
> 00000144 mov ax,0x0003
> 00000147 int 0x10
> 00000149 ret word 0x0000

That is better. Looking good!

I think 'crudasm' has potential to give 'ndisasm' some serious
competition. But, shh, don't let anyone on the Nasm team know that I
said that. :)

Nathan.
From: Alexei A. Frounze on 12 Aug 2008 01:47

On Aug 11, 7:05 pm, Willow <wrschlan...(a)gmail.com> wrote:
> I just finished my very own disassembler, written from scratch. It
> takes a 750-line input script file that specifies the x86 and x86-64
> instruction set, and produces a disassembler. Unlike other
> disassemblers, mine is enjoyable to work on because it is coherent,

Nope, because ... it's yours :)

> you have a script file that makes sense (to me at least :-) rather
> than a bunch of incoherent and often buggy opcode tables copied from
> an Intel manual.

You have AMD manuals too. You can always crosscheck them.

> You should check it out and let me know what you think!

Not too bad. Although I wouldn't throw exceptions in a disassembler and
print errors (IMO, the errors should translate to opcode bytes and
question marks (for the instruction) and error return codes, which the
caller can then use as they wish (continue disassembly or print an error
or whatever)).

I would also avoid unreadable things like x86c_decoder_table[]. And...
I would make it output to a buffer first so the output can be reused
immediately without the need to capture stdout.

If you want this code to be reusable, it needs to have very clear APIs
(especially for the input and output), and the logic that controls the
disassembly engine should be flexible and controllable (or better yet
specifiable) by the caller.

Furthermore, I'd move the script into the main program so it doesn't get
corrupted or exploited and the disassembler can be used in any
environment (including a debugger with possibly limited disk access
(say, the debugger runs at a priority level higher than the disk
interrupts and hence using the disk I/O would just hang the OS)).

> It's called crudasm, the crude disassembler. Right now it only works
> in 16 and 32 bit mode, and only supports raw binary files (e.g. no PE
> etc. files).
>
> You can find it here: http://code.google.com/p/vm64dec/downloads/list
>
> I plan to update crudasm to make it more intelligent in the next
> release.
> In the future I will add floating point, MMX, SSE, etc. instructions
> but they're not supported yet. I will also update the script file to
> contain semantics not just syntax so the disassembler can be like
> sourcer, e.g. it knows mov ax,5 loads ax to 5, etc.

More intelligent? :) How much more? I think the first goal would be to
make it disassemble everything and make the output reassembleable (does
this word exist? :). Then you'd probably want to disambiguate the
disassembly (maybe through a special command option) so that the
reassembly gives you the exact binary that you disassembled.

> Although I am proud of this,

I'm sure you are. No doubt.

> and I hope I don't get flamed for being a
> newbie or something...It took a lot of work to get to this point.
> Hopefully it's all downhill from here. You're probably wondering, why
> another disassembler? There is no good reason, I wrote this for the
> experience of developing my own tool not because the world needs
> another disassembler.

Good that you understand it. :)

> Mine is not as good as the one that comes with
> nasm (less opcodes) or anything but it's my very own program!

Yep. Compare them against yours.

> If you do download it, check out x86c/script.txt and let me know what
> you think... if you have any questions about what the fields mean (the
> script is a space-separated list) then ask me.

Alex
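The interface Alex describes (results returned in a buffer, undecodable bytes shown as opcode bytes plus question marks, and a status code instead of an exception) could look roughly like the sketch below. It is only an illustration, written in Python for brevity: the names `Insn`, `decode_one`, `disassemble`, `format_line`, the status constants, and the five-entry opcode table are all hypothetical and are not taken from crudasm.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical status codes; crudasm's real API is not shown in this thread.
OK, ERR_UNKNOWN_OPCODE, ERR_TRUNCATED = 0, 1, 2

@dataclass
class Insn:
    offset: int   # address of the first byte
    raw: bytes    # the bytes consumed for this entry
    text: str     # mnemonic and operands, or "???" on error
    status: int   # OK or one of the ERR_* codes

# Toy 16-bit table (an assumption for this sketch):
# opcode -> (format string, number of imm8 bytes that follow)
TABLE = {
    0x47: ("inc di", 0),
    0x4F: ("dec di", 0),
    0xB4: ("mov ah,0x{:02x}", 1),
    0xC3: ("ret", 0),
    0xCD: ("int 0x{:02x}", 1),
}

def decode_one(code: bytes, pos: int, origin: int = 0) -> Insn:
    """Decode one instruction at code[pos]; never raises, never prints."""
    entry = TABLE.get(code[pos])
    if entry is None:
        # Unknown opcode: report the byte itself and question marks.
        return Insn(origin + pos, code[pos:pos + 1], "???", ERR_UNKNOWN_OPCODE)
    fmt, nimm = entry
    end = pos + 1 + nimm
    if end > len(code):
        # Operand bytes are missing: hand back whatever is left.
        return Insn(origin + pos, code[pos:], "???", ERR_TRUNCATED)
    return Insn(origin + pos, code[pos:end], fmt.format(*code[pos + 1:end]), OK)

def disassemble(code: bytes, origin: int = 0) -> List[Insn]:
    """Return every decoded entry in a list; the caller decides what to do."""
    out = []
    pos = 0
    while pos < len(code):
        insn = decode_one(code, pos, origin)
        out.append(insn)
        pos += len(insn.raw)  # always at least 1, so the loop terminates
    return out

def format_line(insn: Insn) -> str:
    """One possible text rendering: offset, raw bytes, instruction text."""
    return "{:08x} {:<8} {}".format(insn.offset, insn.raw.hex(), insn.text)

if __name__ == "__main__":
    for insn in disassemble(bytes([0xB4, 0x00, 0xCD, 0x16, 0x0F, 0x4F]), 0x111):
        print(format_line(insn))
```

With this shape the caller can keep going after a bad byte, stop at the first non-OK status, or render the list in any format, without capturing stdout.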
From: Wolfgang Kern on 12 Aug 2008 06:07

Willow posted:

> I just finished my very own disassembler, written from scratch. It
> takes a 750-line input script file that specifies the x86 and x86-64
> instruction set, and produces a disassembler.

Are you sure you cover all instructions, incl. the 64-bit ones, with
750 lines? I never counted all prototype variations, but I think there
are many more.

> Unlike other disassemblers, mine is enjoyable to work on because it
> is coherent, you have a script file that makes sense (to me at least :-)

Fine, my own 'DisAss' (see HEXTUTOR) uses replaceable tables instead,
so it can be used for other architectures too.

> rather than a bunch of incoherent and often buggy opcode tables copied
> from an Intel manual.

AMD manuals are the better source for this.

> You should check it out and let me know what you think!
> It's called crudasm, the crude disassembler. Right now it only works
> in 16 and 32 bit mode, and only supports raw binary files (e.g. no PE
> etc. files).
> You can find it here: http://code.google.com/p/vm64dec/downloads/list

Oh, the 2.3 MB unzipped show immediately that it's written HLL-style :)

> I plan to update crudasm to make it more intelligent in the next
> release.
> In the future I will add floating point, MMX, SSE, etc. instructions
> but they're not supported yet. I will also update the script file to
> contain semantics not just syntax so the disassembler can be like
> sourcer, e.g. it knows mov ax,5 loads ax to 5, etc.

When I look at the ins/...c, is your final target a C resourcer?

> Although I am proud of this, and I hope I don't get flamed for being a
> newbie or something...

We are all proud of our own work :)

> It took a lot of work to get to this point.

I can confirm this from my own experience.

> Hopefully it's all downhill from here. You're probably wondering, why
> another disassembler? There is no good reason, I wrote this for the
> experience of developing my own tool not because the world needs
> another disassembler. Mine is not as good as the one that comes with
> nasm (less opcodes) or anything but it's my very own program!

Yeah, and a very good method for learning CPU internals anyway.

> If you do download it, check out x86c/script.txt and let me know what
> you think... if you have any questions about what the fields mean (the
> script is a space-separated list) then ask me.

Seems you took an approach similar to the one I started my DisAss with,
with enough information to later feed a fully automated but static code
analyser.

Not to pick on your code style, but my whole disassembler core is
shorter (20 Kbyte machine code incl. tables, FPU/XMM/MMX/SSE2,
text-buffers, register values/stack-trace and a detailed info struct)
than your 27 KB script :)

__
wolfgang
From: Karel Lejska on 12 Aug 2008 09:22

On Aug 12, 4:05 am, Willow <wrschlan...(a)gmail.com> wrote:
> I just finished my very own disassembler, written from scratch. It
> takes a 750-line input script file that specifies the x86 and x86-64
> instruction set, and produces a disassembler. Unlike other
> disassemblers, mine is enjoyable to work on because it is coherent,
> you have a script file that makes sense (to me at least :-) rather
> than a bunch of incoherent and often buggy opcode tables copied from
> an Intel manual.
>
> You should check it out and let me know what you think!
> It's called crudasm, the crude disassembler. Right now it only works
> in 16 and 32 bit mode, and only supports raw binary files (e.g. no PE
> etc. files).
>
> You can find it here: http://code.google.com/p/vm64dec/downloads/list
>
> I plan to update crudasm to make it more intelligent in the next
> release.
> In the future I will add floating point, MMX, SSE, etc. instructions
> but they're not supported yet. I will also update the script file to
> contain semantics not just syntax so the disassembler can be like
> sourcer, e.g. it knows mov ax,5 loads ax to 5, etc.
>
> Although I am proud of this, and I hope I don't get flamed for being a
> newbie or something...It took a lot of work to get to this point.
> Hopefully it's all downhill from here. You're probably wondering, why
> another disassembler? There is no good reason, I wrote this for the
> experience of developing my own tool not because the world needs
> another disassembler. Mine is not as good as the one that comes with
> nasm (less opcodes) or anything but it's my very own program!
>
> If you do download it, check out x86c/script.txt and let me know what
> you think... if you have any questions about what the fields mean (the
> script is a space-separated list) then ask me.

Hi Willow,

so you developed another format to store the information? If you know
XML/XSL, you could consider generating the disassembler out of this
file:

http://ref.x86asm.net/x86reference.xml

I spent quite a lot of time on this. Or you could be interested in the
HTML editions at least:

http://ref.x86asm.net/coder.html
http://ref.x86asm.net/geek.html

They should answer all your questions regarding validity/support for
any opcode.
From: Herbert Kleebauer on 12 Aug 2008 12:59
Willow wrote:

> I just finished my very own disassembler, written from scratch. It
>
> You should check it out and let me know what you think!

Just disassembled the different addressing modes, but you generate the
same source for different binaries:

not word [es:bx+0x0002]                 26 f7 57 02
not word [es:bx+0x0002]                 26 f7 97 0002
not word [es:dword eax+0x00000002]      67 26 f7 50 02
not word [es:dword eax+0x00000002]      67 26 f7 90 00000002

I think it would be better to use:

not word [es:bx+0x02]                   26 f7 57 02
not word [es:bx+0x0002]                 26 f7 97 0002
not word [es:dword eax+0x02]            67 26 f7 50 02
not word [es:dword eax+0x00000002]      67 26 f7 90 00000002


not al                                  f6 d0                 not.b r0
not ah                                  f6 d4                 not.b m0
not ax                                  f7 d0                 not.w r0
not eax                                 66 f7 d0              not.l r0

not byte [es:0x0064]                    26 f6 16 0064         not.b 100{s1}
not word [es:bx]                        26 f7 17              not.w (r3.w){s1}
not dword [es:si]                       66 26 f7 14           not.l (r5.w){s1}
not byte [es:di]                        26 f6 15              not.b (r6.w){s1}
not word [es:bx+si]                     26 f7 10              not.w (r3.w,r5.w){s1}
not dword [es:bx+di]                    66 26 f7 11           not.l (r3.w,r6.w){s1}
not byte [es:bp+si]                     26 f6 12              not.b (r4.w,r5.w){s1}
not word [es:bp+di]                     26 f7 13              not.w (r4.w,r6.w){s1}

not word [es:bx+0x0002]                 26 f7 57 02           not.w 2.b(r3.w){s1}
not word [es:si+0x0002]                 26 f7 54 02           not.w 2.b(r5.w){s1}
not word [es:di+0x0002]                 26 f7 55 02           not.w 2.b(r6.w){s1}
not word [es:bx+si+0x0002]              26 f7 50 02           not.w 2.b(r3.w,r5.w){s1}
not word [es:bx+di+0x0002]              26 f7 51 02           not.w 2.b(r3.w,r6.w){s1}
not word [es:bp+si+0x0002]              26 f7 52 02           not.w 2.b(r4.w,r5.w){s1}
not word [es:bp+di+0x0002]              26 f7 53 02           not.w 2.b(r4.w,r6.w){s1}

not word [es:bx+0x0002]                 26 f7 97 0002         not.w 2(r3.w){s1}
not word [es:si+0x0002]                 26 f7 94 0002         not.w 2(r5.w){s1}
not word [es:di+0x0002]                 26 f7 95 0002         not.w 2(r6.w){s1}
not word [es:bx+si+0x0002]              26 f7 90 0002         not.w 2(r3.w,r5.w){s1}
not word [es:bx+di+0x0002]              26 f7 91 0002         not.w 2(r3.w,r6.w){s1}
not word [es:bp+si+0x0002]              26 f7 92 0002         not.w 2(r4.w,r5.w){s1}
not word [es:bp+di+0x0002]              26 f7 93 0002         not.w 2(r4.w,r6.w){s1}

not word [es:dword 0x00000064]          67 26 f7 15 00000064  not.w 100.l{s1}
not word [es:dword eax]                 67 26 f7 10           not.w (r0){s1}
not word [es:dword eax+0x00000002]      67 26 f7 50 02        not.w 2.b(r0){s1}
not word [es:dword eax+0x00000002]      67 26 f7 90 00000002  not.w 2(r0){s1}

not word [dword eax+edx]                67 f7 14 10           not.w (r0,r1)
not word [dword eax+edx*2]              67 f7 14 50           not.w (r0,r1*2)
not word [dword eax+edx*4]              67 f7 14 90           not.w (r0,r1*4)
not word [dword eax+edx*8]              67 f7 14 d0           not.w (r0,r1*8)

not word [dword eax+0x00000002]         67 f7 90 00000002     not.w 2(r0)
not word [dword eax*2+0x00000002]       67 f7 14 45 00000002  not.w 2(r0*2)
not word [dword eax*4+0x00000002]       67 f7 14 85 00000002  not.w 2(r0*4)
not word [dword eax*8+0x00000002]       67 f7 14 c5 00000002  not.w 2(r0*8)

not word [dword eax+edx+0x00000002]     67 f7 54 10 02        not.w 2.b(r0,r1)
not word [dword eax+edx*2+0x00000002]   67 f7 54 50 02        not.w 2.b(r0,r1*2)
not word [dword eax+edx*4+0x00000002]   67 f7 54 90 02        not.w 2.b(r0,r1*4)
not word [dword eax+edx*8+0x00000002]   67 f7 54 d0 02        not.w 2.b(r0,r1*8)

not word [dword eax+edx+0x00000002]     67 f7 94 10 00000002  not.w 2(r0,r1)
not word [dword eax+edx*2+0x00000002]   67 f7 94 50 00000002  not.w 2(r0,r1*2)
not word [dword eax+edx*4+0x00000002]   67 f7 94 90 00000002  not.w 2(r0,r1*4)
not word [dword eax+edx*8+0x00000002]   67 f7 94 d0 00000002  not.w 2(r0,r1*8)
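Herbert's point is that the printed displacement should be as wide as the encoded one, so that disp8 and disp16 forms of the same address stay distinguishable and can be reassembled to the original bytes. A minimal sketch of that idea follows, in Python and restricted to `not word [mem]` (opcode F7 /2) with 16-bit addressing and an optional ES prefix. The function name `format_not_mem16` and the output syntax are assumptions for illustration, not crudasm's code. Note that the listing above writes a 16-bit displacement as one value (`0002`), while the bytes in memory are little-endian (`02 00`).

```python
# Base/index combinations selected by the r/m field in 16-bit addressing.
RM16 = ["bx+si", "bx+di", "bp+si", "bp+di", "si", "di", "bp", "bx"]

def format_not_mem16(code: bytes) -> str:
    """Format `not word [mem]` (F7 /2) so the displacement keeps its encoded width.

    Hypothetical helper for this sketch: handles only an optional ES prefix
    (26), opcode F7 with reg field 2, and memory operands in 16-bit mode.
    """
    pos = 0
    seg = ""
    if code[pos] == 0x26:          # ES segment override prefix
        seg = "es:"
        pos += 1
    if code[pos] != 0xF7:
        raise ValueError("only opcode F7 is handled in this sketch")
    modrm = code[pos + 1]
    mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
    if reg != 2 or mod == 3:
        raise ValueError("only `not` with a memory operand is handled")
    pos += 2

    if mod == 0 and rm == 6:       # special case: bare 16-bit address
        disp = int.from_bytes(code[pos:pos + 2], "little")
        return f"not word [{seg}0x{disp:04x}]"
    base = RM16[rm]
    if mod == 0:                   # no displacement byte at all
        return f"not word [{seg}{base}]"
    if mod == 1:                   # disp8, sign-extended by the CPU
        disp = code[pos]
        if disp >= 0x80:
            return f"not word [{seg}{base}-0x{0x100 - disp:02x}]"
        return f"not word [{seg}{base}+0x{disp:02x}]"
    # mod == 2: disp16, printed with four hex digits
    disp = int.from_bytes(code[pos:pos + 2], "little")
    return f"not word [{seg}{base}+0x{disp:04x}]"

if __name__ == "__main__":
    print(format_not_mem16(bytes.fromhex("26f75702")))    # disp8 form
    print(format_not_mem16(bytes.fromhex("26f7970200")))  # disp16 form
```

The two calls at the bottom correspond to the first two rows of the "better" listing: the same effective address, but two different texts because the encodings differ.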