From: Wolfgang Kern on

"Rod Pemberton" mentioned:

.....
> ;These offsets should be the byte offset forms of mov...
> mov al,0x33 ;mov al,0x33
> mov ax,0x0033 ;mov ax,0x33
> mov eax,0x00000033 ;mov eax,0x33
> mov bl,0x33 ;mov bl,0x33
> mov bx,0x0033 ;mov bx,0x33
> mov ebx,0x00000033 ;mov ebx,0x33

MOV (w/dw)reg,imm has no sign-extended byte-immediate version; those
are only available in the ADD..CMP group, IMUL and PUSH.
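
To make the point concrete, here is a minimal sketch of the encodings involved. It assumes 16-bit mode with no prefixes and ignores the accumulator short forms; the helper names mov_r16_imm and add_r16_imm are made up for this illustration and do not come from any tool discussed in this thread.

```python
# Sketch of 16-bit-mode encodings (no prefixes). Register numbers follow
# the standard x86 register encoding.
REG16 = {"ax": 0, "cx": 1, "dx": 2, "bx": 3, "sp": 4, "bp": 5, "si": 6, "di": 7}

def mov_r16_imm(reg, imm):
    """MOV r16,imm16: opcode B8+r, always followed by a full 16-bit immediate."""
    return bytes([0xB8 + REG16[reg]]) + (imm & 0xFFFF).to_bytes(2, "little")

def add_r16_imm(reg, imm):
    """ADD r16,imm: 83 /0 ib if imm fits a signed byte, otherwise 81 /0 iw."""
    modrm = 0xC0 | (0 << 3) | REG16[reg]  # mod=11, /0 selects ADD, rm=register
    if -128 <= imm <= 127:
        return bytes([0x83, modrm, imm & 0xFF])
    return bytes([0x81, modrm]) + (imm & 0xFFFF).to_bytes(2, "little")

print(mov_r16_imm("bx", 0x33).hex())    # bb3300
print(add_r16_imm("bx", 0x33).hex())    # 83c333
print(add_r16_imm("bx", 0x1234).hex())  # 81c33412
```

So 'mov bx,0x33' always carries the full word 0x0033, while 'add bx,0x33' can carry a single sign-extended byte.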

[call/jmp far ...]
Would NASM not understand it in my preferred order?
CALL FAR dword ...

[xchg...]
....
> xchg byte [0xbbee],al ;xchg al,[0xbbee]
....
Yes, but for memory operands there is only one opcode, so it wouldn't
make any difference. My disassembler also shows it as 'xchg [mem],reg'.
Only the two-byte register-to-register XCHG codes exist in both
directions, so there the operand order does affect the code checksum.
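
A small sketch of why the register form is ambiguous: with mod=11 either register can sit in the reg field or the r/m field of the ModRM byte, whereas a memory operand can only go in r/m. The helper xchg_r8_r8 is a hypothetical name used only for this illustration.

```python
# 8-bit register numbers as used in the ModRM byte.
REG8 = {"al": 0, "cl": 1, "dl": 2, "bl": 3, "ah": 4, "ch": 5, "dh": 6, "bh": 7}

def xchg_r8_r8(rm, reg):
    """XCHG r/m8,r8: opcode 86 /r with mod=11 (register-direct)."""
    return bytes([0x86, 0xC0 | (REG8[reg] << 3) | REG8[rm]])

a = xchg_r8_r8("bl", "al")  # bl in the r/m field, al in the reg field
b = xchg_r8_r8("al", "bl")  # same exchange with the fields swapped
print(a.hex(), b.hex())     # 86c3 86d8
print(sum(a) != sum(b))     # True: a simple byte-sum checksum differs
```

Both byte sequences perform the same exchange, so a disassemble/reassemble round trip can change the bytes unless the tool remembers which form it saw.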


> ;unique decoding for string instructions
> cmps byte [si],byte [es:di] ;cmpsb
> cmps dword [si],dword [es:di] ;cmpsd
> cmps word [si],word [es:di] ;cmpsw
> ins byte [es:di],dx ;insb
> ins dword [es:di],dx ;insd
> ins word [es:di],dx ;insw
> lods byte [si] ;lodsb
> lods dword [si] ;lodsd
> lods word [si] ;lodsw
> movs byte [es:di],byte [si] ;movsb
> movs dword [es:di],dword [si] ;movsd
> movs word [es:di],word [si] ;movsw
> outs dx,byte [si] ;outsb
> outs dx,dword [si] ;outsd
> outs dx,word [si] ;outsw
> scas byte [es:di] ;scasb
> scas dword [es:di] ;scasd
> scas word [es:di] ;scasw
> stos byte [es:di] ;stosb
> stos dword [es:di] ;stosd
> stos word [es:di] ;stosw

Right, but I'd keep this additional information and put it after
a semicolon as a very convenient help comment.

__
wolfgang



From: Willow on
I just rewrote the decoder (the core of the disassembler). It's a lot
nicer now, check out the x86s/ directory (was called x86c/).

As for the disassembler, this is an application of the decoder, and
crudasm was designed from the beginning to be a "crude" throw-away
disassembler that has the ability to be good later on. I wrote it as a
learning experience, so I make cheap mistakes only.

Here's how it is supposed to work:

machine code --> icode --> assembly language

The decoder core (x86s) reads machine code and produces intermediate
low-level code (icode). The icode contains the instruction number and
any arguments. Note that this process works in reverse too:

assembly language --> icode --> machine code

That is how it works on paper. It's a bit messier in practice!
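
The two-way pipeline can be sketched with a toy decoder for a few one-byte string instructions. This is only an illustration under stated assumptions: 16-bit mode, only the 0x66 operand-size prefix is handled, and the Icode class, its fields and the OPCODES table are invented here. They are not crudasm's actual icode layout or script format.

```python
from dataclasses import dataclass

# Hypothetical mini-table: byte-form opcode -> base mnemonic.
OPCODES = {0xA4: "movs", 0xA6: "cmps", 0xAA: "stos", 0xAC: "lods", 0xAE: "scas"}

@dataclass
class Icode:
    insn: str    # instruction identifier (a real decoder would use a number)
    size: int    # operand size in bytes: 1, 2 or 4
    length: int  # encoded length in bytes

def decode(code, pos=0):
    """machine code -> icode, assuming 16-bit mode."""
    start = pos
    opsize = 2
    if code[pos] == 0x66:          # operand-size prefix selects 32-bit operands
        opsize = 4
        pos += 1
    op = code[pos]
    base = op & 0xFE               # low bit clear = byte form, set = word/dword
    if base not in OPCODES:
        raise ValueError(f"unimplemented opcode {op:#04x}")
    size = opsize if op & 1 else 1
    return Icode(OPCODES[base], size, pos + 1 - start)

def to_text(ic):
    """icode -> assembly language."""
    return ic.insn + {1: "b", 2: "w", 4: "d"}[ic.size]

def encode(ic):
    """icode -> machine code (the reverse direction)."""
    base = next(op for op, name in OPCODES.items() if name == ic.insn)
    prefix = b"\x66" if ic.size == 4 else b""
    return prefix + bytes([base | (0 if ic.size == 1 else 1)])

code = bytes.fromhex("a666a7a7")
pos, out = 0, []
while pos < len(code):
    ic = decode(code, pos)
    out.append(to_text(ic))
    pos += ic.length
print(out)  # ['cmpsb', 'cmpsd', 'cmpsw']
```

Keeping the text generation (to_text) separate from decode is the point of the intermediate form: the same Icode value can feed a printer, an encoder, or later analysis.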

I plan to add semantics to the script file -- right now it covers
disassembling (generating text from icode) and decoding icode from
machine code.

The idea is to build a tree (actually a directed acyclic graph) as you
walk through code. An icode structure is only 28 bytes, so it's easy
to convert a whole binary image into icode if you can multiply the
file size by 28 and it still fits in memory. I do not plan to do this;
it's just possible.
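
As a rough worked example of that estimate: the 28-byte figure is from the post above, while the function name and the 64 KiB image size are just assumptions for illustration. The bound is a worst case where every byte of the image is treated as the start of its own icode entry.

```python
ICODE_SIZE = 28  # bytes per icode structure, as stated above

def worst_case_icode_bytes(image_size):
    """Upper bound: one icode entry for every byte of the image."""
    return image_size * ICODE_SIZE

print(worst_case_icode_bytes(64 * 1024))  # 1835008, i.e. 1.75 MiB for a 64 KiB image
```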

One of my objectives is to get Windows 3.11 to run on FreeDOS 1.0.
Somehow I think an intelligent disassembler will come in handy
here :-)

The latest version is available here:
http://code.google.com/p/vm64dec/downloads/list?ts=1218806623&saved=1

willow

From: Rod Pemberton on
"Wolfgang Kern" <nowhere(a)never.at> wrote in message
news:g83f3f$226$1(a)newsreader2.utanet.at...
> "Rod Pemberton" mentioned:
> ....
> > ;These offsets should be the byte offset forms of mov...
> > mov al,0x33 ;mov al,0x33
> > mov ax,0x0033 ;mov ax,0x33
> > mov eax,0x00000033 ;mov eax,0x33
> > mov bl,0x33 ;mov bl,0x33
> > mov bx,0x0033 ;mov bx,0x33
> > mov ebx,0x00000033 ;mov ebx,0x33
>
> MOV (w/dw)reg,imm doesn't exist in extended byte versions, these
> are only available in the ADD..CMP group, IMUL and PUSH.
>

Oh, yes, you are correct. One is still wrong though... ;-) Ndisasm.exe
from NASM 2.03.01 has the problem of not displaying the proper sized offset
with 'mov'.


Rod Pemberton

From: Willow on
On Aug 14, 10:16 pm, "Rod Pemberton" <do_not_h...(a)nohavenot.cmm>
wrote:
[snip]
> ;Crudasm .07 (or 1.07?) 16-bit decoding on the left, Ndisasm 2.03.01 on the
> right after the semicolon. Use a fixed width font, i.e., notepad.
>
> o32 iretd ;iretd
> lar eax,ebx ;lar eax,bx
> ret word 0x0000 ;ret
>

Thank you for finding these bugs! It was "easy" to fix the script
file, although I rewrote the decoder part of the disassembler.
You should check out the latest version (see output below) - it fixes
all known issues that you pointed out.
If you have time, can you repeat the experiment on the latest version
and let me know how it goes?
Thanks a bunch!!!

Because of the rewrite I renamed the program from crudasm1 to
crudasm2, and I also included a utility that walks through code
similar to how a decompiler might approach things. See
crudasm/walker.cpp and the included walker executable.

The next thing to do is test 64-bit instructions. There will be some
"unimplemented" warnings, but because of the rewrite it's going to be
easy to add 64-bit support now. After 64-bit support come the FPU
instructions. Then I can see about running Windows 3.11 on top of
FreeDOS 1.0. This will require both dynamic (using Bochs and the
crudasm core) and static analysis to find out what changes to FreeDOS
are needed to run Windows 3.11... I plan to make those changes...

The latest version is available here: http://code.google.com/p/vm64dec/downloads/list

--- What follows is the output now for the input you provided ---

00000100 iretd
00000102 lar eax,bx
00000106 ret
00000107 sldt eax
0000010b call dword far [0x33ff]
00000110 call far 0x3344:0x3344
00000115 call far 0x3344:0x3344
0000011a call far 0xaabb:0xaabb
0000011f call dword far 0x00003344:0x3344
00000127 call dword far 0xaabbccdd:0xccdd
0000012f jmp dword far [0xccdd]
00000134 jmp dword far [0xccdd]
00000139 jmp dword far 0xaabbccdd:0xccdd
00000141 jmp dword far [0xbbee]
00000146 rcl bl,cl
00000148 rcl bx,cl
0000014a rcl ebx,cl
0000014d rcr bl,cl
0000014f rcr bx,cl
00000151 rcr ebx,cl
00000154 rol bl,cl
00000156 rol bx,cl
00000158 rol ebx,cl
0000015b shl bl,cl
0000015d shl bx,cl
0000015f shl ebx,cl
00000162 sar bl,cl
00000164 sar bx,cl
00000166 sar ebx,cl
00000169 shr bl,cl
0000016b shr bx,cl
0000016d shr ebx,cl
00000170 shld word [0xbbee],ax,cl
00000175 shld ax,bx,cl
00000178 shld dword [0xbbee],eax,cl
0000017e shld eax,ebx,cl
00000182 shrd word [0xbbee],ax,cl
00000187 shrd ax,bx,cl
0000018a shrd dword [0xbbee],eax,cl
00000190 shrd eax,ebx,cl
00000194 xchg ax,bx
00000195 xchg eax,ebx
00000197 xchg ax,bx
00000198 xchg eax,ebx
0000019a xchg [0xbbee],al
0000019e xchg bl,al
000001a0 xchg [0xbbee],ax
000001a4 xchg ax,bx
000001a5 xchg [0xbbee],eax
000001aa xchg eax,ebx
000001ac xchg [0xbbee],al
000001b0 xchg bl,al
000001b2 xchg [0xbbee],ax
000001b6 xchg ax,bx
000001b7 xchg [0xbbee],eax
000001bc xchg eax,ebx
000001be cmpsb
000001bf cmpsd
000001c1 cmpsw
000001c2 insb
000001c3 insd
000001c5 insw
000001c6 lodsb
000001c7 lodsd
000001c9 lodsw
000001ca movsb
000001cb movsd
000001cd movsw
000001ce outsb
000001cf outsd
000001d1 outsw
000001d2 scasb
000001d3 scasd
000001d5 scasw
000001d6 stosb
000001d7 stosd
000001d9 stosw


Willow

From: NathanCBaker on
Your "pasta straightener" is interesting. RosAsm definitely needs
such a tool. :)

Nathan.