a bytecode idea spec... [General Programming]

Prev: need ur help for my Masters project(TASM project)
Next: ANN: Seed7 Release 2010-07-04

From: Jacko on 3 Jul 2010 18:45

Just looked.

My VM spec's at http://acebforth.googlecode.com

Just 23 opcodes.

Cheers Jacko

From: BGB / cr88192 on 4 Jul 2010 12:16

"Jacko" <jackokring(a)gmail.com> wrote in message
news:974bdba6-f158-4ab5-85eb-72ef53e745c4(a)k39g2000yqd.googlegroups.com...
> Just looked.
>
> My VM spec's at http://acebforth.googlecode.com
>
> Just 23 opcodes.
>

some of my main bytecodes tend to include a bit more...

but, OTOH, handling statically typed C-family languages tends to result in a
lot of opcodes.

checking:
RPNIL has 227 opcodes, but spread over a number range of around 1136 (the
space is a little larger as old opcodes may have been removed, some areas of
free-space are left for organization reasons, ...).

another (interpreter) has 204, spread over a space of 294.

but, yeah, the nifty point would be having a bytecode with a non-fixed
opcode assignment, mostly so that it can be used with different interpreters
or JIT machinery, without me having to force them all into using exactly the
same numbering, ...

granted, if one wants code to work between interpreters, basic opcode names
and semantics would need to be defined, such as "dup" or "swap"/"exch", ...
only, it will not matter where they are in the opcode table, as this matter
can be left up to the interpreter or JIT.

it also avoids requiring me to share opcode tables between different
libraries, which IMO goes against modularity, while at the same time not
requiring textual serialization.
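
A minimal sketch of the non-fixed numbering idea above, assuming the module carries a table of opcode names and the code stream refers to them by module-local index. The opcode names, both interpreter tables, and the module layout are made up for illustration; they are not taken from RPNIL or any real interpreter.

```python
# Hypothetical sketch: a module stores opcode *names* once, and its code
# stream refers to them by module-local index. Each interpreter remaps
# those indices to its own private numbering at load time.

# Two interpreters with different (made-up) opcode numberings.
INTERP_A = {"push": 0, "dup": 1, "swap": 2, "add": 3}
INTERP_B = {"add": 10, "swap": 20, "dup": 30, "push": 40}

# Opcodes that carry one immediate operand in this toy format.
HAS_OPERAND = {"push"}


def load(module, interp_table):
    """Translate module-local opcode indices into interpreter-local ones."""
    names = module["opnames"]
    # Build the remap table once per module; unknown names fail at load time.
    remap = []
    for name in names:
        if name not in interp_table:
            raise ValueError(f"interpreter lacks opcode {name!r}")
        remap.append(interp_table[name])
    code, out, i = module["code"], [], 0
    while i < len(code):
        local = code[i]
        out.append(remap[local])
        i += 1
        if names[local] in HAS_OPERAND:
            out.append(code[i])  # copy the immediate through unchanged
            i += 1
    return out


# Module-local numbering: 0=push, 1=dup, 2=add. Program: push 21; dup; add
module = {"opnames": ["push", "dup", "add"], "code": [0, 21, 1, 2]}

code_a = load(module, INTERP_A)
code_b = load(module, INTERP_B)
```

The same module loads into both interpreters without either one knowing the other's table; only the names and their semantics have to be agreed on.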

From: Mike Austin on 4 Jul 2010 18:28

BGB / cr88192 wrote:
> "Jacko" <jackokring(a)gmail.com> wrote in message
> news:974bdba6-f158-4ab5-85eb-72ef53e745c4(a)k39g2000yqd.googlegroups.com...
>> Just looked.
>>
>> My VM spec's at http://acebforth.googlecode.com
>>
>> Just 23 opcodes.
>>

A Tree-Based Alternative to Java Byte-Codes
http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.26.2124&rep=rep1&type=pdf

0 opcodes :)

But on a more serious note, a tree can be plenty fast when implemented
efficiently. It doesn't physically have to be a tree.
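
As a rough illustration of "doesn't physically have to be a tree": the sketch below stores a toy expression tree as one flat preorder list and evaluates it with a cursor instead of following node pointers. The node tags and layout are assumptions of this sketch and are not the encoding used in the linked paper.

```python
# Toy expression tree kept as a flat preorder list rather than linked nodes.
# Layout assumed here: "const" is followed by its value; "add" and "mul"
# are each followed by two complete child subtrees.

def flatten(tree):
    """Serialize a nested-tuple tree into a flat preorder list."""
    tag = tree[0]
    if tag == "const":
        return ["const", tree[1]]
    return [tag] + flatten(tree[1]) + flatten(tree[2])


def evaluate(flat, pos=0):
    """Evaluate the subtree starting at flat[pos]; return (value, next_pos)."""
    tag = flat[pos]
    if tag == "const":
        return flat[pos + 1], pos + 2
    left, pos = evaluate(flat, pos + 1)   # first child starts right after the tag
    right, pos = evaluate(flat, pos)      # second child starts where the first ended
    if tag == "add":
        return left + right, pos
    if tag == "mul":
        return left * right, pos
    raise ValueError(f"unknown node tag {tag!r}")


# 2 + (3 * 4)
tree = ("add", ("const", 2), ("mul", ("const", 3), ("const", 4)))
flat = flatten(tree)
value, end = evaluate(flat)
```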

Mike


From: Jacko on 4 Jul 2010 21:10

On 4 July, 23:28, Mike Austin <m...(a)mike-nospam-austin.com> wrote:
> BGB / cr88192 wrote:
> > "Jacko" <jackokr...(a)gmail.com> wrote in message
> >news:974bdba6-f158-4ab5-85eb-72ef53e745c4(a)k39g2000yqd.googlegroups.com....
> >> Just looked.
>
> >> My VM spec's at http://acebforth.googlecode.com
>
> >> Just 23 opcodes.
>
> A Tree-Based Alternative to Java Byte-Codes
> http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.26.2124&rep=...
>
> 0 opcodes :)
>
> But on a more serious note, a tree can be plenty fast when implemented
> efficiently. It doesn't physically have to be a tree.
>
> Mike
>
>
>
> > some of my main bytecodes tend to include a bit more...
>
> > but, OTOH, handling statically typed C-family languages tends to result in a
> > lot of opcodes.
>
> > checking:
> > RPNIL has 227 opcodes, but spread over a number range of around 1136 (the
> > space is a little larger as old opcodes may have been removed, some areas of
> > free-space are left for organization reasons, ...).
>
> > another (interpreter) has 204, spread over a space of 294.
>
> > but, yeah, the nifty point would be having a bytecode with a non-fixed
> > opcode assignment, mostly so that it can be used with different interpreters
> > or JIT machinery, without me having to force them all into using exactly the
> > same numbering, ...
>
> > granted, if one wants code to work between interpreters, basic opcode names
> > and semantics would need to be defined, such as "dup" or "swap"/"exch", ...
> > only, it will not matter where they are in the opcode table, as this matter
> > can be left up to the interpreter or JIT.
>
> > it also avoids requiring me to share opcode tables between different
> > libraries, which IMO goes against modularity, while at the same time not
> > requiring textual serialization.

Nice paper, though it could give more detail on how the tree is
organized as bit patterns. I just uploaded my latest optimization and
fixes to the (COMP) instruction. Still just 23 opcodes, with (COMP)
being the most complicated. It allows compiler subroutines to be
created while keeping decompilation easy, in effect inlining the
compiler within the runtime. This is part of the incremental
compilation nature of FORTH. Another way to put it: interactive
testing and editing, while storing the text in compiled form for later
edits. This is the only language-specific opcode.

The automatic subroutine return from an opcode seems wasteful in the
context of inlining opcodes with threading addresses. But prefixing a
zero identifier to each opcode, and making each opcode its own
subroutine with an address, does allow bootstrapping and a simple zero
test for opcode entry. This is quicker than the double indirection of
indirect threading, because indirect threading has to cache an indirect
code field pointer for each routine in use, whereas the compare #0,
branch #prim sequence is only a few bytes. Subroutine threading would
roughly double I-cache usage, and so would be slowest.
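
One possible reading of the zero-prefix scheme, as a toy model only. It assumes a word whose first cell is 0 is a primitive (the next cell names it) and any other word is a thread of word addresses. The primitive ids, the memory layout and the explicit EXIT word are inventions of this sketch, not the actual VM spec, and a Python model says nothing about the cache behaviour discussed above.

```python
# Primitive ids (made up for this sketch).
LIT, ADD, DUP, EXIT, HALT = range(1, 6)

mem = [
    -1,            # 0: reserved so that no word lives at address 0
    0, LIT,        # 1: primitive LIT
    0, ADD,        # 3: primitive ADD
    0, DUP,        # 5: primitive DUP
    0, EXIT,       # 7: primitive EXIT
    0, HALT,       # 9: primitive HALT
    5, 3, 7,       # 11: DOUBLE = DUP ADD EXIT
    1, 21, 11, 9,  # 14: MAIN = LIT 21 DOUBLE HALT
]


def run(mem, entry):
    """Run the thread starting at `entry`; return the data stack at HALT."""
    data, rstack = [], []
    ip = entry
    while True:
        target = mem[ip]      # address of the next word in the thread
        ip += 1
        if mem[target] != 0:  # the single zero test: non-zero means threaded word
            rstack.append(ip)
            ip = target
            continue
        prim = mem[target + 1]
        if prim == LIT:
            data.append(mem[ip])  # inline literal follows in the thread
            ip += 1
        elif prim == ADD:
            b = data.pop()
            a = data.pop()
            data.append(a + b)
        elif prim == DUP:
            data.append(data[-1])
        elif prim == EXIT:
            ip = rstack.pop()     # return to the calling thread
        elif prim == HALT:
            return data
        else:
            raise ValueError(f"unknown primitive {prim}")


result = run(mem, 14)
```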

This is why languages such as FORTH can be faster than expected.

Cheers Jacko

From: Jacko on 4 Jul 2010 21:25

And that empty slot in interface definitions is for pointing to the
start of the ordered interface function pointer set. The method
pointer tables are one per class, so invokeX involves a lot of pointer
dereferencing. Each bytecode should be replaced/mirrored (a 1-byte to
4-byte mapping of the array) by a pointer to a function. Then the whole
Java thing is direct threaded code. Many bytecodes may map to the same
function, such as many of the return bytecodes.
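
A small sketch of that byte-to-pointer mirroring, assuming a made-up four-opcode stack machine (these are not real JVM encodings, and Python function references stand in for the 4-byte pointers). Two return opcodes share one handler to show the many-to-one mapping.

```python
# Made-up one-byte opcodes for a toy stack machine (not real JVM encodings).
OP_PUSH1, OP_ADD, OP_IRETURN, OP_LRETURN = 0, 1, 2, 3


def do_push1(vm):
    vm["stack"].append(1)


def do_add(vm):
    b = vm["stack"].pop()
    a = vm["stack"].pop()
    vm["stack"].append(a + b)


def do_return(vm):
    vm["result"] = vm["stack"].pop()
    vm["running"] = False


# Both return opcodes map to the same handler.
HANDLERS = [do_push1, do_add, do_return, do_return]


def thread(bytecode):
    """Mirror each opcode byte with a direct reference to its handler."""
    return [HANDLERS[b] for b in bytecode]


def run(threaded):
    """Execute pre-threaded code: no opcode decoding inside the loop."""
    vm = {"stack": [], "result": None, "running": True}
    pc = 0
    while vm["running"]:
        handler = threaded[pc]
        pc += 1
        handler(vm)
    return vm["result"]


program = bytes([OP_PUSH1, OP_PUSH1, OP_ADD, OP_IRETURN])
threaded = thread(program)
result = run(threaded)
```

Opcodes with inline operands would need the translation step to carry the operands across as well; that is left out here.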

Tableswitches could possibly be optimized with range arrays of
function/jump pointers. At worst, a tree of byte-indexed arrays should
make for a fastish switch (max 4 pointer lookups) for unsightly
mis-enumerations of switched items.
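
The tree of byte-indexed arrays could look roughly like this, assuming 32-bit case values split into four bytes, most significant first. The function names and the string jump targets are placeholders for this sketch.

```python
def key_bytes(key):
    """Split a 32-bit case value into 4 bytes (two's complement for negatives)."""
    return (key & 0xFFFFFFFF).to_bytes(4, "big")


def build_switch(cases):
    """Build a 4-level tree of 256-entry arrays from {case_value: target}."""
    root = [None] * 256
    for key, target in cases.items():
        node = root
        parts = key_bytes(key)
        for b in parts[:3]:
            if node[b] is None:
                node[b] = [None] * 256  # allocate inner arrays only where needed
            node = node[b]
        node[parts[3]] = target
    return root


def dispatch(root, key, default):
    """Look up a case with at most 4 array indexings."""
    node = root
    for b in key_bytes(key):
        node = node[b]
        if node is None:
            return default
    return node


# Sparse, badly spread case values where a dense jump table would be huge.
cases = {1: "one", 1000: "thousand", 0x7FFFFFFF: "max", -1: "minus one"}
root = build_switch(cases)
hit = dispatch(root, 1000, "default")
miss = dispatch(root, 2, "default")
```

Memory cost depends on how many distinct byte prefixes the case values have, so this suits sparse switches better than dense ones, where a plain range array is simpler.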

Cheers Jacko
