status update: x86 interpreter... [ASM]

From: Steve on 12 Nov 2009 08:04

"BGB / cr88192" <cr88192(a)hotmail.com> writes:
>
>technically, in my case the 'ASCII' versions are using UTF-8 (actually, it
>is the 'Modified UTF-8' scheme from the JVM, which is sort of the de-facto
>charset in my codebase), so the main issue would be to add code for doing
>string conversions to this library.

Hello,

Could you explain the modification, and its advantage?

Thanks

Steve N.

From: BGB / cr88192 on 12 Nov 2009 09:45

"Steve" <Bogus(a)Embarq.com> wrote in message news:hdh14b$1li$1(a)aioe.org...
> "BGB / cr88192" <cr88192(a)hotmail.com> writes:
>>
>>technically, in my case the 'ASCII' versions are using UTF-8 (actually, it
>>is the 'Modified UTF-8' scheme from the JVM, which is sort of the de-facto
>>charset in my codebase), so the main issue would be to add code for doing
>>string conversions to this library.
>
> Hello,
>
> Could you explain the modification, and its advantage?
>

main differences:
it is possible to encode literal 0 characters in a string as the bytes 0xC0
0x80 (although this is rarely used in practice, as it is problematic in
handling code...).

in standard UTF-8, characters >= 65536 are encoded directly (as 4 bytes);
in UTF-16, they are encoded via "surrogate pairs", which extend the space to
about 1.1M code points (up to 0x10FFFF).
in Modified UTF-8, characters >= 65536 are first split into a UTF-16 surrogate
pair, and each surrogate is then encoded as a 3-byte UTF-8 sequence.

the main advantage of M/UTF-8 in this case is that it is much closer to a
1:1 mapping with UTF-16 (since they are converted back and forth simply by
converting the values, rather than having to understand the more subtle
rules of each).

however, as a cost, it implies that chars >= 65536 are stored as multiple
(surrogate) characters, and it may also inflate the string somewhat if many of
these characters are used (for example, because such a character uses 6 bytes
rather than 4, ...).

note that, technically the formats are relatively compatible, and most of my
handling code can deal with both formats fairly transparently.
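
to make the byte-level difference concrete, here is a minimal C sketch of both encoders. the function names (put_unit, utf8_encode, mutf8_encode) are made up for illustration and are not from my codebase or any library; input validation (lone surrogates, values above 0x10FFFF) is left out on purpose.

```c
#include <stddef.h>
#include <stdint.h>

/* Encode one value in 0..0xFFFF as 1-3 bytes; returns the byte count.
   If 'modified' is nonzero, U+0000 is written as 0xC0 0x80 (JVM style)
   instead of a single zero byte. Hypothetical helper. */
static size_t put_unit(uint8_t *out, uint32_t u, int modified)
{
    if (u < 0x80 && !(modified && u == 0)) {
        out[0] = (uint8_t)u;
        return 1;
    }
    if (u < 0x800) {
        out[0] = (uint8_t)(0xC0 | (u >> 6));
        out[1] = (uint8_t)(0x80 | (u & 0x3F));
        return 2;
    }
    out[0] = (uint8_t)(0xE0 | (u >> 12));
    out[1] = (uint8_t)(0x80 | ((u >> 6) & 0x3F));
    out[2] = (uint8_t)(0x80 | (u & 0x3F));
    return 3;
}

/* Standard UTF-8: code points >= 0x10000 are encoded directly as 4 bytes. */
size_t utf8_encode(uint8_t *out, uint32_t cp)
{
    if (cp < 0x10000)
        return put_unit(out, cp, 0);
    out[0] = (uint8_t)(0xF0 | (cp >> 18));
    out[1] = (uint8_t)(0x80 | ((cp >> 12) & 0x3F));
    out[2] = (uint8_t)(0x80 | ((cp >> 6) & 0x3F));
    out[3] = (uint8_t)(0x80 | (cp & 0x3F));
    return 4;
}

/* Modified UTF-8: code points >= 0x10000 are split into a UTF-16
   surrogate pair, and each surrogate is encoded as 3 bytes (6 total). */
size_t mutf8_encode(uint8_t *out, uint32_t cp)
{
    if (cp < 0x10000)
        return put_unit(out, cp, 1);
    uint32_t v  = cp - 0x10000;
    uint32_t hi = 0xD800 | (v >> 10);    /* high surrogate */
    uint32_t lo = 0xDC00 | (v & 0x3FF);  /* low surrogate  */
    size_t n = put_unit(out, hi, 1);
    return n + put_unit(out + n, lo, 1);
}
```

with these, U+1F600 comes out as F0 9F 98 80 from utf8_encode and as ED A0 BD ED B8 80 from mutf8_encode, and mutf8_encode(out, 0) writes C0 80. converting M/UTF-8 to UTF-16 then only needs the 1-3 byte cases, which is the "closer to 1:1" property described above.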

> Thanks
>
> Steve N.

From: BGB / cr88192 on 12 Nov 2009 10:45

"BGB / cr88192" <cr88192(a)hotmail.com> wrote in message
news:hd9q87$6cp$1(a)news.albasani.net...
> seems I had not been telling people here about this, so, I just figured I
> would give a status update...
>

and now is another update...

>
> currently about 13 MIPS in tests (testing a simple loop doing lots of
> memory IO and bit-twiddling);
> in terms of time, the interpreter is about 76x slower than native (vs the
> same loop compiled in native code).
>

still about the same, but it is variable as "trivial" changes end up making
it faster or slower...

> the interpreter is currently pure C, and is not particularly
> micro-optimized, so this factor could still be improved some (ASM and/or
> JIT being options, but at the moment I am not considering them, since as I
> see it, having the thing work acceptably is more important than speed at
> present).
>

a few idle thoughts for how to approach JIT have occurred though, and so I
may do this eventually.
current thinking is that a piece of logic would be connected between the
decoder and interpreter, which would scan forwards, and possibly replace a
group of instructions with a single 'pseudo-instruction' which would encode
this whole group.

the external behavior would be about the same as the single instruction
case, only that the 'handler' function is actually a chunk of JITed code.
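
a rough C sketch of that idea follows. everything in it is hypothetical (the VmState and Insn layouts, the handler names, the 'span' field), and the "JITed" handler is just a hand-written C function standing in for generated code; it is only meant to show how a fused group can look like one instruction to the interpreter loop.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct VmState { uint32_t reg[8]; } VmState;   /* hypothetical */
typedef struct Insn Insn;
typedef void (*Handler)(VmState *vm, const Insn *in);

struct Insn {
    Handler  run;    /* handler called by the interpreter loop        */
    uint8_t  dst;    /* destination register index                    */
    uint32_t imm;    /* immediate operand                             */
    uint8_t  span;   /* number of decoded instructions this one covers */
};

static void op_add_imm(VmState *vm, const Insn *in)
{
    vm->reg[in->dst] += in->imm;
}

static void op_shl_imm(VmState *vm, const Insn *in)
{
    vm->reg[in->dst] <<= (in->imm & 31);   /* x86-style shift masking */
}

/* Stand-in for a JITed chunk: does "add imm; shl imm" in one call.
   The absorbed second instruction stays in place as operand data. */
static void op_fused_add_shl(VmState *vm, const Insn *in)
{
    uint32_t v = vm->reg[in[0].dst] + in[0].imm;
    vm->reg[in[0].dst] = v << (in[1].imm & 31);
}

/* Scan forward and replace matching pairs with one pseudo-instruction.
   Returns the number of groups fused. */
size_t fuse_pass(Insn *code, size_t n)
{
    size_t fused = 0;
    for (size_t i = 0; i + 1 < n; i++) {
        if (code[i].run == op_add_imm && code[i + 1].run == op_shl_imm &&
            code[i].dst == code[i + 1].dst) {
            code[i].run  = op_fused_add_shl;
            code[i].span = 2;   /* the loop will skip the absorbed entry */
            fused++;
            i++;
        }
    }
    return fused;
}

/* Interpreter loop: identical for plain and fused instructions. */
void run_code(VmState *vm, const Insn *code, size_t n)
{
    for (size_t i = 0; i < n; i += code[i].span)
        code[i].run(vm, &code[i]);
}
```

note that this sketch ignores branches: a jump into the middle of a fused group would need the group to be split again (or never fused), which a real version would have to check.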

however, this is not an immediate priority.

JNI, still not fully implemented...

>
> I am currently also working some on POSIX functionality (within the
> virtualized world). I don't expect a "complete" implementation, but
> hopefully enough for my uses.
>
> currently supported POSIX features: basic file IO, dlopen/dlsym, ...
> still lacking: more advanced file features (stat, readdir, ...), sockets,
> pthreads, ...
>

readdir has been added.

stat has not (stat would actually have to be implemented within my VFS, as
my VFS does not presently include this feature, in the general sense).
sockets is similar to stat.

granted, I could try to map it directly to the OS-level sockets (rather than
trying to add it to and route it through the VFS). this would require a
little fiddling to avoid breaking the POSIX-style convention in which files
and sockets share the same descriptor space.

pthreads, at present, would require adding a scheduler, and possibly moving
the interpreter logic into its own (OS-level) thread (may make this part
optional).

UID/GID/shell: no change.

>
> I am still using PE/COFF EXE's and DLL's (but may consider leaving off the
> '.EXE' extension, or allowing use of '.SO' for DLL's). little beyond
> inconvenience (having the means to compile code as ELF on Windows, or
> writing a loader) prevents using ELF. I "could" take a "flex" strategy and
> use whatever is more convenient (vs forcing Linux builds to use PE/COFF),
> although it can be noted that they are not strictly equivalent (and could
> also pose issues for dumb build tools).
>

I have now added a basic form of ASLR:
http://en.wikipedia.org/wiki/ASLR

at present, it mostly just jitters the base addresses of some structures
(the load base of DLLs, the stack top, ...). for the heap, it reserves a
small random-size space at the front (0-4kB), thus affecting the exact
positions of subsequent allocations.
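
as a minimal sketch of this kind of jitter in C: the names (Rng, jitter_base, heap_front_pad), the 4 kB page size, and the 16-byte rounding of the heap pad are all assumptions made for the example, and the xorshift generator is only a placeholder (a real implementation would want entropy from the host OS).

```c
#include <stdint.h>

#define PAGE_SIZE 4096u   /* assumed page granularity */

/* Placeholder PRNG (xorshift64); the state must be nonzero.
   Not suitable as a real entropy source. */
typedef struct { uint64_t s; } Rng;

static uint64_t rng_next(Rng *r)
{
    uint64_t x = r->s;
    x ^= x << 13;
    x ^= x >> 7;
    x ^= x << 17;
    return r->s = x;
}

/* Pick a page-aligned base in [lo, lo + slots * PAGE_SIZE), e.g. for the
   load base of a DLL or the stack top. 'slots' must be nonzero, and the
   caller must make sure the whole window fits in the address space. */
uint32_t jitter_base(Rng *r, uint32_t lo, uint32_t slots)
{
    return lo + (uint32_t)(rng_next(r) % slots) * PAGE_SIZE;
}

/* Size of the random gap reserved at the front of the heap:
   somewhere in 0..4kB, rounded down to 16 bytes so later
   allocations keep their alignment. */
uint32_t heap_front_pad(Rng *r)
{
    return (uint32_t)(rng_next(r) % 4096u) & ~15u;
}
```

the loader would call jitter_base once per image or stack, and the heap setup would skip heap_front_pad bytes before handing out the first block.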

I am currently trying to think up a way to do "generalized" address space
randomization without risking introducing too much fragmentation.

>
> for technical reasons, I may impose limits as to how much memory can be
> used by virtual processes (otherwise, I would need a "proper" MMU and
> support for swapping in order to work effectively on 32-bit hosts).
>

not yet addressed...

From: wolfgang kern on 12 Nov 2009 15:06

"BGB / cr88192" posted here and in AOD:

....

> and now is another update...

....

I'm still not sure that I'm able to follow/understand your ideas,
but anyway all new ideas are worth to think over at least ...

Yeah, there could be a one-and-only interpreter for all CPUs
and OSes, but this would need to shrink all source code down to a
very limited (yours? or whoever's?) functionality.

In general I like this idea, but we will easily find ourselves apart when
it comes to speed and size related to performance.

My solution for OS (version-)independent coding of addons/applications
is an OS-related script language. So whenever I upgrade my OS, the
already sold applications may gain speed, but nothing else ...

Sure, there might be new features which allow other methods,
but the customers decide to use them or keep the old method.

Please give us an answer (perhaps first to yourself) what kind of
programming style you have in mind here [fast?/smart?/short?/easy?].

a common saying:
We can't get everything, and for sure not all at the very same time!

my personal preference [very well accepted by my clients] is
[fast/smart/short and transparent], and as a matter of fact:
my clients never care (never had to care) about "easy to program".
__
wolfgang

From: BGB / cr88192 on 12 Nov 2009 17:49

"wolfgang kern" <nowhere(a)never.at> wrote in message
news:hdhps3$2iq$1(a)newsreader2.utanet.at...
>
> "BGB / cr88192" posted here and in AOD:
>
> ...
>
>> and now is another update...
>
> ...
>
> I'm still not sure that I'm able to follow/understand your ideas,
> but anyway all new ideas are worth to think over at least ...
>
> Yeah, there could be a one-and-only interpreter for all CPUs
> and OSes, but this would need to shrink all source code down to a
> very limited (yours? or whoever's?) functionality.
>

this interpreter is not intended to replace native code...

it is no more intended to replace the existence of native code and OS's than
DOSBox is to replace Windows...

Windows serves one role, DOSBox another, and my project would serve a
different role than either...

the goal is also different from that of the JVM or .NET VM (which aim to
replace one world with another...).

the role then is to do what my prior compiler has aimed to do:
to allow application extensions and scripts.

the two differ in the types of extensions and the means of implementing
them: the compiler implemented extensions in the form of code running
directly in the host address space, whereas the interpreter will run them
essentially in sandboxes (hence the idea of using a VFS as opposed to
directly using the host filesystem, ...).

in particular, I am considering the possibility of "untrusted" extensions.

similarly, there are cases where pre-compiled DLLs may be a convenient means
of distributing code (plugins, scripts, ...), ...

> In general I like this idea, but we will easily find ourselves apart when
> it comes to speed and size related to performance.
>

if I don't replace the native OS, it is not as much of a worry.
scripts are slower than native, granted, but I will assume here that most
performance critical parts of an app are likely to be written in native
code, and so the role of scripts is not to implement an entire app's
functionality.

the app will instead plug some of the interpreter's functionality into its
own provided backends (or, as is, they default to trying to use my compiler
framework's facilities).

> My solution for OS (version-)independent coding of addons/applications
> is an OS-related script language. So whenever I upgrade my OS, the
> already sold applications may gain speed, but nothing else ...
>
> Sure, there might be new features which allow other methods,
> but the customers decide to use them or keep the old method.
>
> Please give us an answer (perhaps first to yourself) what kind of
> programming style you have in mind here [fast?/smart?/short?/easy?].
>

I don't know...
I just figured people would use C for both the host app, and for any
extensions.

so, the idea is to hopefully allow code to be moved fairly easily and
transparently between native-land and the VM world.

it also helps here if many common facilities are available in both cases,
and if many of the facilities existing in the interpreted world will be
familiar (however, virtualized and potentially sandboxed).

I would also like it if most facilities can be available at similar "levels
of abstraction".
using a language like Java or C# should not require a traditional VM, as
the VM should be optional (for example, we could compile the Java directly
to native code, ...).

one's choice to use C should not prevent them from being able to use things
such as operating within a virtualized or sandboxed environment if needed,
or being able to use "eval", ...

C and x86 do not require retooling, and they don't require a fundamental
change in coding practices or mindset.

we already have a VM with a long and proven track record: x86...

> a common saying:
> We can't get everything, and for sure not all at the very same time!
>
> my personal preference [very well accepted by my clients] is
> [fast/smart/short and transparent], and as a matter of fact:
> my clients never care (never had to care) about "easy to program".

ok.

> __
> wolfgang
>
>
