From: Rod Pemberton on
"BGB / cr88192" <cr88192(a)hotmail.com> wrote in message
news:hmbib7$tlt$1(a)news.albasani.net...
>
> sadly, I don't really know what actually "good" ASM macro systems do (mine
> is, essentially, just a hacked-over C preprocessor...).
>

Nor I. I really like NASM's macro processor. IIRC, it was derived from a C
preprocessor.

> note that the primary usage of my assembler is as a library for use within
> a (mostly) C app for dynamic code generation (typically JIT and dynamic
> thunk writing...). the reason I might want to use macros to gloss over arch
> issues is that, as is, I am ending up with lots of special-purpose
> code-writers which need to be tweaked for each target, and reducing the
> needed level of tweaking could be convenient...

You might inventory the functionality of a few assemblers: GAS, NASM, MASM,
etc. What is common among them is likely to be needed.

> the main limitation is that to be useful for my purposes, the overall
> performance overhead (of compilation/assembly) has to be kept fairly low

Hmm... That is a perpetual challenge, isn't it?

I think just about everyone here and on alt.os.development has run across
this problem. How does one keep performance high in assemblers, C
compilers, OS development tools that parse, while keeping the complexity,
size, and implementation time low?

> (invoking my C compiler is far too slow...), and it is also needed to be
> able to retain the level of control ASM offers for many of these thunks to
> work (many do fairly low-level/specialized tasks, often outside the reach
> of plain C...).

Ok. I suspect that'll add some complexity.

> so, I went and added a few features to the preprocessor for my assembler:
> "virtual headers", or the feature described before;
> scoping-levels and "local defines";
> multi-line macros can contain expansion-time directives;

Ok. That's way beyond what I thought you were describing.

> if and how this can gloss over x86 vs x86-64 differences, I don't know...

Conditionals... :)


Rod Pemberton




From: BGB / cr88192 on

"Rod Pemberton" <do_not_have(a)havenone.cmm> wrote in message
news:hmhaur$4lm$1(a)speranza.aioe.org...
> "BGB / cr88192" <cr88192(a)hotmail.com> wrote in message
> news:hmbib7$tlt$1(a)news.albasani.net...
>>
>> sadly, I don't really know what actually "good" ASM macro systems do (mine
>> is, essentially, just a hacked-over C preprocessor...).
>>
>
> Nor I. I really like NASM's macro processor. IIRC, it was derived from a C
> preprocessor.
>

yeah...
some things in the preprocessor I am using were made a little more like the
one in NASM, such as allowing '%' to introduce directives, ...

however, at the level of things like macros, or some of my newer features,
they differ a bit...


>> note that the primary usage of my assembler is as a library for use within
>> a (mostly) C app for dynamic code generation (typically JIT and dynamic
>> thunk writing...). the reason I might want to use macros to gloss over arch
>> issues is that, as is, I am ending up with lots of special-purpose
>> code-writers which need to be tweaked for each target, and reducing the
>> needed level of tweaking could be convenient...
>
> You might inventory the functionality of a few assemblers: GAS, NASM, MASM,
> etc. What is common among them is likely to be needed.
>

GAS, AFAIK, doesn't have a preprocessor.
NASM has one, I am aware of, and although mine differs some, I don't think I
am missing that much.

for MASM, I wasn't able to easily find much info, although I didn't search
that much either.


>> the main limitation is that to be useful for my purposes, the overall
>> performance overhead (of compilation/assembly) has to be kept fairly low
>
> Hmm... That is a perpetual challenge, isn't it?
>
> I think just about everyone here and on alt.os.development has run across
> this problem. How does one keep performance high in assemblers, C
> compilers, OS development tools that parse, while keeping the complexity,
> size, and implementation time low?
>

yeah.

well, mostly the problem is that for these uses I often end up needing to
generate code at the last moment, often in regards to handling a particular
piece of data or performing a specialized task...

if an assembler can deal with the problem in somewhere around 500ns to 1us,
this is much better than a C compiler taking 1ms to 50ms, or if one happens
to be using headers, easily 750ms-1500ms or more...

so, the C compiler is fast enough for scripts and shaders, but not really
for things like special-purpose code generation...


the assembler, in general, can't be reasonably made that much faster
(post-preprocessing, it fairly directly transcribes the text into machine
code, and in general the dynamic linker is also fairly fast).

both internally use hash-tables in a number of spots to speed lookups, ...
the current preprocessor, oddly, doesn't use hashes, I suspect because it is
older than my current C preprocessor (I suspect it was a partial rewrite of
the preprocessor from a prior version of my C frontend, and was then later
reused as the basis of another C preprocessor for another partly rewritten C
frontend...).

I guess I could add hashes here though, as otherwise there is a risk that
the preprocessor is eating cycles, but I haven't really tested it (not
exactly done tests involving assembling high volumes of ASM to see where
bottlenecks are...).
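
for illustration, a minimal sketch of what hashing the define lookups might
look like. this is not the actual preprocessor code: the names (PPDefine,
pp_define, pp_lookup), the table size, and the choice of an FNV-1a hash with
chained buckets are all assumptions made up for the example. it uses a global
table, since the preprocessor is described as using globals for now.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define PP_HASH_SIZE 256   /* power of two, so a mask can replace a modulo */

/* One "define name value" entry (hypothetical layout). */
typedef struct PPDefine {
    struct PPDefine *next;   /* next entry in the same bucket */
    char *name;
    char *value;
} PPDefine;

static PPDefine *pp_buckets[PP_HASH_SIZE];

/* FNV-1a string hash, folded down to a bucket index. */
static unsigned pp_hash(const char *s)
{
    unsigned h = 2166136261u;
    while (*s) {
        h ^= (unsigned char)*s++;
        h *= 16777619u;
    }
    return h & (PP_HASH_SIZE - 1);
}

/* Heap copy of a string (strdup is not part of ISO C before C23). */
static char *pp_strdup(const char *s)
{
    size_t n = strlen(s) + 1;
    char *t = malloc(n);
    assert(t != NULL);
    memcpy(t, s, n);
    return t;
}

/* Returns the entry for name, or NULL if it is not defined. */
PPDefine *pp_lookup(const char *name)
{
    PPDefine *d;
    assert(name != NULL);
    for (d = pp_buckets[pp_hash(name)]; d != NULL; d = d->next)
        if (strcmp(d->name, name) == 0)
            return d;
    return NULL;
}

/* Adds name, or replaces its value if it is already defined. */
void pp_define(const char *name, const char *value)
{
    PPDefine *d = pp_lookup(name);
    unsigned h;

    if (d != NULL) {
        free(d->value);
        d->value = pp_strdup(value);
        return;
    }
    d = malloc(sizeof *d);
    assert(d != NULL);
    d->name = pp_strdup(name);
    d->value = pp_strdup(value);
    h = pp_hash(name);
    d->next = pp_buckets[h];
    pp_buckets[h] = d;
}
```

with only a handful of defines a linear list is likely just as fast, so this
would only pay off if profiling actually shows the lookups to be a bottleneck.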


I have benchmarked the overall process when compiling C, and usually most of
the running time goes into parsing-related tasks (such as tokenizing), but
there is not much of a reasonable way to speed this up more.

one possible way, although it would involve a very different parser
structure, would be to split parsing and lexing, such that internally tokens
can be hashed into integers and I can use table-based logic in many places.

however, this is a bit of a price to pay in the name of parser speed.
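
a rough sketch of the "tokens hashed into integers" idea, under stated
assumptions: the lexer hands each token's text to an interning function, which
returns a small integer id, and equal spellings always get the same id. the
names (tok_intern, tok_name) and the fixed table sizes are hypothetical and
not taken from any existing parser.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define TOK_HASH_SIZE 1024   /* power of two */
#define TOK_MAX       4096   /* maximum number of distinct spellings */

static char *tok_names[TOK_MAX];          /* id -> spelling */
static int   tok_chain[TOK_MAX];          /* id -> next id in bucket, or -1 */
static int   tok_bucket[TOK_HASH_SIZE];   /* bucket -> first id, or -1 */
static int   tok_count;
static int   tok_ready;

/* FNV-1a over an explicit length, since token text is usually a slice
   of the input buffer rather than a NUL-terminated string. */
static unsigned tok_hash(const char *s, size_t len)
{
    unsigned h = 2166136261u;
    size_t i;
    for (i = 0; i < len; i++) {
        h ^= (unsigned char)s[i];
        h *= 16777619u;
    }
    return h & (TOK_HASH_SIZE - 1);
}

/* Returns the id for the token text s[0..len), creating one if needed. */
int tok_intern(const char *s, size_t len)
{
    unsigned h;
    int id, i;

    assert(s != NULL);
    if (!tok_ready) {
        for (i = 0; i < TOK_HASH_SIZE; i++)
            tok_bucket[i] = -1;
        tok_ready = 1;
    }
    h = tok_hash(s, len);
    for (id = tok_bucket[h]; id >= 0; id = tok_chain[id])
        if (strlen(tok_names[id]) == len &&
            memcmp(tok_names[id], s, len) == 0)
            return id;

    assert(tok_count < TOK_MAX);
    id = tok_count++;
    tok_names[id] = malloc(len + 1);
    assert(tok_names[id] != NULL);
    memcpy(tok_names[id], s, len);
    tok_names[id][len] = '\0';
    tok_chain[id] = tok_bucket[h];
    tok_bucket[h] = id;
    return id;
}

/* Maps an id back to its spelling, e.g. for error messages. */
const char *tok_name(int id)
{
    assert(id >= 0 && id < tok_count);
    return tok_names[id];
}
```

once tokens are integers, the parser can compare ids instead of strings, and
interning the keywords and mnemonics first gives them known small ids that can
index a table or drive a switch.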


>> (invoking my C compiler is far too slow...), and it is also needed to be
>> able to retain the level of control ASM offers for many of these thunks to
>> work (many do fairly low-level/specialized tasks, often outside the reach
>> of plain C...).
>
> Ok. I suspect that'll add some complexity.
>

yep.

one example of this is:
"ok, here is a signature string, now generate a stub which will call this
here function pointer with these arguments supplied in a buffer and a 'this'
pointer supplied in another argument".

so, a common use is in the creation of thunks to implement the "apply"
operation, ...
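
to make the "apply" case more concrete, here is a simplified sketch of a
code-writer that turns a signature string into NASM-style text for such a
stub. everything specific in it is an assumption for the example: the
signature letters ('i' = 32-bit int, 'p' = pointer, 'd' = double), the stub's
own argument layout (args buffer at [ebp+8], function pointer at [ebp+12]),
and 32-bit x86 cdecl as the target. it leaves out the 'this' pointer and any
stack alignment handling, and is not the actual thunk generator.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

#define SIG_MAX_ARGS 32

/* Bounded text buffer that the stub source is written into. */
typedef struct {
    char  *buf;
    size_t size, len;
    int    overflow;
} Out;

/* Appends formatted text; sets the overflow flag instead of truncating. */
static void out_printf(Out *o, const char *fmt, ...)
{
    va_list ap;
    int n;

    if (o->overflow)
        return;
    va_start(ap, fmt);
    n = vsnprintf(o->buf + o->len, o->size - o->len, fmt, ap);
    va_end(ap);
    if (n < 0 || (size_t)n >= o->size - o->len)
        o->overflow = 1;
    else
        o->len += (size_t)n;
}

/* Size in bytes of one argument on 32-bit x86, or -1 if unknown. */
static int sig_arg_size(char c)
{
    switch (c) {
    case 'i': case 'p': return 4;
    case 'd':           return 8;
    default:            return -1;
    }
}

/* Writes assembly text for a stub "name(args, fn)" that pushes the packed
   arguments from the buffer and calls fn; fn's return value passes through.
   Returns the text length, or -1 on a bad signature or a full buffer. */
int emit_apply_stub(char *out, size_t outsz, const char *name, const char *sig)
{
    Out o;
    int offs[SIG_MAX_ARGS];
    int n, i, sz, total = 0;

    assert(out != NULL && outsz > 0 && name != NULL && sig != NULL);
    if (strlen(sig) > SIG_MAX_ARGS)
        return -1;
    n = (int)strlen(sig);
    for (i = 0; i < n; i++) {            /* offsets within the args buffer */
        sz = sig_arg_size(sig[i]);
        if (sz < 0)
            return -1;
        offs[i] = total;
        total += sz;
    }

    o.buf = out; o.size = outsz; o.len = 0; o.overflow = 0;
    out_printf(&o, "%s:\n", name);
    out_printf(&o, "    push ebp\n    mov ebp, esp\n");
    out_printf(&o, "    mov eax, [ebp+8]\n");       /* packed args buffer */
    for (i = n - 1; i >= 0; i--) {                   /* cdecl: last arg first */
        if (sig_arg_size(sig[i]) == 8)               /* high dword goes first */
            out_printf(&o, "    push dword [eax+%d]\n", offs[i] + 4);
        out_printf(&o, "    push dword [eax+%d]\n", offs[i]);
    }
    out_printf(&o, "    call dword [ebp+12]\n");    /* the function pointer */
    if (total > 0)
        out_printf(&o, "    add esp, %d\n", total); /* caller cleans up */
    out_printf(&o, "    pop ebp\n    ret\n");
    return o.overflow ? -1 : (int)o.len;
}
```

calling emit_apply_stub(buf, sizeof buf, "apply_id", "id") would produce the
text for one such stub, which would then be handed to the assembler. an
x86-64 version would need a different body (register arguments), which is
exactly the kind of per-target tweaking mentioned earlier.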

other uses include patching from the dynamic linker into TLS
(thread-local-storage), internal thunks generated within the
Class/Instance-OO machinery, ...


>> so, I went and added a few features to the preprocessor for my assembler:
>> "virtual headers", or the feature described before;
>> scoping-levels and "local defines";
>> multi-line macros can contain expansion-time directives;
>
> Ok. That's way beyond what I thought you were describing.
>

yeah.

the idea behind virtual headers is mostly so that front-end code generators
can tell the assembler about commonly reused globs of macros, such that they
can generate a thunk which references them (rather than either needing an
external include file or needing to include all the macros as a
prologue...).
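
as a sketch of the mechanism (assumed, not the real implementation): a
virtual header can be little more than a name-to-text registry that the
include handling consults before touching the file system. the names
(vhdr_register, vhdr_find), the fixed-size table, and the example macro text
are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define VHDR_MAX 64

/* An in-memory "file": a name the ASM source can include, plus its text. */
typedef struct {
    const char *name;
    const char *text;
} VirtualHeader;

static VirtualHeader vhdr_table[VHDR_MAX];
static int vhdr_count;

/* Registers a header, or replaces the text of an existing one.
   The strings are not copied, so they must outlive the registry.
   Returns 0 on success, -1 if the table is full. */
int vhdr_register(const char *name, const char *text)
{
    int i;

    assert(name != NULL && text != NULL);
    for (i = 0; i < vhdr_count; i++) {
        if (strcmp(vhdr_table[i].name, name) == 0) {
            vhdr_table[i].text = text;
            return 0;
        }
    }
    if (vhdr_count >= VHDR_MAX)
        return -1;
    vhdr_table[vhdr_count].name = name;
    vhdr_table[vhdr_count].text = text;
    vhdr_count++;
    return 0;
}

/* Called by the include handling first; NULL means "not a virtual
   header", so the caller would fall back to a real file. */
const char *vhdr_find(const char *name)
{
    int i;

    assert(name != NULL);
    for (i = 0; i < vhdr_count; i++)
        if (strcmp(vhdr_table[i].name, name) == 0)
            return vhdr_table[i].text;
    return NULL;
}
```

a front-end would register its macro glob once at startup, and every
generated thunk could then begin with a one-line include of that name instead
of repeating the macros.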


>> if and how this can gloss over x86 vs x86-64 differences, I don't know...
>
> Conditionals... :)
>

using "#ifdef" is something I have been doing for a while now, but I had
been thinking of the possibility of more "refined" functionality...


however, apart from introducing compiler-like functionality into the
preprocessor, no good way to pull this off comes to mind...


admittedly, this is partly why I had considered the possibility of allowing C
code to register callbacks into the preprocessor, mostly so that it "could"
allow for this sort of functionality to be added externally. but, as noted,
this would mean either exposing some of the preprocessor internals, or
introducing an API, neither of which is ideal.

ppenv->iface->LookupDef(ppenv, "foo");
or: basmppLookupDef(ppenv, "foo"); //if I make a proper API
or: BASM_PP_LookupDefine(ppenv, "foo"); //expose internals, some impl cleanup
or: BASM_PP_LookupDefine("foo"); //expose internals, present impl

(yes, sadly, the preprocessor uses globals for now, whereas my C-frontends'
preprocessor uses a context structure...).
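
a minimal sketch of what the callback side of such an API could look like,
assuming the context-structure variant. none of these names (PPContext,
pp_register_directive, pp_run_directive, the "ptrsize" directive) exist in
the current code; they are invented here to show the shape of the idea.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define PP_MAX_CALLBACKS 16

typedef struct PPContext PPContext;

/* A directive handler: receives the rest of the directive line in args and
   writes replacement text to out. Returns 0 on success, -1 on error. */
typedef int (*PPDirectiveFn)(PPContext *ctx, const char *args,
                             char *out, size_t outsz);

struct PPContext {
    struct {
        const char   *name;
        PPDirectiveFn fn;
    } cb[PP_MAX_CALLBACKS];
    int ncb;
    int is_x64;   /* example of target info a callback might consult */
};

void pp_init(PPContext *ctx, int is_x64)
{
    assert(ctx != NULL);
    ctx->ncb = 0;
    ctx->is_x64 = is_x64;
}

/* Lets client C code add a directive. Returns 0, or -1 if the table is full. */
int pp_register_directive(PPContext *ctx, const char *name, PPDirectiveFn fn)
{
    assert(ctx != NULL && name != NULL && fn != NULL);
    if (ctx->ncb >= PP_MAX_CALLBACKS)
        return -1;
    ctx->cb[ctx->ncb].name = name;
    ctx->cb[ctx->ncb].fn = fn;
    ctx->ncb++;
    return 0;
}

/* What the preprocessor would call on meeting an unknown directive.
   Returns the handler's result, or -1 if nothing is registered for name. */
int pp_run_directive(PPContext *ctx, const char *name, const char *args,
                     char *out, size_t outsz)
{
    int i;

    assert(ctx != NULL && name != NULL);
    for (i = 0; i < ctx->ncb; i++)
        if (strcmp(ctx->cb[i].name, name) == 0)
            return ctx->cb[i].fn(ctx, args, out, outsz);
    return -1;
}

/* Example handler: expands to the target's pointer size in bytes. */
int demo_ptrsize(PPContext *ctx, const char *args, char *out, size_t outsz)
{
    int n;

    (void)args;   /* this directive takes no arguments */
    n = snprintf(out, outsz, "%d", ctx->is_x64 ? 8 : 4);
    return (n < 0 || (size_t)n >= outsz) ? -1 : 0;
}
```

this keeps the semantic logic in client code and out of the preprocessor
itself, at the cost of an extra table search on every unknown directive.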


but, stuff like this would almost imply making the preprocessor its own
standalone component (rather than a piece of code copied here and there and
adapted some to the particular data being worked with), as well as probably
adding flags to indicate which features to enable or disable (a few things
are done for ASM which would not be valid with C, ...).

or, even if not its own standalone component, it is at least partly split
off from the assembler and given an API (so that it can be used by itself).

but, even then, adding too much could either make a mess (the preprocessor
is not exactly a clean piece of code), or risk compromising speed (and, as
well, a textual preprocessor seems an odd place to try to bolt on clearly
semantic features...).

so, it is all uncertain...


(much as is my current choice between Public Domain, MIT, or BSD licensing,
me having broken free of GPL now for all this and having little intention to
return...).


or such...