From: D Yuniskis on
Hi,

Does anyone have any rule-of-thumb guidelines for
the relative efficiencies of doing promotions in
the lexer vs the formal grammar? Both in terms of
time and space?

E.g., imagine:

[1-9][0-9]{2,4} return VALUE;

in the lexer vs. the same explicit definition in
the *grammar*? (i.e., using ". return (int) yylex[0]"
in the lexer)
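
To make that concrete, the two sketches I have in mind look roughly
like this (from memory, so the exact lex/yacc syntax may be a little
off; the yylval assignment is just my way of showing where the value
would get captured):

Option A, match the whole value in the lexer:

    %%
    [1-9][0-9]{2,4}    { yylval = atoi(yytext); return VALUE; }

Option B, hand every character straight to the parser:

    %%
    .                  { return (int) yytext[0]; }

...and assemble the same 3-to-5 digit value in the grammar instead:

    value   : nonzero digit digit
            | nonzero digit digit digit
            | nonzero digit digit digit digit
            ;
    nonzero : '1' | '2' | '3' | '4' | '5' | '6' | '7' | '8' | '9' ;
    digit   : '0' | nonzero ;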

Are there any easily identifiable criteria that affect
the tradeoff between the two approaches?

Is there any (significant) difference in lex/yacc
implementations wrt this issue (e.g., vs flex/bison)?

I can run some empirical tests but I am not expert enough
on either to know how to push the limits with test cases...

Thx,
--don
From: Peter Keller on
D Yuniskis <not.going.to.be(a)seen.com> wrote:
> Hi,
>
> Does anyone have any rule-of-thumb guidelines for
> the relative efficiencies of doing promotions in
> the lexer vs the formal grammar? Both in terms of
> time and space?
>
> E.g., imagine:
>
> [1-9][0-9]{2,4} return VALUE;
>
> in the lexer vs. the same explicit definition in
> the *grammar*? (i.e., using ". return (int) yytext[0];"
> in the lexer)
>
> Are there any easily identifiable criteria that affect
> the tradeoff between the two approaches?
>
> Is there any (significant) difference in lex/yacc
> implementations wrt this issue (e.g., vs flex/bison)?
>
> I can run some empirical tests but I am not expert enough
> on either to know how to push the limits with test cases...

Unless your compiler pass is compiling 24/7 and has to be real-time-like,
you absolutely shouldn't care and should write your compiler
to be the most maintainable thing ever. Speed doesn't matter for 99%
of compilation passes.

Type promotion is done either in the type-checking phase of the
compiler, after parsing, or in the code generator. It is better to
make type promotions explicit in a pass of your compiler, since then
you can warn when type promotions lose data (among other things). Of
course, all of the phases can be ad-hoc'ed together, but then you end
up with unmaintainable and fragile junk.
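
For instance, once the front end inserts explicit conversion nodes,
the check itself can be tiny. Something like this (hypothetical
names, just to show the shape of it):

    #include <stdio.h>
    #include <stddef.h>

    typedef enum { TY_CHAR, TY_SHORT, TY_INT, TY_LONG } type_t;

    static size_t type_width(type_t t)
    {
        switch (t) {
        case TY_CHAR:  return sizeof(char);
        case TY_SHORT: return sizeof(short);
        case TY_INT:   return sizeof(int);
        default:       return sizeof(long);
        }
    }

    /* Run over every implicit conversion the checker inserted,
       before code generation.                                   */
    static void check_conversion(type_t from, type_t to, int line)
    {
        if (type_width(to) < type_width(from))
            fprintf(stderr, "line %d: implicit conversion may lose data\n",
                    line);
    }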

Well, that's my opinion anyway... :)

-pete

From: Walter Banks on


D Yuniskis wrote:
>
> Hi,
>
> Does anyone have any rule-of-thumb guidelines for
> the relative efficiencies of doing promotions in
> the lexer vs the formal grammar? Both in terms of
> time and space?
>
> E.g., imagine:
>
> [1-9][0-9]{2,4} return VALUE;
>
> in the lexer vs. the same explicit definition in
> the *grammar*? (i.e., using ". return (int) yytext[0];"
> in the lexer)
>
> Are there any easily identifiable criteria that affect
> the tradeoff between the two approaches?
>
> Is there any (significant) difference in lex/yacc
> implementations wrt this issue (e.g., vs flex/bison)?
>
> I can run some empirical tests but I am not expert enough
> on either to know how to push the limits with test cases...


Parsing is such a small part of a compiler that
I would trade time differences for clarity, especially
for embedded-systems compile-once, execute-many
application environments.

Regards,


Walter..
--
Walter Banks
Byte Craft Limited
http://www.bytecraft.com

From: D Yuniskis on
Hi Peter,

Peter Keller wrote:
> D Yuniskis <not.going.to.be(a)seen.com> wrote:
>> Does anyone have any rule-of-thumb guidelines for
>> the relative efficiencies of doing promotions in
>> the lexer vs the formal grammar? Both in terms of
>> time and space?
>>
>> E.g., imagine:
>>
>> [1-9][0-9]{2,4} return VALUE;
>>
>> in the lexer vs. the same explicit definition in
>> the *grammar*? (i.e., using ". return (int) yytext[0];"
>> in the lexer)
>>
>> Are there any easily identifiable criteria that affect
>> the tradeoff between the two approaches?
>>
>> Is there any (significant) difference in lex/yacc
>> implementations wrt this issue (e.g., vs flex/bison)?
>>
>> I can run some empirical tests but I am not expert enough
>> on either to know how to push the limits with test cases...
>
> Unless your compiler pass is compiling 24/7 and has to be real-time-like,
> you absolutely shouldn't care and should write your compiler
> to be the most maintainable thing ever. Speed doesn't matter for 99%
> of compilation passes.

(sigh) Sorry, I should have clarified. I'm not using them
to write a compiler but, rather, a parser for "messages".
While speed isn't a real issue (though I wouldn't want it to
be a *pig*!), I *am* concerned with the "footprint": size,
stack penetration, etc.

As such, I am concerned with how often the machine "backtracks",
how much state it carries, etc.
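
(FWIW, I *think* flex, as opposed to classic lex, can report some
of this directly. If I'm reading the manual right, something like:

    %option backup
        /* same as -b: writes a lex.backup file listing any rules
           that force the generated scanner to back up            */
    %option verbose
        /* same as -v: prints a summary of the generated scanner's
           table sizes                                             */

...which would at least put numbers on the "how much backtracking"
question.)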

> Type promotion is done either in the type-checking phase of the
> compiler, after parsing, or in the code generator. It is better to
> make type promotions explicit in a pass of your compiler, since then
> you can warn when type promotions lose data (among other things). Of
> course, all of the phases can be ad-hoc'ed together, but then you end
> up with unmaintainable and fragile junk.
>
> Well, that's my opinion anyway... :)
From: D Yuniskis on
Hi Walter,

Walter Banks wrote:
>
> D Yuniskis wrote:
>
>> Does anyone have any rule-of-thumb guidelines for
>> the relative efficiencies of doing promotions in
>> the lexer vs the formal grammar? Both in terms of
>> time and space?
>>
>> E.g., imagine:
>>
>> [1-9][0-9]{2,4} return VALUE;
>>
>> in the lexer vs. the same explicit definition in
>> the *grammar*? (i.e., using ". return (int) yytext[0];"
>> in the lexer)
>>
>> Are there any easily identifiable criteria that affect
>> the tradeoff between the two approaches?
>>
>> Is there any (significant) difference in lex/yacc
>> implementations wrt this issue (e.g., vs flex/bison)?
>>
>> I can run some empirical tests but I am not expert enough
>> on either to know how to push the limits with test cases...
>
> Parsing is such a small part of a compiler that

Not part of a compiler -- please see my reply to Peter Keller's
post.

> I would trade time differences for clarity, especially

The point of using a formal grammar instead of an ad-hoc
implementation is exactly that -- to make the formats
of the messages parsed very obvious. And, to make it easy for
others to write similar parsers just by cutting and pasting
from an existing one.

However, since it *does* sit in a run-time loop, I can't
*ignore* runtime costs (time *and* space).

> for embedded-systems compile-once, execute-many
> application environments.