From: Adam Kellas on
On Mar 2, 2:23 pm, "Uri Guttman" <u...(a)StemSystems.com> wrote:
> you would have to post all the code and some data and hopefully that
> would exhibit the problem. calling brk() means you are doing serious
> data allocation which shouldn't happen in this style of code. maybe you
> are doing something else that you didn't tell us. s/// ops on a string
> may add some ram needs but not insane amounts. massive unnecessary ram
> needs are usually leaks, either a rare perl bug or some poorly designed
> data structure that leaks.

For the record, this rewrite seems to solve the problem:

sub varify {
my $text = shift;
$text =~ s%\$%\$\$%g;
for (keys %MakeVars) {
my $name = $MakeVars{$_};
$text =~ s%$_%$name%g;
}
return $text;
}

Here the hash keys are precompiled REs, e.g.

my $re = qr/\Q$text\E/;
$MakeVars{$re} = $macro_name;

Thanks for the advice,
AK
From: Uri Guttman on
>>>>> "AK" == Adam Kellas <adam.kellas(a)gmail.com> writes:

AK> On Mar 2, 2:23�pm, "Uri Guttman" <u...(a)StemSystems.com> wrote:
>> you would have to post all the code and some data and hopefully that
>> would exhibit the problem. calling brk() means you are doing serious
>> data allocation which shouldn't happen in this style of code. maybe you
>> are doing something else that you didn't tell us. s/// ops on a string
>> may add some ram needs but not insane amounts. massive unnecessary ram
>> needs are usually leaks, either a rare perl bug or some poorly designed
>> data structure that leaks.

AK> For the record, this rewrite seems to solve the problem:

AK> sub varify {
AK> my $text = shift;
AK> $text =~ s%\$%\$\$%g;

why the strange use of % for a delimiter? it is impossible to
read. actually one of the worst choices you can make IMO. besides you
don't need to do that as / is fine since you have no / in your regex.

AK> for (keys %MakeVars) {

use a named variable for loops. it is cleaner and easier to read. $_ can
be easily clobbered or some action at a distance can happen. my vars are
lexical to the loop and much safer

AK> my $name = $MakeVars{$_};
AK> $text =~ s%$_%$name%g;

again with the % delimiter. i would go blind reading that. and there is
no need for the temp $name as you can put that hash lookup directly in
the replacement string.

AK> }
AK> return $text;
AK> }

AK> Here the hash keys are precompiled REs, e.g.

AK> my $re = qr/\Q$text\E/;

you don't use % there, so why in the s/// ops?

AK> $MakeVars{$re} = $macro_name;

again, you don't need the temp var. and that will cause you to lose the
precompiled nature of qr. all keys in a hash are just plain strings (not
perl values). it may work anyhow but if you had something that needed to
be a real regex feature that broke when putting it in as a hash key, you
would be very sorry. why don't you make the hash based on the name
instead with the qr// as the value? it will always work correctly as a regex.

uri

--
Uri Guttman ------ uri(a)stemsystems.com -------- http://www.sysarch.com --
----- Perl Code Review , Architecture, Development, Training, Support ------
--------- Gourmet Hot Cocoa Mix ---- http://bestfriendscocoa.com ---------
From: J�rgen Exner on
Adam Kellas <adam.kellas(a)gmail.com> wrote:
>For the record, this rewrite seems to solve the problem:

As execution speed was the issue at hand ...

>sub varify {
> my $text = shift;
> $text =~ s%\$%\$\$%g;
> for (keys %MakeVars) {
> my $name = $MakeVars{$_};
> $text =~ s%$_%$name%g;

.... I would get rid of that unnecessary temporary variable $name and
thus save a string copy:
$text =~ s%$_%$MakeVars{$_}%g;

BTW: I find your % signs as s///-delimeters hard to read. Yes, they are
legal, yes there are occasions where using the standard slash is awkward
because you have to escape slashes that are part of the RE or the
replacement string. But none of those reasons exist here.

jue
From: Adam Kellas on
On Mar 2, 8:04 pm, "Uri Guttman" <u...(a)StemSystems.com> wrote:
> why the strange use of % for a delimiter? it is impossible to
> read. actually one of the worst choices you can make IMO. besides you
> don't need to do that as / is fine since you have no / in your regex.

Style ... aesthetics ... religion ... I decided years ago that my
personal style guide would be to always use m%% and s%%%. It may be
ugly but it's consistent; I can always find patterns in my code by
searching for m% or s%. Try that with //! The other problem with // is
that half the time there are slashes in the pattern, and often the
pattern changes in such a way that you either can or have to change
the delimiter. I've never regretted adding this convention to my style
guide, though I'm sure a case could be made for preferring some other
inert character like s,,,.

>   AK>     for (keys %MakeVars) {
>
> use a named variable for loops. it is cleaner and easier to read. $_ can
> be easily clobbered or some action at a distance can happen. my vars are
> lexical to the loop and much safer

Use of $_ was deliberate here because I gather it's marginally faster
and I was trying to tune the loop for speed. But in general I agree
that named variables are preferable.

>   AK>        my $name = $MakeVars{$_};
>   AK>        $text =~ s%$_%$name%g;
>
> again with the % delimiter. i would go blind reading that. and there is
> no need for the temp $name as you can put that hash lookup directly in
> the replacement string.

Style ... aesthetics ... religion ...

> again, you don't need the temp var. and that will cause you to lose the
> precompiled nature of qr. all keys in a hash are just plain strings (not
> perl values). it may work anyhow but if you had something that needed to
> be a real regex feature that broke when putting it in as a hash key, you
> would be very sorry. why don't you make the hash based on the name
> instead with the qr// as the value? it will always work correctly as a regex.

Thanks, I was not aware of that and will make changes as suggested.

AK
From: Adam Kellas on
On Mar 2, 8:36 pm, J rgen Exner <jurge...(a)hotmail.com> wrote:
> BTW: I find your % signs as s///-delimeters hard to read. Yes, they are
> legal, yes there are occasions where using the standard slash is awkward
> because you have to escape slashes that are part of the RE or the
> replacement string. But none of those reasons exist here.

As noted, this is a deliberate tradeoff. Yes, it requires an
additional keystroke (or 4 if you count the shift key) and may
reasonably be called ugly. But OTOH I've legislated leaning toothpick
syndrome out of existence, and I've never once had to escape a % in a
regular expression. YMMV.

AK