From: Peng Yu on
I'm trying to match .cpp files that don't start with 'main', but the grep
command below matches all the .cpp files. I know that \w* tries to match
as much as possible. Do you know how to fix the regex to get all
the .cpp files that don't start with 'main'?

#!/usr/bin/env perl

use strict;
use warnings;

my @array=qw(main.cpp main_xx.cpp uuu.cpp vvv.cpp);

my @non_main_cpp=grep /(?<!main)\w*.cpp/, @array;
print join(', ', @non_main_cpp), "\n";
From: Dr.Ruud on
Peng Yu wrote:

> I try to match .cpp files that doesn't start with 'main'. But the grep
> command below match all the .cpp files. I know that \w* tries to match
> as long as possible. Do you know how to fix the regex to get all
> the .cpp files that doesn't start with 'main'?
>
> #!/usr/bin/env perl
>
> use strict;
> use warnings;
>
> my @array=qw(main.cpp main_xx.cpp uuu.cpp vvv.cpp);
>
> my @non_main_cpp=grep /(?<!main)\w*.cpp/, @array;
> print join(', ', @non_main_cpp), "\n";

There are many ways to do this; it all depends on what you need in the end:

perl -wle '
my @filenames = qw( 1.txt main.txt 1.cpp main_01.cpp );
print for "", grep !/^main/, @filenames;
print for "", grep !/\Amain/ && /\.cpp\z/, @filenames;
print for "", grep {!/\Amain/ and /\.cpp\z/} @filenames;
print for "", grep +(!/\Amain/ and /\.cpp\z/), @filenames;
'

1.txt
1.cpp

1.cpp

1.cpp

1.cpp


--
Ruud
From: sln on
On Sun, 13 Jun 2010 05:20:42 -0700 (PDT), Peng Yu <pengyu.ut(a)gmail.com> wrote:

>I try to match .cpp files that doesn't start with 'main'. But the grep
>command below match all the .cpp files. I know that \w* tries to match
>as long as possible. Do you know how to fix the regex to get all
>the .cpp files that doesn't start with 'main'?
>
>#!/usr/bin/env perl
>
>use strict;
>use warnings;
>
>my @array=qw(main.cpp main_xx.cpp uuu.cpp vvv.cpp);
>
>my @non_main_cpp=grep /(?<!main)\w*.cpp/, @array;
>print join(', ', @non_main_cpp), "\n";

The form was close but it won't work this way.
This concept is hard to grasp.

You can do it one of two ways:

With a negative look behind:
@non_main_cpp = grep /^(?:\w(?<!^main))*\.cpp$/, @array;

This is what you tried to do.
Looking behind must be "visualized" as if YOU were the
current character as you traverse the string.
Each \w that is found in the accumulating match
must be immediately tested that ^main isn't behind us.
(The ^ inside the look behind matters: without it, a name that
merely contains 'main', such as xmain.cpp, would be rejected too.)
Using ^ (?: \w (?<!^main) )* on "main_xx.cpp" we see the
match progression:

main_xx.cpp
^ at the beginning, no ^main behind us
m^ still ok
ma^ ok
mai^ ok
main^ failed, ^main is behind us

Yours didn't work because the look behind was checked only once,
before the first \w was found, where nothing is behind it yet.
It then went on to match all of \w* without checking the assertion again.
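
To see this for yourself, here is a minimal sketch that runs the original pattern from the question and reports where each match starts and what it covers. The names $original and @matched are only for illustration; $& and $-[0] are the standard Perl match variables.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

my @array = qw(main.cpp main_xx.cpp uuu.cpp vvv.cpp);

# The original, unanchored pattern from the question.
my $original = qr/(?<!main)\w*.cpp/;

my @matched;    # illustrative: collects the text each match covered
for my $name (@array) {
    if ($name =~ $original) {
        # $-[0] is the offset where the match starts; $& is the matched text.
        printf "%-12s matched '%s' at offset %d\n", $name, $&, $-[0];
        push @matched, $&;
    }
}
```

Every name matches in full at offset 0: the look behind is tested once at the start of the string, where there is nothing behind it, so it can never fail there.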

Or, with a negative look ahead:
@non_main_cpp = grep /^(?!main)\w*\.cpp$/, @array;

This is a look ahead. As usual, the object we're looking
ahead of is to the left of the assertion.
In this case, it's the beginning of the line ^, before \w*.
The check is done once. If it succeeds, \w* will try to match,
but the assertion is never checked again.

main_xx.cpp
^ at the beginning, failed ^main is ahead of us

For your circumstances, this is the preferred method.
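
A small sketch of the look ahead version as a complete script, in case it helps. The last three names in the list (xmain.cpp, uuu.cppx, notes.txt) are made up for illustration and are not from the original post; they are there to show that only a leading 'main' is rejected and that the escaped \. and the $ anchor keep non-.cpp names out.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Only the first four names come from the original post;
# the rest are hypothetical edge cases.
my @names = qw(main.cpp main_xx.cpp uuu.cpp vvv.cpp xmain.cpp uuu.cppx notes.txt);

# Anchored negative look ahead: reject names that begin with "main",
# then require word characters followed by a literal ".cpp" at the end.
my @non_main_cpp = grep { /^(?!main)\w*\.cpp$/ } @names;

print join(', ', @non_main_cpp), "\n";    # uuu.cpp, vvv.cpp, xmain.cpp
```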

----

What helps when you write regular expressions is to "be" the
character as you traverse the string. Make it a personal exercise.
I am a 9, I don't want 'a' or 'b' next to me, I want a space or
digit, I need this done five times from the beginning with
only the end of string in front of "us".
Yeah well, something like that..

-sln
From: C.DeRykus on
On Jun 13, 5:20 am, Peng Yu <pengyu...(a)gmail.com> wrote:
> I try to match .cpp files that doesn't start with 'main'. But the grep
> command below match all the .cpp files. I know that \w* tries to match
> as long as possible. Do you know how to fix the regex to get all
> the .cpp files that doesn't start with 'main'?
>
> #!/usr/bin/env perl
>
> use strict;
> use warnings;
>
> my @array=qw(main.cpp main_xx.cpp uuu.cpp vvv.cpp);
>
> my @non_main_cpp=grep /(?<!main)\w*.cpp/, @array;
> print join(', ', @non_main_cpp), "\n";


Another option, using /p and the ${^PREMATCH} variable (Perl 5.10 or later):

my @non_main_cpp =
    grep { /\.cpp$/p and ${^PREMATCH} !~ /^main/ }
    @array;
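
The same idea spelled out as a loop, as a rough sketch, so you can see what ${^PREMATCH} holds for each name. The variables $stem and $keep are only illustrative.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use 5.010;    # /p and ${^PREMATCH} need Perl 5.10 or later

my @array = qw(main.cpp main_xx.cpp uuu.cpp vvv.cpp);

my @non_main_cpp;
for my $name (@array) {
    # /p keeps the text before the match available in ${^PREMATCH}.
    next unless $name =~ /\.cpp$/p;

    # Copy it right away, before another successful match overwrites it.
    my $stem = ${^PREMATCH};    # e.g. "main_xx" for "main_xx.cpp"
    my $keep = $stem !~ /^main/;

    printf "%-12s stem=%-8s %s\n", $name, $stem, $keep ? 'keep' : 'skip';
    push @non_main_cpp, $name if $keep;
}

print join(', ', @non_main_cpp), "\n";    # uuu.cpp, vvv.cpp
```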

--
Charles DeRykus