From: C.DeRykus on
On Jan 22, 9:54 am, kj <no.em...(a)please.post> wrote:
> Is there a way to tell if a given Regexp object, generated at
> runtime, includes at least one pair of capture parentheses?
> ...

If you have a reasonably recent Perl version, you might be able
to use re::regexp_pattern in list context to check for parens.

On a Win32 Strawberry Perl 5.10.1 install, for instance:

c:\strawberry\perl\bin\perl.exe -le "
use re 'regexp_pattern';
$r = qr/ab(\d+)/;
($pat) = regexp_pattern($r);
print $pat"
ab(\d+)

So you could parse $pat for capturing parens. You'd need to
exclude escaped parens and non-capturing constructs that start
with (? such as (?:...) and the lookaround assertions, but that's
left as an exercise for the reader :)
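
For what it's worth, here is a minimal sketch of that exercise. It is only an illustration: count_captures is a made-up helper name, not part of the re module. The scan is deliberately simple, so a paren inside a character class such as [(] or inside a /x comment is counted by mistake, and a named capture such as (?<name>...) is missed.

```perl
use strict;
use warnings;
use re 'regexp_pattern';

# Hypothetical helper (not part of any module): count the '(' characters
# that look like the start of a capture group in a qr// object.
sub count_captures {
    my ($re) = @_;
    my ($pat) = regexp_pattern($re);    # list context: (pattern, modifiers)
    my $count = 0;
    # Walk the pattern left to right. An escape (backslash plus one char)
    # is consumed as a unit, so "\(" is skipped, while "\\(" is read as a
    # literal backslash followed by a real group.
    while ($pat =~ /\\.|(\((?!\?))/gs) {
        $count++ if defined $1;         # unescaped '(' not followed by '?'
    }
    return $count;
}

print count_captures(qr/ab(\d+)/), "\n";      # one capture group
print count_captures(qr/\\(.)/), "\n";        # one group after a literal backslash
print count_captures(qr/\((?:x)\)/), "\n";    # only escaped parens and (?:...)
```

A plain "does it capture at all" test is then just count_captures($r) > 0.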

--
Charles DeRykus
From: sln on
On Fri, 22 Jan 2010 17:54:49 +0000 (UTC), kj <no.email(a)please.post> wrote:

>Is there a way to tell if a given Regexp object, generated at
>runtime, includes at least one pair of capture parentheses?
>
>More generally, is there any documentation for the Regexp class?
>(I'm referring to the class alluded to by the output of, e.g., ref
>qr//). Running perldoc Regexp fails ("no docs found"), and perldoc
>perlre does not say much at all about this class as such.
>
>TIA!
>
>Kynn

It's not too hard to analyse the string returned by qr//
to get the start (and thereby the count) of capture groups.
To get the actual group text requires some recursion and thought.

use strict;
use warnings;

my $tmp = qr/\(\$th (i(s))(i(s))(i(s))(?:(i\(s)\)(i(s))(i(s))\))/x;
my @capt;

while ($tmp =~ /( (?<!\\)\((?!\?) )/xg ) {
    push @capt, pos($tmp);
}
print "$tmp\n";
my ($i,$last) = (1,1);

for my $p (@capt) {
    print (' 'x ($p - $last), $i++ % 10);
    $last = $p+1;
}
print "\nFound ",scalar @capt, " capture groups\n";

__END__

(?x-ism:\(\$th (i(s))(i(s))(i(s))(?:(i\(s)\)(i(s))(i(s))\)))
               1 2   3 4   5 6      7       8 9   0 1
Found 11 capture groups

From: kj on
In <0t1ll5t3n1aaohh10q3isr9kgglg7ugfn8(a)4ax.com> sln(a)netherlands.com writes:

>Its not too hard to analyse the string returned by qr//
>to get the start (and thereby the count) of capture groups.
>To get the actual group text requires some recursion and thought.



Thanks for this code! Now I must study it.

~K

From: Martijn Lievaart on
On Fri, 22 Jan 2010 21:29:30 -0800, sln wrote:

> Its not too hard to analyse the string returned by qr// to get the start
> (and thereby the count) of capture groups. To get the actual group text
> requires some recursion and thought.

I think this will fail on the regexp /\\(.)/.
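
A small check makes the failure concrete. This is only a sketch: it applies the lookbehind-only scanner pattern from sln's post to the stringified form of that regexp and then shows that the regexp itself does capture. The variable names are illustrative.

```perl
use strict;
use warnings;

my $re  = qr/\\(.)/;    # a literal backslash, then one capture group
my $str = "$re";        # stringified form of the qr// object

# The scanner from sln's post: a '(' with no backslash directly before it
# and no '?' directly after it. The countof idiom "= () =" counts matches.
my $found = () = $str =~ /(?<!\\)\((?!\?)/g;
print "scanner found $found group(s)\n";

# '\\x' in single quotes is the two characters backslash and x.
if ('\\x' =~ $re) {
    print "but the regexp does capture: '$1'\n";
}
```

The '(' here is preceded by a backslash that is itself escaped, so the lookbehind wrongly treats the paren as escaped and the scanner reports no groups.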

M4
From: sln on
On Sun, 24 Jan 2010 07:42:40 +0100, Martijn Lievaart <m(a)rtij.nl.invlalid> wrote:

>I think this will fail on the regexp /\\(.)/.
>
>M4

Correct. Inserting (?:\\.)* after the lookbehind should fix it.
See if you can make this version fail on anything.

-sln

use strict;
use warnings;

my $tmp = qr/\(\$th(\\(?:.) \\(.\) \\\(.)(i(s))(i(s))(?:(i\(s)\)(i(s))(i(s))\)))/x;
my @capt;

# /(?<!\\)(?:\\.)*\((?!\?)/
# -------------------------
my $grprx = qr/
    (?<!\\)    # Not an escape behind us
    (?:\\.)*   # 0 or more escape + any char
    \(         # (
    (?!\?)     # Not a ? in front of us
/x;

while ($tmp =~ /($grprx)/g ) {
    # print "'$1'\n";
    push @capt, pos($tmp);
}
print "$tmp\n";
my ($i,$last) = (1,1);

for my $p (@capt) {
    print (' 'x ($p - $last), $i++ % 10);
    $last = $p+1;
}
print "\nFound ",scalar @capt, " capture groups\n";

__END__

(?x-ism:\(\$th(\\(?:.) \\(.\) \\\(.)(i(s))(i(s))(?:(i\(s)\)(i(s))(i(s))\))))
              1          2          3 4   5 6      7       8 9   0 1
Found 11 capture groups
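
On the question of whether this can still fail: one way to look for counterexamples is to compare the scanner's count with what the regex engine itself reports. The sketch below is an assumption-laden illustration, not part of the code above. scanned and actual are made-up helper names. actual matches the empty string against /|$re/, which always succeeds through the empty first alternative, and then reads $#+, which perlvar documents as the number of groups in the last successful match.

```perl
use strict;
use warnings;

# The scanner pattern from the post above, written on one line.
my $grprx = qr/(?<!\\)(?:\\.)*\((?!\?)/;

# Hypothetical helper: how many groups the string scanner reports.
sub scanned {
    my ($re) = @_;
    my $n = () = "$re" =~ /$grprx/g;
    return $n;
}

# Hypothetical helper: how many groups the regex engine reports.
sub actual {
    my ($re) = @_;
    '' =~ /|$re/ or die "empty alternative should always match";
    return $#+;
}

for my $re (qr/\\(.)/, qr/[(]/, qr/(?<year>\d{4})/) {
    printf "%-28s scanned=%d actual=%d\n", "$re", scanned($re), actual($re);
}
```

With these three patterns the two counts agree for /\\(.)/, but the scanner overcounts the paren inside the character class [(] and misses the named capture (?<year>...), which begins with (? yet still captures.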