From: Robert Klemme on
On 13.08.2010 00:50, Iain Barnett wrote:
>
> On 12 Aug 2010, at 19:52, Philipp Kempgen wrote:
>
>> Philipp Kempgen wrote:
>>> I'm looking for a parser which can be fed with some (A)BNF-style rules.
>> ...
>>> If so, which parser is more appropriate?
>>> It should hand back a parse tree.
>>
>> Alternatively I'd appreciate if Regexps could return *all*
>> occurrences of named capture groups inside repetitions etc.
>> instead of just the last match for each name. Feasible?

As I understand it, you want /f(o)+/ =~ "foo" to return ["o", "o"] as the
match for the group (I used normal capturing groups for simplicity).

> I was trying to do the same thing and asked about repeated named captures.
> http://osdir.com/ml/ruby-talk/2010-07/msg00361.html
>
> It seems that using String#scan is the closest anything Ruby has, as the Oniguruma regex engine doesn't support it. I think it's a real shame as named capture groups are really useful.

Regular expression engines generally return only one match per group,
regardless of whether the group is named. I'm not up to date with
current Perl regular expressions, which are the only ones I can imagine
being crazy enough to provide such a feature. :-) Otherwise String#scan
is indeed the proper tool to get multiple matches. The example above
could be done like this:

s = "foo"
if /f(o+)/ =~ s
  # $1 holds the whole run of o's; scan yields each one in turn
  $1.scan(/o/) do |match|
    p match
  end
end

or this

if /f(o+)/ =~ s
  # collect all occurrences into an Array instead of iterating
  matches = $1.scan(/o/)
  p matches
end
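
To make the difference concrete, here is a minimal sketch comparing the two
approaches on the same input. The variable names (last_only, all_matches) are
only for illustration and are not from the thread.

```ruby
s = "foo"

# A group inside a repetition keeps only its last iteration.
last_only = /f(o)+/.match(s)[1]

# Workaround: capture the whole repetition once, then scan inside it.
m = /f(o+)/.match(s)
all_matches = m ? m[1].scan(/o/) : []

p last_only    # "o"
p all_matches  # ["o", "o"]
```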

Kind regards

robert

--
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/
From: Iain Barnett on

On 13 Aug 2010, at 11:15, Robert Klemme wrote:
>>
>
> Regular expression engines generally do only return one match per group - regardless of naming or not naming groups. I'm not up to date with current Perl's regular expressions which are the only ones I can imagine to be crazy enough to provide such a feature. :-)

.Net definitely offers it, which is ironic as none of the .Net devs I used to work with ever used regexes.

Scan is a really poor alternative (for this kind of task), I think. Instead of just writing one regex which expresses what I want to do much better, I end up having to write a lot of extra code to get the same effect. Scan's cool for simpler things.

Regards
Iain


From: Philipp Kempgen on
Robert Klemme wrote:
> On 13.08.2010 00:50, Iain Barnett wrote:
>> On 12 Aug 2010, at 19:52, Philipp Kempgen wrote:
>>
>>> Philipp Kempgen wrote:
>>>> I'm looking for a parser which can be fed with some (A)BNF-style rules.
>>> ...
>>>> It should hand back a parse tree.
>>>
>>> Alternatively I'd appreciate if Regexps could return *all*
>>> occurrences of named capture groups inside repetitions etc.
>>> instead of just the last match for each name. Feasible?
>
> As I understand you want /f(o)+/ =~ "foo" to return ["o", "o"] as match
> for the group (used normal capturing groups for simplicity).

FRUIT = '(?<FRUIT> Apple|Banana|Pear|Orange)'
FRUIT_COLLECTION = '(?<FRUIT_COLLECTION> ' + FRUIT + '*)'

# EXTENDED makes the engine ignore the whitespace inside the pattern
re = Regexp.new( '^' + FRUIT_COLLECTION + '$', Regexp::EXTENDED )
re.match( 'PearBananaApple' )[:FRUIT]
# => "Apple"

I would want e.g. matchdata[:FRUIT] to be an Array
[ 'Pear', 'Banana', 'Apple' ]
and not just 'Apple' (the last FRUIT).
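
One way to approximate this with what Ruby offers today is to validate the
whole collection with one regex and then scan the matched text for the
individual items. This is only a sketch: FRUIT_SRC, COLLECTION_RE and
all_fruits are hypothetical names, and re-scanning is only equivalent to the
original match for simple grammars like this one, where no alternative is a
prefix of another.

```ruby
FRUIT_SRC = 'Apple|Banana|Pear|Orange'
COLLECTION_RE = /\A(?<FRUIT_COLLECTION>(?:#{FRUIT_SRC})*)\z/

# Returns an Array of every fruit in the input, or nil if the input
# is not a valid fruit collection.
def all_fruits(input)
  m = COLLECTION_RE.match(input)
  return nil unless m
  m[:FRUIT_COLLECTION].scan(/#{FRUIT_SRC}/)
end

p all_fruits('PearBananaApple')  # ["Pear", "Banana", "Apple"]
p all_fruits('PearKiwi')         # nil
```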

Actually something similar to a concrete syntax tree / parse tree
would be even better:

ROOT "PearBananaApple"
  FRUIT_COLLECTION "PearBananaApple"
    FRUIT "Pear"
    FRUIT "Banana"
    FRUIT "Apple"
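
For this toy grammar, a tree like the one above can be built by hand with
StringScanner from the standard library. This is a rough sketch under stated
assumptions, not a general (A)BNF parser: Node, parse_fruits and render_tree
are made-up names, and the grammar is hard-coded.

```ruby
require 'strscan'

# Hypothetical tree node: rule name, matched text, child nodes.
Node = Struct.new(:name, :text, :children)

FRUIT_RE = /Apple|Banana|Pear|Orange/

# Parses FRUIT* and returns the ROOT node, or nil if input is left over.
def parse_fruits(input)
  scanner = StringScanner.new(input)
  fruits = []
  # StringScanner#scan only matches at the current position.
  while (text = scanner.scan(FRUIT_RE))
    fruits << Node.new('FRUIT', text, [])
  end
  return nil unless scanner.eos?

  collection = Node.new('FRUIT_COLLECTION', input, fruits)
  Node.new('ROOT', input, [collection])
end

# Renders the tree as indented lines, two spaces per level.
def render_tree(node, depth = 0)
  lines = ["#{'  ' * depth}#{node.name} #{node.text.inspect}"]
  node.children.each { |child| lines.concat(render_tree(child, depth + 1)) }
  lines
end

tree = parse_fruits('PearBananaApple')
puts render_tree(tree)
```

A real grammar would need one parsing method per rule (or a parser
generator), but the shape of the result is the same.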

>> the Oniguruma regex engine doesn't support it. I think it's a real shame as named capture groups are really useful.

Exactly.
Anyway thanks for your input.

Regards,
Philipp