From: TonyV on
Hey all, I've been trying to hammer away at this, and I just can't
figure it out. I'm hoping a regular expressions guru can help me out.

I'm trying to parse a retrieved javascript file to extract the
parameters out of a function call. Here's a contrived line that
represents what will be fetched:

foo('parameter 1', 'param with \'single\' quotes', 'param with\"double\" quotes', 'this param, it has a comma', 'five');

The goal is to get an array with these elements:
parameter 1
param with 'single' quotes
param with "double" quotes
this param, it has a comma
five

There will always be five parameters, and the function name will
always be foo. Normally, I'm handy with regexes, but damn, those
escaped quotes and commas are killing me, and the data does have lots
of them in there.

I'm not lazy; I've been plugging away at this, trying to work with
look-behind assertions, greedy matching, and so on, but I'm at an
impasse and can't extract what I want out of it. I've googled various
regex cookbooks (I even have access to O'Reilly's Safari), but I've come
up with bupkis.

Any ideas? I'd surely appreciate any help!
--TonyV
From: Uri Guttman on
>>>>> "T" == TonyV <kingskippus(a)gmail.com> writes:

T> foo('parameter 1', 'param with \'single\' quotes', 'param with\"double
T> \" quotes', 'this param, it has a comma', 'five');

T> The goal is to get an array with these elements:
T> parameter 1
T> param with 'single' quotes
T> param with "double" quotes
T> this param, it has a comma
T> five

T> Any ideas? I'd surely appreciate any help!

Text::Balanced should be able to do that easily. it can parse matched
parens, quotes and other top-level tokenizing syntax.
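
A rough sketch of how that might look, assuming the argument list has
already been cut out of the foo(...) call. The helper name
split_quoted_args and the sample line are made up for illustration;
extract_delimited is the Text::Balanced function that pulls one
delimited string off the front of its input and, by default, treats
backslash as the escape character.

```perl
use strict;
use warnings;
use Text::Balanced qw(extract_delimited);

# Hypothetical helper: split a list of single-quoted arguments into
# unquoted, unescaped values.
sub split_quoted_args {
    my ($args) = @_;
    my @values;
    while ($args =~ /\S/) {
        # In list context: (extracted string, remainder, skipped prefix).
        my ($quoted, $rest) = extract_delimited($args, q{'});
        last unless defined $quoted && length $quoted;
        $quoted =~ s/\A'|'\z//g;     # drop the surrounding quotes
        $quoted =~ s/\\(.)/$1/gs;    # \' becomes ', \" becomes "
        push @values, $quoted;
        $rest =~ s/\A\s*,//;         # skip the separating comma
        $args = $rest;
    }
    return @values;
}

# Sample input, invented for this sketch.
my $line = q{foo('parameter 1', 'param with \'single\' quotes', }
         . q{'param with \"double\" quotes', }
         . q{'this param, it has a comma', 'five');};

# Grab everything between the outermost parentheses of the call.
my ($arg_list) = $line =~ /foo\s*\((.*)\)/s;
my @params = split_quoted_args($arg_list);
print "$_\n" for @params;
```

The loop stops as soon as extract_delimited fails, so malformed input
yields fewer than five values rather than an endless loop; checking
the count afterwards is probably wise.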

uri

From: Ben Bullock on
On Apr 22, 1:39 pm, TonyV <kingskip...(a)gmail.com> wrote:
> I'm trying to parse a retrieved javascript file to extract the
> parameters out of a function call. Here's a contrived line that
> represents what will be fetched:
>
> foo('parameter 1', 'param with \'single\' quotes', 'param with\"double
> \" quotes', 'this param, it has a comma', 'five');
>
> The goal is to get an array with these elements:
> parameter 1
> param with 'single' quotes
> param with "double" quotes
> this param, it has a comma
> five

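#! perl
use warnings;
use strict;
# A single-quoted parameter: ordinary characters or backslash escapes.
my $parameter = qr/'(?:[^'\\]|\\.)*'/s;
my $test = q/foo('parameter 1', 'param with \'single\' quotes', /
         . q/'param with\"double\" quotes', 'this param, it has a comma', 'five')/;
# /x lets the pattern span two lines; the line break is not matched.
if ($test =~ /foo\s*\(\s*($parameter)\s*,\s*($parameter)\s*,
              \s*($parameter)\s*,\s*($parameter)\s*,\s*($parameter)\s*\)/x) {
    print "Matched.\n";
    # Each capture still includes its surrounding quotes and backslashes.
    print "$1\n$2\n$3\n$4\n$5\n";
}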

You could also use

/foo\s*\(\s*(?:$parameter\s*,\s*){4}($parameter)\s*\)/

if you don't need the parameter values right away (e.g. match for them
using another regex later on). That would make the code tidier.
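
For what it's worth, here is a minimal sketch of the "match for them
later" idea: pull out the argument list first, then walk it with a
/g match, capturing each parameter without its quotes and undoing the
backslash escapes. The variable names and the sample line are made up
for illustration, and it assumes every parameter is single quoted.

```perl
use strict;
use warnings;

# One single-quoted parameter; the text between the quotes lands in $1.
my $quoted = qr/'((?:[^'\\]|\\.)*)'/s;

# Sample input, invented for this sketch.
my $line = q{foo('parameter 1', 'param with \'single\' quotes', }
         . q{'param with \"double\" quotes', }
         . q{'this param, it has a comma', 'five');};

my @params;
if (my ($arg_list) = $line =~ /foo\s*\((.*)\)/s) {
    # \G anchors each match where the previous one stopped, so stray
    # text between parameters ends the loop instead of being skipped.
    while ($arg_list =~ /\G\s*$quoted\s*(?:,|\z)/gc) {
        (my $value = $1) =~ s/\\(.)/$1/gs;   # \' becomes ', \" becomes "
        push @params, $value;
    }
}
print "$_\n" for @params;
```

Since the loop gives up quietly on anything it doesn't recognise, it
is probably worth checking that @params holds exactly five elements
before using it.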

> There will always be five parameters, and the function name will
> always be foo.

Are the parameters necessarily single quoted?
From: Jürgen Exner on
TonyV <kingskippus(a)gmail.com> wrote:
>I'm trying to parse a retrieved javascript file to extract the
>parameters out of a function call. Here's a contrived line that
>represents what will be fetched:
>
>foo('parameter 1', 'param with \'single\' quotes', 'param with\"double
>\" quotes', 'this param, it has a comma', 'five');
>
>The goal is to get an array with these elements:
>parameter 1
>param with 'single' quotes
>param with "double" quotes
>this param, it has a comma
>five

I think Text::CSV::parse() should do the job just fine.

jue
From: bugbear on
TonyV wrote:
>
> I'm not lazy, I've been plugging away at this trying to work with look-
> behind reference, greedy matching, and so on, but I'm just at an
> impasse and can't extract what I want out of it. I've googled various
> regex cookbooks (even have access to O'Reilly's Safari), but I've come
> up with bupkiss.

IIRC it is *impossible* to fully implement nested matching quotes
with a regexp.

Ah! (google). This sounds helpful:

http://evolt.org/RegEx_Basics#comment-60762

BugBear