I think an assertion might do this? [Perl]

From: C.DeRykus on 4 Aug 2010 14:48

On Aug 4, 6:02 am, Mr P <misterp...(a)gmail.com> wrote:
> Hey Perlistas...
>
> I wanted to replace any and all multiple sequential EOLs with a single
> one. This worked nicely:
>
> (a) s/(\n)\n+/$1/g;
>
> But thinking about it some more, it occurred to me that these wouldn't
> quite work:
>
> (b) s/(\n)\n/$1/g;
> (c) s/\n(\n)/$1/g;
> (d) s/\n\n/\n/g;
>
> because even though they are greedy, they don't retrace. And
> predictably they didn't work.

The /g switch applies the substitution globally,
but greediness applies only to quantifiers such as *,
+ or {m,n}.

The reason (b)-(d) fail even with /g is that the engine
matches and substitutes consecutive pairs of \n's, then
resumes scanning after each match instead of rescanning
the replacement. A run of three \n's therefore ends up
as two: the first pair collapses to one, and the trailing
\n has no partner left to match.
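The pairing behaviour can be checked with a short sketch. The helper name collapse_pairs is made up for illustration and is not from the thread; it applies (b) once with /g to a run of newlines and reports how many are left.

```perl
use strict;
use warnings;

# Hypothetical helper: apply pattern (b) once with /g to a run
# of $n newlines and return how many newlines remain.
sub collapse_pairs {
    my ($n) = @_;
    my $s = "\n" x $n;
    $s =~ s/(\n)\n/$1/g;    # each match consumes two \n and emits one
    return length $s;
}

for my $n (2 .. 5) {
    printf "%d newlines -> %d left\n", $n, collapse_pairs($n);
}
```

Only a run of exactly two collapses to one; longer runs are roughly halved, which is why a quantifier or a look-around is needed.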

>
> I've never really used assertions, at least not enough to be familiar
> with them. I guess this would be a positive look-behind to force the
> engine to retrace?

It's easier IMO to think of this with the positive
look-ahead assertion solution that was shown. A
positive look-behind would work too though:

s/ (?<=\n) \n//gx;

>
> Can someone offer an example please where I can use something like s/
> (\n)\n/$1/g; with an assertion to do what (a) does? It would be
> instructional for me to see this example.
>

No, the look-ahead/behind assertions are zero length,
so if you keep the capture and turn the second \n into
an assertion, e.g. s/(\n)(?=\n)/$1/g, you'd be capturing
one \n and then replacing it with itself. The target
wouldn't be changed at all.
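A minimal sketch of the three cases, for comparison. The sample string and variable names are assumptions, and the look-ahead form is only a guess at the kind of solution referred to earlier in the thread.

```perl
use strict;
use warnings;

my $text = "a\n\n\nb\n\nc\n";

# Look-behind: delete every \n that is preceded by another \n.
(my $behind = $text) =~ s/(?<=\n)\n//g;

# Look-ahead: delete every \n that is followed by another \n.
(my $ahead = $text) =~ s/\n(?=\n)//g;

# Capture plus assertion: the matched \n is replaced by itself,
# so the string comes out unchanged.
(my $noop = $text) =~ s/(\n)(?=\n)/$1/g;

print "look-behind collapsed\n" if $behind eq "a\nb\nc\n";
print "look-ahead collapsed\n"  if $ahead  eq "a\nb\nc\n";
print "capture form unchanged\n" if $noop  eq $text;
```

In the first two, the assertion consumes nothing, so every redundant \n is matched and deleted individually in one pass.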

--
Charles DeRykus

From: Mr P on 4 Aug 2010 16:08

On Aug 4, 9:59 am, Tad McClellan <ta...(a)seesig.invalid> wrote:
> Mr P <misterp...(a)gmail.com> wrote:
> > I wanted to replace any and all multiple sequential EOLs with a single
> > one. This worked nicely:
>
> > (a) s/(\n)\n+/$1/g;
>
> You don't need regular expressions to do that:
>
> tr/\n/\n/s;
> or
> tr/\n//s;
>
That's quite a surprising solution to me as I would not have thought
of tr// - I will monkey with it, thanks. I always thought of tr// as a
1:1 mapping, which is not what this is, so it seems unnatural.

>
> I don't see how assertions can help with this problem...

It seemed like a natural solution to me.

start with \n\n\n
s/(\n)\n/$1/ now you have \n\n.
Which would MATCH again, IF the engine started at the scalar
beginning (hence my use of the word RETRACE, and my thought that this
was a LOOK-Behind case)..
>

As far as greed and quantifiers, requiring quantifiers makes no sense.
If I don't have the /g switch, the regex operates ONE TIME on the
scalar. If /g (regardless of quantifiers) it operates on the entire
scalar, as many times as it matches. And if it reset to the beginning
of the scalar AFTER each match, it would even work the same way as
s/\n\n*/\n/;

>
Thanks.
> --
> Tad McClellan
> email: perl -le "print scalar reverse qq/moc.liamg\100cm.j.dat/"
> The above message is a Usenet post.
> I don't recall having given anyone permission to use it on a Web site.

From: Uri Guttman on 4 Aug 2010 17:42

>>>>> "P" == P <misterperl(a)gmail.com> writes:

P> On Aug 4, 9:59 am, Tad McClellan <ta...(a)seesig.invalid> wrote:
>> Mr P <misterp...(a)gmail.com> wrote:
>> > I wanted to replace any and all multiple sequential EOLs with a single
>> > one. This worked nicely:
>>
>> > (a) s/(\n)\n+/$1/g;
>>
>> You don't need regular expressions to do that:
>>
>>     tr/\n/\n/s;
>> or
>>     tr/\n//s;
>>
P> That's quite a surprising solution to me as I would not have thought
P> of tr// - I will monkey with it, thanks. I always thought of tr// as a
P> 1:1 mapping, which is not what this is, so it seems unnatural.

no, it seems very natural if you rtfm! it can also delete chars with the
/d option. both /s and /d are very useful and much faster than the
equivalent s/// ops.
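A small sketch of the two modifiers, assuming a made-up sample string. With an empty replacement list, tr maps each listed character to itself, so /s only squeezes runs while /d removes the characters outright.

```perl
use strict;
use warnings;

my $text = "one\n\n\ntwo\n\nthree\n";

# /s squeezes each run of the listed character down to a single one.
(my $squeezed = $text) =~ tr/\n//s;

# /d deletes listed characters that have no replacement.
(my $deleted = $text) =~ tr/\n//d;

print $squeezed;        # three lines, no blank lines between them
print "$deleted\n";     # all newlines gone
```

Note that tr works on a list of characters, not a pattern, so there is no regex engine involved in either call.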

P> As far as greed and quantifiers, requiring quantifiers makes no sense.
P> If I don't have the /g switch, the regex operates ONE TIME on the
P> scalar. If /g (regardless of quantifiers) it operates on the entire
P> scalar, as many times as it matches. And if it reset to the beginning
P> of the scalar AFTER each match, it would even work the same way as
P> s/\n\n*/\n/;

you don't get quantifiers then. they modify a single regex thing to
their left. s/\n+/\n/g is the simplest way to get what you want with
s///. note no grabbing is needed since you know the replacement will
just be a single \n. there are several variants with assertions i won't
go into.

uri

--
Uri Guttman ------ uri(a)stemsystems.com -------- http://www.sysarch.com --
----- Perl Code Review , Architecture, Development, Training, Support ------
--------- Gourmet Hot Cocoa Mix ---- http://bestfriendscocoa.com ---------

From: Jim Gibson on 4 Aug 2010 18:32

In article <87y6cmgimn.fsf(a)quad.sysarch.com>, Uri Guttman
<uri(a)StemSystems.com> wrote:

> >>>>> "P" == P <misterperl(a)gmail.com> writes:
>
> P> On Aug 4, 9:59 am, Tad McClellan <ta...(a)seesig.invalid> wrote:
> >> Mr P <misterp...(a)gmail.com> wrote:
> >> > I wanted to replace any and all multiple sequential EOLs with a single
> >> > one. This worked nicely:
> >>
> >> > (a) s/(\n)\n+/$1/g;
> >>
> >> You don't need regular expressions to do that:
> >>
> >>     tr/\n/\n/s;
> >> or
> >>     tr/\n//s;
> >>
> P> That's quite a surprising solution to me as I would not have thought
> P> of tr// - I will monkey with it, thanks. I always thought of tr// as a
> P> 1:1 mapping, which is not what this is, so it seems unnatural.
>
> no, it seems very natural if you rtfm! it can also delete chars with the
> /d option. both /s and /d are very useful and much faster than the
> equivalent s/// ops.

Sure, but poor old tr is a second-class citizen when it comes to
documentation. Look at 'perldoc -f tr'. Then please explain how tr (or
s/// for that matter) is a "quote-like operator". You have to remember
to search for 'Transliterates' in perlop to get to the documentation for tr :(

I would extract the section in 'perldoc perlop' and put it in 'perldoc
-f tr'.

--
Jim Gibson

From: Jim Gibson on 4 Aug 2010 18:37

In article
<bad8265c-3ba5-40a3-9f74-531b10ac7925(a)x25g2000yqj.googlegroups.com>, Mr
P <misterperl(a)gmail.com> wrote:

> On Aug 4, 9:59 am, Tad McClellan <ta...(a)seesig.invalid> wrote:
> > Mr P <misterp...(a)gmail.com> wrote:
> > > I wanted to replace any and all multiple sequential EOLs with a single
> > > one. This worked nicely:
> >
> > > (a) s/(\n)\n+/$1/g;
> >
> > You don't need regular expressions to do that:
> >
> >     tr/\n/\n/s;
> > or
> >     tr/\n//s;
> >
> That's quite a surprising solution to me as I would not have thought
> of tr// - I will monkey with it, thanks. I always thought of tr// as a
> 1:1 mapping, which is not what this is, so it seems unnatural.

Yes, tr can be used as n:1 or 1:0 mappings, not just 1:1.
>
> >
> > I don't see how assertions can help with this problem...
>
> It seemed like a natural solution to me.
>
> start with \n\n\n
> s/(\n)\n/$1/ now you have \n\n.
> Which would MATCH again, IF the engine started at the scalar
> beginning (hence my use of the word RETRACE, and my thought that this
> was a LOOK-Behind case)..

If you want to redo a regular expression from the beginning of the
string and keep repeating it until it fails to match, put it in a while
loop:

while ( s/(\n)\n/$1/g ) {
    ;   # empty body: the substitution itself does the work
}

However, you have already been given the simplest solution using
regexes:

s/\n{2,}/\n/g;

which fixes your string in one pass, as do all the look-around
variations.
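As a rough side-by-side check, the sketch below runs the repeated substitution and the single-pass forms mentioned in this thread on the same input. The sample string and variable names are assumptions chosen for illustration.

```perl
use strict;
use warnings;

my $orig = "x\n\n\n\n\ny\n\nz";

# Repeat the pairwise substitution until it no longer matches.
my $loop = $orig;
1 while $loop =~ s/(\n)\n/$1/g;

# Single-pass alternatives.
(my $quant = $orig) =~ s/\n{2,}/\n/g;   # runs of two or more
(my $plus  = $orig) =~ s/\n+/\n/g;      # any run, including single \n
(my $squee = $orig) =~ tr/\n//s;        # squeeze with tr

for my $result ($loop, $quant, $plus, $squee) {
    print $result eq "x\ny\nz" ? "ok\n" : "not ok\n";
}
```

The loop needs several passes over the string for a long run, while the other three finish in one.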

--
Jim Gibson
