From: spydox on 14 Apr 2008 13:08

I'm trying to find a repeated number in a string, like 122345 finds 22.

This works:

/(\d)\1/

This doesn't:

/\1(\d)/

I guess LLR parsing is to blame, but shouldn't the second example first try to FIND a $1, then check to see if there is a \1, and repeat that process moving L to R?

I thought Perl sort of went to and fro trying to do matching. To me, there IS a /\1(\d)/ in the string, since $1 is 2 and there is a \1 = 2 preceding it.

I was a little surprised this didn't work, although I can sort of see why in a way too. In some ways it seems to me that regexes should be *disconnected* from parsing - just answer the question: does this match?

From: Ben Morrow on 14 Apr 2008 14:31

Quoth spydox(a)gmail.com:
> I'm trying to find a repeated number in a string, like 122345 finds
> 22.
>
> This works:
>
> /(\d)\1/
>
> This doesn't:
>
> /\1(\d)/
>
> I guess LLR parsing is to blame, but shouldn't the second example
> first try to FIND a $1 then check to see if there is a \1, and repeat
> that process moving L to R?
>
> I thought Perl sort of went to and fro trying to do matching. To me,
> there IS a /\1(\d)/ in the string since $1 is 2, and there is a \1 = 2
> preceding it.

There are two separate operations here which you are confusing. First perl parses the regex itself, and compiles it into an internal form. Then it matches that regex against the string you provide. The second will backtrack, under some circumstances; the first won't.

Ben
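
The two steps described above can be sketched roughly as follows. This is only an illustration, assuming that qr// is a fair stand-in for the compile step; the variable names are made up for the example and are not from the thread.

```perl
use strict;
use warnings;

my $str = '122345';

# Step 1: compile. qr// turns each pattern into perl's internal form.
# Both patterns compile, because group 1 exists somewhere in each of them.
my $group_first = qr/(\d)\1/;   # capture a digit, then require the same digit again
my $ref_first   = qr/\1(\d)/;   # refer to group 1 before it has captured anything

# Step 2: match. Only now is the string examined, left to right.
my $pair     = $str =~ $group_first ? $& : undef;
my $reversed = $str =~ $ref_first   ? 1  : 0;

print "group first: ", (defined $pair ? $pair : 'no match'), "\n";   # 22
print "reference first: ", ($reversed ? 'match' : 'no match'), "\n"; # no match
```

In the second pattern the backreference is reached at every starting position before the group has captured a digit, so it has nothing to compare against and the attempt fails there. Backtracking cannot help, since there is no earlier choice in the pattern to revisit.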

From: A. Sinan Unur on 14 Apr 2008 14:37

spydox(a)gmail.com wrote in news:093bf887-729d-4400-8750-6c91b21b478e(a)w4g2000prd.googlegroups.com:

> I'm trying to find a repeated number in a string, like 122345
> finds 22.
>
> This works:
>
> /(\d)\1/
>
> This doesn't:
>
> /\1(\d)/
>
> I guess LLR parsing is to blame,
....
> I was a little surprised this didn't work although I can sort of
> see why in a way too. In some ways it seems to me that regexes
> should be *disconnected* from parsing - just answer the question
> does this match?

I don't look at this as a parsing issue. Rather, it is a "the universe must make sense" kind of issue: The first match does not exist before the first match. That makes sense to me. It may not make sense to you.

Sinan

--
A. Sinan Unur <1usa(a)llenroc.ude.invalid>
(remove .invalid and reverse each component for email address)
comp.lang.perl.misc guidelines on the WWW: http://www.rehabitation.com/clpmisc/

From: spydox on 14 Apr 2008 14:51

On Apr 14, 2:31 pm, Ben Morrow <b...(a)morrow.me.uk> wrote:
> Quoth spy...(a)gmail.com:
>
> > I'm trying to find a repeated number in a string, like 122345 finds
> > 22.
> >
> > This works:
> >
> > /(\d)\1/
> >
> > This doesn't:
> >
> > /\1(\d)/
> >
> > I guess LLR parsing is to blame, but shouldn't the second example
> > first try to FIND a $1 then check to see if there is a \1, and repeat
> > that process moving L to R?
> >
> > I thought Perl sort of went to and fro trying to do matching. To me,
> > there IS a /\1(\d)/ in the string since $1 is 2, and there is a \1 = 2
> > preceding it.
>
> There are two separate operations here which you are confusing. First
> perl parses the regex itself, and compiles it into an internal form.
> Then it matches that regex against the string you provide. The second
> will backtrack, under some circumstances; the first won't.
>
> Ben

Understood, and I appreciate the insight. It makes sense.

Yet when all else apparently *fails*, in my experience (and I've heard MJD and others say this), Perl will "do its best" to match. To me, unless it *also* tried backtracking, it gave up too soon.

From: spydox on 14 Apr 2008 14:57

..
> > I guess LLR parsing is to blame,
..
> I don't look at this as a parsing issue. Rather, it is a "the
> universe must make sense" kind of issue: The first match does not
> exist before the first match. That makes sense to me. It may not
> make sense to you.

To me, as in conventional pattern recognition of, say, two tanks next to each other, the system should accept the match whichever way it is described:

find a tank with another identical tank to its left

*or*

find a tank with another identical tank to its right

The system should have no *context-sensitivity* where only one of the two matches. Sure, internally an algorithm may be scanning L to R or R to L or whatever, but the user should not even be concerned with that, at least in this case.

I still think it gave up too soon: it should have tried R to L (backtracking) when L to R failed.

Just IMHO, thank you for your thoughts. This area seems just a bit gray to me. I'd be very interested in Damian's or Mark's thoughts.
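
For what it is worth, the two descriptions above pick out the same pair of characters, so the pattern that works already answers both. A small sketch, using the string from the original question (the names $left and $right are illustrative only):

```perl
use strict;
use warnings;

my $str = '122345';
my ($left, $right);

if ($str =~ /(\d)\1/) {
    $left  = $-[0];       # offset where the match starts: the left digit of the pair
    $right = $-[0] + 1;   # offset of the identical digit immediately after it
    print "digit '$1' at offset $right has an identical digit to its left\n";
    print "digit '$1' at offset $left has an identical digit to its right\n";
}
```

Both statements describe the one match; only the wording differs. Nothing here changes how the engine scans, it just reads the match offsets from @- after the fact.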