From: Dr.Ruud on 14 Apr 2008 18:33

John W. Krahn wrote:
> Robbie Hatley wrote:

>> 1. I had a "\" before the "$" to prevent "$_" from being
>> interpolated.
>
> That just adds a '\' character to your character class:
>
> $ perl -le'$x = q{[$_]}; print qr{$x}'
> (?-xism:[$_])
> $ perl -le'$x = q{[\$_]}; print qr{$x}'
> (?-xism:[\$_])

But it won't match a '\':

$ perl -wle'$x = q{[\$_]}; print $x; print length($x); print q{\\} =~ /$x/ ? 1 : 0'
[\$_]
5
0

$ perl -wle'$x = q{[\\$_]}; print $x; print length($x); print q{\\} =~ /$x/ ? 1 : 0'
[\$_]
5
0

$ perl -wle'$x = q{[\\\$_]}; print $x; print length($x); print q{\\} =~ /$x/ ? 1 : 0'
[\\$_]
6
1

$ perl -wle'$x = q{[\$_]}; print $x; print length($x); print chr(92) =~ /$x/ ? 1 : 0'
[\$_]
5
0

$ perl -wle'$x = q{[\\$_]}; print $x; print length($x); print chr(92) =~ /$x/ ? 1 : 0'
[\$_]
5
0

$ perl -wle'$x = q{[\\\$_]}; print $x; print length($x); print chr(92) =~ /$x/ ? 1 : 0'
[\\$_]
6
1

(was run with a perl 5.8.5)

-- 
Affijn, Ruud

"Gewoon is een tijger."
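The one-liners above can be folded into a small script that shows the same thing in one place. This is only a sketch: the helper name class_matches is made up for illustration and is not from the thread. It relies on two rules: inside q{} a doubled backslash collapses to a single one while a backslash before "$" is kept as-is, and inside a regex character class "\$" is just an escaped dollar sign.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): returns 1 if the character
# class held in $pattern matches the single character $char, else 0.
sub class_matches {
    my ($pattern, $char) = @_;
    return $char =~ /$pattern/ ? 1 : 0;
}

my $backslash = chr(92);

# q{} does not interpolate $_. It turns "\\" into "\" and leaves "\$" alone,
# so the first two strings are identical (5 characters) and only the third
# one carries two backslashes into the regex.
for my $pattern ( q{[\$_]}, q{[\\$_]}, q{[\\\$_]} ) {
    printf "%-8s length=%d backslash=%d dollar=%d underscore=%d\n",
        $pattern, length($pattern),
        class_matches($pattern, $backslash),
        class_matches($pattern, '$'),
        class_matches($pattern, '_');
}
```

All three classes match "$" and "_"; only the six-character one also matches a backslash, because the regex engine needs "\\" inside the class to mean a literal backslash.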

From: Ben Bullock on 14 Apr 2008 19:34

On Mon, 14 Apr 2008 10:40:57 -0700, Robbie Hatley wrote:

> "Ben Bullock" wrote:
>> ... [a-z0-9-]{3,63} (ignoring case) is enough. Your regex will get
>> things which aren't valid URLs. The following catches anything valid:
>>
>> my $validdns = '[0-9a-z-]{2,63}';
>> m/\b(($validdns\.){1,62}$validdns)\b/i # Catches any valid thing.
>
> I can see that your pattern looks for just the dns part of the url,
> which has fewer valid characters; but since it doesn't look for "/", it
> will convert this string:
>
> references in Sec 35.74 paragraph B
>
> to
>
> references in Sec http://35.74 paragraph B
>
> I believe you're right in that it will find most valid dns strings; but
> it also catches things that aren't part of URLs at all (such as numbers
> with decimal points), and it rejects certain well-formed domain strings
> (such as "j.qbc.net.ca", which fails the "{2,63}" assertion).

Well OK but if I was going to do this for real, I would use something like

/\b(($validdns\.){1,62}(com|net|org|us|uk|ca|jp))\b/i

or similar (I haven't checked this regex with the machine yet but
hopefully you get the picture).

> My pattern at least insists on "stuff.stuff/stuff", so it rejects
> "35.74". It rejects domain-level URLs and only linkifies document-level
> URLs. That may be a blessing or a curse, depending on your
> expectations.

I hadn't really thought this through carefully, I just wanted to make the
point that the &$% stuff is not valid as part of the web address.

> Also, both your pattern and mine are broken in that they match
> http://www.asdf.com/qwer.html, and indeed convert it to
> http://http://www.asdf.com/qwer.html .

Mine doesn't do anything at all, I'm not sure it even compiles!
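For what it's worth, here is one way the TLD-list idea could be turned into something that runs. It is a rough sketch, not the regex from either post. The sub name linkify is hypothetical, the label length is loosened to {1,63} on the assumption that single-letter labels like the "j" in "j.qbc.net.ca" should pass, and a lookbehind is added (also an assumption) so a host name that already follows "http://" is left alone.

```perl
use strict;
use warnings;

# Assumption: labels of 1 to 63 characters, so "j.qbc.net.ca" is accepted.
my $label = qr/[0-9a-z-]{1,63}/i;

# The short TLD list from the post; a real list would be much longer.
my $tld = qr/(?:com|net|org|us|uk|ca|jp)/i;

# Hypothetical helper: prefix bare host names with "http://".
# The lookbehind refuses to start a match right after a word character,
# ".", "/", ":" or "-", which skips names already inside a URL.
sub linkify {
    my ($text) = @_;
    $text =~ s{(?<![\w./:-])((?:$label\.){1,62}$tld)\b}{http://$1}g;
    return $text;
}

print linkify($_), "\n" for (
    'references in Sec 35.74 paragraph B',
    'see j.qbc.net.ca for details',
    'already linked: http://www.asdf.com/qwer.html',
);
```

Requiring a known TLD at the end is what keeps "35.74" from being touched, and the lookbehind is what avoids the "http://http://" doubling. It still only handles the host part, so whether the path after the host belongs to the link is left to the caller.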

From: Dr.Ruud on 15 Apr 2008 16:53

Robbie Hatley wrote:

> Today I was editing a URL-linkifying program I wrote several
> weeks ago, and I ran across some issues with q{} and qr{}
> which are puzzling me.

Consider Regexp::Common.

-- 
Affijn, Ruud

"Gewoon is een tijger."