From: Dr.Ruud on
John W. Krahn wrote:
> Robbie Hatley wrote:

>> 1. I had a "\" before the "$" to prevent "$_" from being
>> interpolated.
>
> That just adds a '\' character to your character class:
>
> $ perl -le'$x = q{[$_]}; print qr{$x}'
> (?-xism:[$_])
> $ perl -le'$x = q{[\$_]}; print qr{$x}'
> (?-xism:[\$_])

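But it won't match a '\', because inside the character class the
backslash only escapes the '$'. The string has to contain two
backslashes before the class gains a literal one: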

$ perl -wle'$x = q{[\$_]}; print $x; print length($x); print q{\\} =~
/$x/ ? 1 : 0'
[\$_]
5
0

$ perl -wle'$x = q{[\\$_]}; print $x; print length($x); print q{\\} =~
/$x/ ? 1 : 0'
[\$_]
5
0

$ perl -wle'$x = q{[\\\$_]}; print $x; print length($x); print q{\\} =~
/$x/ ? 1 : 0'
[\\$_]
6
1

$ perl -wle'$x = q{[\$_]}; print $x; print length($x); print chr(92) =~
/$x/ ? 1 : 0'
[\$_]
5
0

$ perl -wle'$x = q{[\\$_]}; print $x; print length($x); print chr(92) =~
/$x/ ? 1 : 0'
[\$_]
5
0

$ perl -wle'$x = q{[\\\$_]}; print $x; print length($x); print chr(92)
=~ /$x/ ? 1 : 0'
[\\$_]
6
1

(This was run with perl 5.8.5.)
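
The same check can be put in a small script. This is only a sketch:
class_members() is a made-up helper name, not something from the thread,
and it just reports which of backslash, dollar and underscore each
class string ends up matching.

```perl
use strict;
use warnings;

# Hypothetical helper: list which of backslash, dollar and underscore
# a bracketed character class (given as a plain string) matches.
sub class_members {
    my ($class) = @_;
    my $re = qr/\A$class\z/;    # the string is interpolated, then compiled
    return join '', grep { $_ =~ $re } (chr(92), '$', '_');
}

# The three q{} spellings used in the one-liners above.
for my $class (q{[\$_]}, q{[\\$_]}, q{[\\\$_]}) {
    printf "%-7s length %d matches: %s\n",
        $class, length($class), class_members($class);
}
```

Only the third spelling, which leaves two backslashes in the string,
should list the backslash among the matched characters.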

--
Affijn, Ruud

"Gewoon is een tijger."

From: Ben Bullock on
On Mon, 14 Apr 2008 10:40:57 -0700, Robbie Hatley wrote:

> "Ben Bullock" wrote:

>> ... [a-z0-9-]{3,63} (ignoring case) is enough. Your regex will get
>> things which aren't valid URLs. The following catches anything valid:
>>
>> my $validdns = '[0-9a-z-]{2,63}';
>> m/\b(($validdns\.){1,62}$validdns)\b/i # Catches any valid thing.
>
> I can see that your pattern looks for just the dns part of the url,
> which has fewer valid characters; but since it doesn't look for "/", it
> will convert this string:
>
> references in Sec 35.74 paragraph B
>
> to
>
> references in Sec http://35.74 paragraph B
>
> I believe you're right in that it will find most valid dns strings; but
> it also catches things that aren't part of URLs at all (such as numbers
> with decimal points), and it rejects certain well-formed domain strings
> (such as "j.qbc.net.ca", which fails the "{2,63}" assertion).

Well, OK, but if I were going to do this for real, I would use something like

/\b(($validdns\.){1,62}(com|net|org|us|uk|ca|jp))\b/i

or similar (I haven't checked this regex with the machine yet but
hopefully you get the picture).
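
For what it's worth, here is a rough way to try both patterns against
the strings mentioned above. It is a sketch only: first_match() and the
variable names $any_labels and $known_tld are invented for the
illustration, and the TLD list is just the short one from the pattern.

```perl
use strict;
use warnings;

my $validdns = '[0-9a-z-]{2,63}';

# The two patterns quoted in the thread, compiled once.
my $any_labels = qr/\b(($validdns\.){1,62}$validdns)\b/i;
my $known_tld  = qr/\b(($validdns\.){1,62}(com|net|org|us|uk|ca|jp))\b/i;

# Hypothetical helper: return the first captured host, or undef.
sub first_match {
    my ($re, $text) = @_;
    return $text =~ $re ? $1 : undef;
}

for my $text ('references in Sec 35.74 paragraph B', 'j.qbc.net.ca') {
    for my $pair ([ 'any labels', $any_labels ], [ 'known TLD', $known_tld ]) {
        my $m = first_match($pair->[1], $text);
        printf "%-10s | %-36s | %s\n",
            $pair->[0], $text, defined $m ? $m : '(no match)';
    }
}
```

The first pattern picks up "35.74", while the TLD version leaves that
sentence alone. Neither pattern is anchored, so for "j.qbc.net.ca" both
skip the one-letter label and match "qbc.net.ca" rather than rejecting
the string outright.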

> My pattern at least insists on "stuff.stuff/stuff", so it rejects
> "35.74". It rejects domain-level URLs and only linkifies document-level
> URLs. That may be a blessing or a curse, depending on your
> expectations.

I hadn't really thought this through carefully; I just wanted to make the
point that the &$% stuff is not valid as part of the web address.

> Also, both your pattern and mine are broken in that they match
> http://www.asdf.com/qwer.html, and indeed convert it to
> http://http://www.asdf.com/qwer.html .

Mine doesn't do anything at all; I'm not sure it even compiles!
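
One possible way around the http://http:// problem, offered only as a
sketch: add a negative lookbehind so that a host which already follows
"://" (or sits in the middle of a longer dotted name) is left alone.
The lookbehind and the linkify() helper are my own assumptions, not
something either pattern in this thread does.

```perl
use strict;
use warnings;

my $validdns = '[0-9a-z-]{2,63}';

# Hypothetical helper, not from the thread: prefix bare host names with
# "http://" unless they already sit inside a URL or a longer dotted name.
sub linkify {
    my ($text) = @_;
    # (?<![\w./-]) rejects a match that starts right after a word
    # character, dot, slash or hyphen, e.g. after "http://" or "www.".
    $text =~ s{(?<![\w./-])((?:$validdns\.){1,62}(?:com|net|org|us|uk|ca|jp))\b}
              {http://$1}gi;
    return $text;
}

print linkify($_), "\n" for (
    'http://www.asdf.com/qwer.html',
    'see www.asdf.com/qwer.html',
    'references in Sec 35.74 paragraph B',
);
```

With this, the already-prefixed URL and the "Sec 35.74" sentence should
come back unchanged, and only the bare www.asdf.com gets a prefix. It
still knows nothing about e-mail addresses or other schemes, so treat
it as a starting point.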

From: Dr.Ruud on
Robbie Hatley wrote:

> Today I was editing a URL-linkifying program I wrote several
> weeks ago, and I ran across some issues with q{} and qr{}
> which are puzzling me.

Consider Regexp::Common.

--
Affijn, Ruud

"Gewoon is een tijger."