From: Paul Lalli on
Peter J. Holzer wrote:
> On 2006-10-03 12:12, Paul Lalli <mritty(a)gmail.com> wrote:
> > Dr.Ruud wrote:
> >> Ian Wilson schreef:
> >>
> >> > \d matches "0", "1" ... "8" or "9"
> >>
> >> Last time I checked, \d matched 268 different characters. Dear
> >> programmer, if you mean [0-9], then write [0-9].
> >
> > Er. Huh? I realize that \w will match not only 'a'..'z', 'A'..'Z',
> > '0'..'9', and _, and that all the "international" letters such as á
> > and Ñ are included as well, depending on locale. But other than the
> > ten characters Ian implied, what else does \d match?
>
> The digits in all the non-latin scripts. Try:

It absolutely never even occurred to me that other characters would be
considered digits. Like I said, I'm depressingly uninformed about
locales and internationalization. Thanks for the information.

Paul Lalli

From: Ian Wilson on
Dr.Ruud wrote:
> Ian Wilson schreef:
>
>
>>\d matches "0", "1" ... "8" or "9"
>
> Last time I checked, \d matched 268 different characters.

Both the above statements are true :-)
All 268 are characters, all are digits, few are numeric!
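
To make the difference between \d and [0-9] concrete, here is a minimal sketch (not taken from anyone's post in this thread). The three sample characters are my own picks for illustration, written as escapes so the script itself stays pure ASCII:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Illustrative samples only: one ASCII digit and two non-ASCII decimal digits.
my %sample = (
    'DIGIT ONE'              => "1",
    'ARABIC-INDIC DIGIT ONE' => "\x{0661}",
    'DEVANAGARI DIGIT ONE'   => "\x{0967}",
);

for my $name (sort keys %sample) {
    my $ch = $sample{$name};
    # \d uses Unicode semantics on these strings; [0-9] is ASCII only.
    printf "%-24s \\d: %-3s [0-9]: %s\n",
        $name,
        ($ch =~ /\A\d\z/    ? 'yes' : 'no'),
        ($ch =~ /\A[0-9]\z/ ? 'yes' : 'no');
}
```

All three samples should match \d, while only the plain ASCII "1" should match [0-9].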

> Dear programmer, if you mean [0-9], then write [0-9].

No one has really followed up on this in the context set by the OP.

Assuming that some program writes a decimal checksum to a file and that
checksum contains non-ASCII numerals, would Perl arithmetic do the
right thing?

-----------------------8<-----------------------------
#!/usr/bin/perl
#
use warnings;
use strict;

checksum('foo 1234 bar');
checksum("fie \x{0101} fum");                          # U+0101 is a letter, not a digit
checksum("baz \x{0661}\x{0662}\x{0663}\x{0664} qux");  # ARABIC-INDIC DIGIT ONE..FOUR

sub checksum {
    my $text = shift;
    if ($text =~ /(\d+)/) {
        print "$1 + 1 = ", $1+1, "\n";
    } else {
        print "no numbers in '$text' \n";
    }
}
-----------------------8<-----------------------------
$ perl -v
This is perl, v5.8.0 built for i386-linux-thread-multi

$ perl numbers.pl
1234 + 1 = 1235
Wide character in print at numbers.pl line 15.
no numbers in 'fie ? fum'
Argument "\x{661}\x{662}..." isn't numeric in addition (+) at numbers.pl line 13.
Wide character in print at numbers.pl line 13.
???? + 1 = 1

(Actually, the last line looked different before I cut and pasted it; it
ended " + 1 = 1".)

Why doesn't Perl treat any Unicode digit named "XXXX DIGIT NINE" as
numerically equivalent to DIGIT NINE?
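
For what it's worth, Perl's string-to-number conversion only recognizes the ASCII digits, so any conversion has to be done explicitly. Below is a rough sketch of two ways to do that. Both helper names are hypothetical (mine, not from the thread). The first relies on num() from Unicode::UCD, which as far as I know is only available on much newer perls (5.14 or so and later, so not the 5.8.0 shown above). The second is a plain tr/// that covers only the Arabic-Indic digits but should work on older perls too.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Unicode::UCD qw(num);   # assumes a perl new enough to provide num()

# Hypothetical helper: numeric value of a run of decimal digits from one
# script, or undef if num() does not consider the string a safe number.
sub digits_to_number {
    my ($digits) = @_;
    return num($digits);
}

# Hypothetical fallback: map ARABIC-INDIC DIGIT ZERO..NINE onto "0".."9".
# Digits from any other script are left unchanged.
sub arabic_indic_to_ascii {
    my ($digits) = @_;
    (my $ascii = $digits) =~ tr/\x{0660}-\x{0669}/0-9/;
    return $ascii;
}

my $text = "baz \x{0661}\x{0662}\x{0663}\x{0664} qux";
if ($text =~ /(\d+)/) {
    my $n = digits_to_number($1);
    if (defined $n) {
        print "via num(): value + 1 = ", $n + 1, "\n";
    } else {
        print "num() could not convert the matched digits\n";
    }
    print "via tr///: value + 1 = ", arabic_indic_to_ascii($1) + 1, "\n";
}
```

Only ASCII is printed here, so the "Wide character" warnings from the original script do not come up. Whether silently accepting such digits in a checksum file is a good idea is a separate question; writing [0-9] in the regex remains the simpler way to reject them.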