From: Ben Morrow

Quoth Martijn Lievaart <m(a)rtij.nl.invlalid>:
> On Mon, 07 Jun 2010 17:21:07 +0100, Ben Morrow wrote:
> > Quoth Martijn Lievaart <m(a)rtij.nl.invlalid>:
> >> On Sat, 05 Jun 2010 21:33:20 +0100, Ben Morrow wrote:
> >>
> >> > This is a bad plan. Locales (specifically, the 'locale' pragma) and
> >> > Unicode don't play nicely together in Perl, and if you're processing
> >> > international text you will probably end up with Unicode strings. A
> >>
> >> Can you expand on this? What exactly goes wrong (or is unexpected)?
> >
> (snip)
> >
> > Confused yet? :)
>
> That's just plain buggy I would say, or is there some logic I don't see?

No, it's just plain buggy. The bugs have been there since 5.8.0, they
are well known, and the only reason they haven't been fixed yet is
because it's extremely difficult (both to work out what the behaviour
*should* be, and to write the actual code). The problem *is* currently
being worked on (mostly by Karl Williamson), but don't hold your breath.

> Besides, your examples did not work for me completely, to get the same
> regex matching I had to set LANG as well.

You probably had LANG set in your environment already, which I don't.
IIRC LANG overrides LC_ALL.
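
If you want to see which variable is actually in charge on a given box,
here is a rough sketch rather than anything definitive. It assumes the
lookup order POSIX documents (LC_ALL first, then the category's own
variable, then LANG), so the IIRC above is worth double-checking against
what it prints. The helper name deciding_variable is made up for
illustration; setlocale and LC_CTYPE come from the core POSIX module.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use POSIX qw(setlocale LC_CTYPE);

# Hypothetical helper: return the name of the environment variable that
# decides the locale for one category, assuming the POSIX lookup order:
#   LC_ALL, then the category's own variable (e.g. LC_CTYPE), then LANG.
# Unset or empty variables are skipped. Returns undef if none is set.
sub deciding_variable {
    my ($env, $category) = @_;
    for my $name ('LC_ALL', $category, 'LANG') {
        my $value = $env->{$name};
        return $name if defined $value && length $value;
    }
    return;
}

# Show the raw settings and which one wins for LC_CTYPE.
printf "%-8s = %s\n", $_, $ENV{$_} // '(unset)' for qw(LC_ALL LC_CTYPE LANG);
my $winner = deciding_variable(\%ENV, 'LC_CTYPE');
print "deciding variable: ", $winner // '(none set)', "\n";

# Ask the C library what it actually picked. An undef result means the
# requested locale could not be set, e.g. because it is not installed.
my $actual = setlocale(LC_CTYPE, '');
print "setlocale(LC_CTYPE, '') says: ", $actual // '(failed)', "\n";
```

Run it once as-is and once with, say, LC_ALL=C in front of the command,
and compare the last line of output between the two runs.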

Ben

From: Martijn Lievaart
On Mon, 07 Jun 2010 18:27:08 +0100, Ben Morrow wrote:

>> That's just plain buggy I would say, or is there some logic I don't
>> see?
>
> No, it's just plain buggy. The bugs have been there since 5.8.0, they
> are well known, and the only reason they haven't been fixed yet is
> because it's extremely difficult (both to work out what the behaviour
> *should* be, and to write the actual code). The problem *is* currently
> being worked on (mostly by Karl Williamson), but don't hold your breath.

Thx for the info.

/me writes down after "never use threads in Perl", "never use locales in
Perl".

I get the problem, but it does suck.

>
>> Besides, your examples did not work for me completely, to get the same
>> regex matching I had to set LANG as well.
>
> You probably had LANG set in your environment already, which I don't.
> IIRC LANG overrides LC_ALL.
>

Ah, that explains it.

M4