From: Ammar Ali on

On Sat, Jul 17, 2010 at 3:03 PM, David A. Black <dblack(a)rubypal.com> wrote:

> James's text file has some non-printing (Word-derived?) characters,
> instead of regular spaces:
>

Those are non-breaking spaces (U+00A0, UTF-8 bytes 0xC2 0xA0) that should be
treated as \W.
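
A quick way to check this, as a minimal sketch assuming Ruby 1.9 or later.
The sample string and the helper names (hex_bytes, words) are made up for
illustration, since James's actual file isn't included in this thread:

```ruby
# Hypothetical sample line; not taken from the original file.
NBSP   = "\u00A0"                      # NO-BREAK SPACE
SAMPLE = "first#{NBSP}second third"

# Show the raw bytes of a string as hex pairs.
def hex_bytes(str)
  str.bytes.map { |b| format("%02X", b) }
end

# Split on runs of non-word characters. On a UTF-8 string the no-break
# space is a single character that \W matches, like an ordinary space.
def words(str)
  str.split(/\W+/)
end

p hex_bytes(NBSP)          # ["C2", "A0"]
p words(SAMPLE)            # ["first", "second", "third"]
p NBSP =~ /\s/             # nil, because \s only covers ASCII whitespace
p NBSP =~ /[[:space:]]/    # 0, the POSIX class does include U+00A0
```

Note that \s does not match the character, so code that splits on \s+
rather than \W+ would keep the two words glued together.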

> What's odd is that when I try to scan these lines, I get different
> results depending on whether I'm on the command line or in TextMate.
>

I thought the CRLF line endings might have something to do with it, but the
result was the same. Another clue: with 1.9.1-p378, the result from TextMate
was correct, identical to that of the command line.

Ammar

From: David A. Black on
Hi --

On Sat, 17 Jul 2010, Ammar Ali wrote:

> On Sat, Jul 17, 2010 at 3:03 PM, David A. Black <dblack(a)rubypal.com> wrote:
>
>> James's text file has some non-printing (Word-derived?) characters,
>> instead of regular spaces:
>>
>
> Those are non-breaking spaces (U+00A0, UTF-8 bytes 0xC2 0xA0) that should be
> treated as \W.
>
>> What's odd is that when I try to scan these lines, I get different
>> results depending on whether I'm on the command line or in TextMate.
>>
>
> I thought the CRLF line endings might have something to do with it, but the
> result was the same. Another clue: with 1.9.1-p378, the result from TextMate
> was correct, identical to that of the command line.

Thanks for checking. It turns out to be an encoding thing: TextMate
invokes Ruby with -KU. Without the -KU (which involved editing an
underlying script file, as well as the Bundle Editor entry, but then
again I'm not a big TextMate bundle expert), it ran the same as the
unadorned command line.
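
For the archives, here is a rough sketch of why the encoding setting can
change a scan. On 1.8, -KU sets $KCODE to UTF-8, so the regex engine reads
multibyte sequences as characters rather than as separate bytes. The code
below does not reproduce the TextMate setup; it assumes Ruby 1.9 or later
and imitates the byte-wise view by re-tagging the same bytes as BINARY. The
sample string and method names are hypothetical:

```ruby
# Hypothetical sample; a no-break space sits between the two words.
def sample_utf8
  "first\u00A0second"
end

# Same bytes, tagged as raw binary so the regex engine sees single bytes.
def sample_bytes
  sample_utf8.dup.force_encoding(Encoding::BINARY)
end

# Count how many times \W matches in the string.
def nonword_count(str)
  str.scan(/\W/).length
end

p nonword_count(sample_utf8)    # 1 (one character, U+00A0)
p nonword_count(sample_bytes)   # 2 (two bytes, 0xC2 and 0xA0)
```

The bytes are identical in both cases; only the interpretation differs,
which is enough to make the same scan return different results.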


David

--
David A. Black, Senior Developer, Cyrus Innovation Inc.

The Compleat Rubyist: Ruby training with Black/Brown/McAnally
Philadelphia, PA, October 1-2, 2010
http://www.compleatrubyist.com