From: Anthony Ss on
Hi,

Today I came across an issue with a customer's custom report, which was out
by 1 char over 40 or so lines. At first I thought I had incorrectly
limited the field length; however, the problem is only present where
there is a '£' char.

For example:
"1234".length => 4
"1234£".length => 6 (Expect 5)
"1234£6".length => 7 (Expect 6)

Tested on:
ruby 1.8.7 (2009-06-12 patchlevel 174) [i486-linux]
and:
ruby 1.8.6 (2009-03-31 patchlevel 368) [x86_64-linux]

I could not find anything on Google covering this (perhaps my Google-fu
needs work), which brought me here.

Is this expected functionality in ruby? It does not seem right in my
mind.

Thanks. :)
--
Posted via http://www.ruby-forum.com/.

From: botp on
On Fri, Jun 4, 2010 at 6:19 PM, Anthony Ss
<anthony(a)anthonystenhouse.co.uk> wrote:
> For example:
>  "1234".length => 4
>  "1234£".length => 6 (Expect 5)
>  "1234£6".length => 7 (Expect 6)

can't help you there, but fyi

> RUBY_VERSION
=> "1.9.2"
> "1234".length
=> 4
> "1234£".length
=> 5
> "1234£6".length
=> 6

kind regards -botp

From: Brian Candler on
Anthony Stenhouse wrote:
> Hi,
>
> Today I came across an issue with a customer's custom report, which was out
> by 1 char over 40 or so lines. At first I thought I had incorrectly
> limited the field length; however, the problem is only present where
> there is a '£' char.
>
> For example:
> "1234".length => 4
> "1234£".length => 6 (Expect 5)
> "1234£6".length => 7 (Expect 6)

In UTF-8, "£" is two bytes, and ruby 1.8 gives you the number of bytes.
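
To make the byte/character distinction concrete, here is a small sketch. It assumes a UTF-8 string and Ruby 1.9 or later for the "\u00A3" escape (the pound sign); the helper name char_count is made up for illustration. The unpack("U*") call decodes UTF-8 code points, so it should also give a character count on 1.8, where length counts bytes.

```ruby
# Assumption: Ruby 1.9+ (needed for the "\u00A3" escape, i.e. the pound sign).
s = "1234\u00A3"

byte_count = s.bytesize    # bytes in the UTF-8 encoding; this is what 1.8's length reports
char_count_19 = s.length   # characters on 1.9 and later

# Hypothetical helper: count characters by decoding UTF-8 code points.
# Assumes the input is valid UTF-8; malformed input raises ArgumentError.
def char_count(str)
  str.unpack("U*").length
end

# The pound sign is the two bytes 0xC2 0xA3, shown as "\302\243" in 1.8's inspect.
pound_bytes = "\u00A3".unpack("C*")

puts "bytes: #{byte_count}, chars: #{char_count(s)}"
```

The extra byte per '£' is exactly the "out by 1 char" seen in the report.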

If you want to capture (say) the first 6 characters of the string, try
this:

>> a = "1234£6789"
=> "1234\302\2436789"
>> a =~ /\A(.{6})/u
=> 0
>> puts $1
1234£6
=> nil

This may be sufficient for simple wrapping functions. Or look at the
Iconv library.
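
As a rough sketch of such a function, the regexp approach above can be wrapped into a helper that clips a string to a number of characters and pads it with spaces, which is the usual need for a fixed-width report column. The name fit_to_width is hypothetical, and the sketch assumes valid UTF-8 input and Ruby 1.9 or later for the "\u00A3" escape in the usage lines.

```ruby
# Hypothetical helper: clip str to at most `width` characters, then pad with
# spaces so the result is exactly `width` characters wide.
# Assumes str is valid UTF-8.
def fit_to_width(str, width)
  # /u makes "." match whole UTF-8 characters; /m lets it match newlines too.
  clipped = str[/\A.{0,#{width}}/mu]
  # Count characters, not bytes, when working out the padding.
  padding = width - clipped.unpack("U*").length
  clipped + " " * padding
end

puts "[" + fit_to_width("1234\u00A36789", 6) + "]"
puts "[" + fit_to_width("12\u00A3", 5) + "]"
```

Counting the padding in characters rather than bytes is what keeps the columns aligned when a '£' appears in the field.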

> Is this expected functionality in ruby? It does not seem right in my
> mind.

ruby 1.9 works in characters. It brings with it enormous complexity,
pitfalls and inconsistencies. Pick your poison :-)

From: MrZombie on
On 2010-06-04 06:19:09 -0400, Anthony Ss said:

> Hi,
>
> Today I came across an issue with a customer's custom report, which was out
> by 1 char over 40 or so lines. At first I thought I had incorrectly
> limited the field length; however, the problem is only present where
> there is a '£' char.
>
> Is this expected functionality in ruby? It does not seem right in my
> mind.
>
> Thanks. :)

Do yourself a favor, friend, and read this excellent article on
Character Encoding:

http://www.joelonsoftware.com/articles/Unicode.html

Then, find out if the character encoding for your file and your
interpreter is the same. :P
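
For what it's worth, here is a minimal sketch of how one might check that on Ruby 1.9 or later, where strings carry an encoding (on 1.8 the closest thing is the $KCODE setting). The variable names are only for illustration.

```ruby
# Assumption: Ruby 1.9+, where every String carries an Encoding.
puts "source encoding:  #{__ENCODING__}"
puts "default external: #{Encoding.default_external}"

s = "1234\u00A3"             # "\u" escapes always produce a UTF-8 string
utf8_ok = s.valid_encoding?  # true when the bytes are well-formed UTF-8

# The same text in ISO-8859-1 stores the pound sign as the single byte 0xA3,
# so the byte count and the character count agree there.
latin1 = s.encode("ISO-8859-1")

puts "#{s.encoding}: #{s.bytesize} bytes, #{s.length} chars"
puts "#{latin1.encoding}: #{latin1.bytesize} bytes, #{latin1.length} chars"
```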
--
Thank you for your brain.
-MrZombie

From: botp on
On Fri, Jun 4, 2010 at 6:47 PM, Brian Candler <b.candler(a)pobox.com> wrote:
> If you want to capture (say) the first 6 characters of the string, try
>>> a = "1234£6789"
> => "1234\302\2436789"
>>> a =~ /\A(.{6})/u
> => 0
>>> puts $1
> 1234£6
> => nil

>"1234£6789"[0..5]
=> "1234£6"
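
A short sketch of the difference, assuming Ruby 1.9.3 or later (byteslice is not available before that): indexing works in characters, while byteslice cuts on bytes the way 1.8's indexing did, and can split the '£' in half.

```ruby
# Assumption: Ruby 1.9.3+ for String#byteslice.
s = "1234\u00A36789"          # "\u00A3" is the pound sign

by_chars = s[0..5]            # first 6 characters
by_bytes = s.byteslice(0, 6)  # first 6 bytes: "1234" plus both bytes of the pound sign
broken   = s.byteslice(0, 5)  # 5 bytes: cuts the pound sign in half

puts by_chars
puts by_bytes
puts "5-byte slice is valid UTF-8? #{broken.valid_encoding?}"
```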

> ruby 1.9 works in characters. It brings with it enormous complexity,
> pitfalls and inconsistencies. Pick your poison :-)

All programmers are optimists. -Frederick Brooks, Jr.

;-)

kind regards -botp