From: Dr.Ruud on
Ben Bacarisse wrote:
> Ilya Zakharevich:

>> Let me disagree. First, I know of no such thing as utf-8. Second,
>> if you mean utf8
>
> The proper form is UTF-8 (i.e. with caps) so your correction (further
> from the accepted form) seems rather harsh!

Please read

perldoc Encode
perldoc utf8


In a Perl context, 'utf8' is commonly read as the proper subset of
'UTF-8' currently used by Perl.
See also Ilya's news:e1gkrd$2hr$1(a)agate.berkeley.edu

--
Affijn, Ruud

"Gewoon is een tijger."

From: Ben Bacarisse on
On Thu, 13 Apr 2006 15:07:48 +0200, Dr.Ruud wrote:

> Ben Bacarisse wrote:
>> Ilya Zakharevich:
>
>>> Let me disagree. First, I know of no such thing as utf-8. Second,
>>> if you mean utf8
>>
>> The proper form is UTF-8 (i.e. with caps) so your correction (further
>> from the accepted form) seems rather harsh!
>
> Please read
>
> perldoc Encode
> perldoc utf8
>
>
> In a Perl context, 'utf8' is commonly read as the proper subset of
> 'UTF-8' currently used by Perl.

I was rather glib, sorry. It was the (understandably) irritable "I know
of no such thing as utf-8" when the author almost certainly knows about
utf8, utf-8, UTF-8 and their meanings in and out of Perl that caused me to
post too rapidly.

--
Ben.

From: Ilya Zakharevich on
[A complimentary Cc of this posting was sent to
Dr.Ruud
<rvtol+news(a)isolution.nl>], who wrote in article <e1lpgl.1fk.1(a)news.isolution.nl>:
> In a Perl context, 'utf8' is commonly read as the proper subset of
> 'UTF-8' currently used by Perl.

utf8 is a proper SUPERSET of UTF-8. The former is not restricted to
any particular range of non-negative integers; the current
implementation goes 0..0xFFFFFFFFFFFFFFFF (i.e., maximal range of
native unsigned integers currently used in Perl), and there are "free"
bits to extend it to, e.g., 128bit - if Perl is used on architecture
with sizeof(UV) = 128bits.

UTF-8 is "legally" restricted to 0..0x10FFFF (RFC 3629), although
technically the 4-byte form can carry 0..0x1FFFFF, and the original
6-byte scheme 0..0x7FFFFFFF.
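
The technical limits follow from the bit layout of the encoding. A rough
sketch of the arithmetic, in Python rather than Perl for brevity; the
function name max_code_point is made up for illustration and is not part
of any library:

```python
def max_code_point(n_bytes: int) -> int:
    """Largest value an n-byte UTF-8-style sequence can carry (hypothetical helper)."""
    if not 1 <= n_bytes <= 6:
        raise ValueError("the original UTF-8 scheme defines 1 to 6 byte sequences")
    # A 1-byte sequence has 7 payload bits. An n-byte sequence has
    # (7 - n) payload bits in the lead byte plus 6 per continuation byte.
    payload_bits = 7 if n_bytes == 1 else (7 - n_bytes) + 6 * (n_bytes - 1)
    return (1 << payload_bits) - 1


for n in range(1, 7):
    print(f"{n} byte(s): 0..0x{max_code_point(n):X}")
```

Four bytes give 21 payload bits (0x1FFFFF) and six bytes give 31
(0x7FFFFFFF). The standard (RFC 3629) stops earlier, at 0x10FFFF, which
is a rule about permitted code points and not a limit of the layout.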

Hope this helps,
Ilya

From: Dr.Ruud on
Ilya Zakharevich wrote:
> [A complimentary Cc of this posting was sent to
> Dr.Ruud

Please don't do that. This is a newsgroup. Even with mailing lists I
wouldn't do that, unless it is specifically requested somehow.

> rvtol:

>> In a Perl context, 'utf8' is commonly read as the proper subset of
>> 'UTF-8' currently used by Perl.
>
> utf8 is a proper SUPERSET of UTF-8.

Yes, sorry. When I wrote that I had a huge headache, which has just left
together with one of my wisdom teeth.


> The former is not restricted to
> any particular range of non-negative integers; the current
> implementation goes 0..0xFFFFFFFFFFFFFFFF (i.e., maximal range of
> native unsigned integers currently used in Perl), and there are "free"
> bits to extend it to, e.g., 128bit - if Perl is used on architecture
> with sizeof(UV) = 128bits.
>
> UTF-8 is "legally" restricted to 0..0x10FFFF (RFC 3629), although
> technically the 4-byte form can carry 0..0x1FFFFF, and the original
> 6-byte scheme 0..0x7FFFFFFF.

OK, thanks.


--
Affijn, Ruud

"Gewoon is een tijger."