From: Nisse Engström on
On Fri, 28 May 2010 16:52:09 -0400, tedd wrote:

> At 8:52 PM +0200 5/28/10, Nisse Engström wrote:
>>On Fri, 28 May 2010 11:13:35 -0400, tedd wrote:
>>
>>> As is my understanding, UTF-8 will accommodate all the languages
>>> (glyphs) of the world and then some. It will be a while before we
>>> need UTF-16 or UTF-32 but those are just larger super-sets.

Again:

>>The theoretical limits are:
>>
>> UTF-8 [0 - 7fffffff]
>> UTF-16 [0 - 10ffff]
>> UTF-32 [0 - ffffffff]

In what way are UTF-16 and -32 super-sets of UTF-8?

>>Also, there are many, many, *many* more glyphs than
>>characters (code points) in the world. As an example,
>>www.fonts.com lists 165,125 fonts. Every one has a
>>*different* glyph for the character "A"...

> As you say, UTF-8 has a range of 0 to 7FFFFFFF

No, I said that's the theoretical range. It is restricted
to [0-10ffff] according to current specifications.
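
To make the ranges concrete, here is a minimal sketch in Python
(chosen only for illustration; encoded_sizes and is_valid_code_point
are hypothetical helper names, not anything from the standard). It
suggests that the three encoding forms cover the same code points and
differ only in how many bytes they spend on each, and that nothing
above 10ffff is accepted:

```python
# Illustrative sketch: the same code points stored in three encoding forms.

def encoded_sizes(cp):
    """Return the number of bytes each encoding form uses for code point cp."""
    ch = chr(cp)
    return {
        "utf-8": len(ch.encode("utf-8")),
        "utf-16": len(ch.encode("utf-16-be")),  # -be avoids counting a BOM
        "utf-32": len(ch.encode("utf-32-be")),
    }

def is_valid_code_point(cp):
    """True if cp lies inside the range Python accepts, [0, 0x10FFFF]."""
    try:
        chr(cp)
    except ValueError:
        return False
    return True

for cp in (0x41, 0x262F, 0x9E21, 0x10FFFF):
    print(f"U+{cp:04X}", encoded_sizes(cp))

print(is_valid_code_point(0x10FFFF))  # True
print(is_valid_code_point(0x110000))  # False
```

None of the three is a super-set of the others: each can represent
every code point in [0-10ffff], just with a different byte layout.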

> If you spend some time looking at the numerous char sets that Unicode
> offers you will see that just about every symbol known to man has
> been cataloged

Yes. (Except those that are missing).

> every language in the world and glyph known to man has been
> included -- a truly massive project.

No. There are no glyphs in Unicode. This is spelled out for
you in chapter 2, figure 2-2. "Characters versus Glyphs".


/Nisse
From: tedd on
At 7:15 AM +0200 5/29/10, Nisse Engström wrote:
>
>No. There are no glyphs in Unicode. This is spelled out for
>you in chapter 2, figure 2-2. "Characters versus Glyphs".

*blink* *blink* *blink*

I read it, but that's not addressing the issue here -- that's
something different.

You are not understanding the difference between characters, fonts,
glyphs, and code points.

Here are some definitions taken directly from a Unicode Standard that
might help:

-- quote

Character. The smallest component of written language that has
semantic value; refers to the abstract meaning and/or shape, rather
than a specific shape (see also glyph), though in code tables some
form of visual representation is essential for the reader's understanding.

Font. A collection of glyphs used for the visual depiction of
character data. A font is often associated with a set of parameters
(for example, size, posture, weight, and serifness), which, when set
to particular values, generate a collection of imagable glyphs.

Glyph. (1) An abstract form that represents one or more glyph images.
(2) A synonym for "glyph image". In displaying Unicode character
data, one or more glyphs may be selected to depict a particular
character. These glyphs are selected by a rendering engine during
composition and layout processing.

-- unquote

As such, you cannot claim "There are no glyphs in Unicode" for that is silly.

Code points are simply unique numbers assigned to specific characters
in an approved char set. To better understand which character is
represented a representative Glyph is used -- what else would we use,
a chicken?

I may have been liberal in my use of the term "Glyph" in my previous
brief email, but "Glyph" in Unicode has a special meaning. The Glyph
'A' is 'A' regardless of whether it is Helvetica or Times, bold or
italic, 12pt or 24pt. Likewise, the Yin-Yang symbol is a Glyph
that has a single code point regardless of whether it is red and black
or green and blue. But the point is -- there is a unique code
point (0041 HEX) for the Latin 'A' Glyph and one unique code point
(262F HEX) for the Miscellaneous Symbols Yin-Yang Glyph -- WITH -- a
representative Glyph in the Unicode table defining each code point!
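
As a rough illustration of the two code points named above, here is a
small Python sketch (Python and the helper name describe are
assumptions made for this example only). It looks up the character
name attached to each code point; the result carries no font, weight,
size, or color information:

```python
import unicodedata

def describe(cp):
    """Format a code point as 'U+XXXX NAME' using the Unicode character database."""
    return f"U+{cp:04X} {unicodedata.name(chr(cp))}"

print(describe(0x0041))  # U+0041 LATIN CAPITAL LETTER A
print(describe(0x262F))  # U+262F YIN YANG

# The number is all that is stored; how it is drawn is left to the font.
print(ord("A") == 0x41)  # True
```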

So, when I say that just about every Glyph in the world has been
provided a code point I am basically and technically correct --
excepting of course those glyphs that are not considered appropriate
for inclusion or are variation glyphs of the representative Glyph
that is already included -- understand?

After all is said and done, what is Unicode all about? It is
assigning a universal and unique code point system to Glyphs that are
considered to be appropriate representative members of abstract
written forms of communication. But of course those are Glyphs for
what else could they be?

Cheers,

tedd

--
-------
http://sperling.com http://ancientstones.com http://earthstones.com
From: Nisse Engström on
On Sat, 29 May 2010 10:16:39 -0400, tedd wrote:

> At 7:15 AM +0200 5/29/10, Nisse Engström wrote:
>>
>>No. There are no glyphs in Unicode. This is spelled out for
>>you in chapter 2, figure 2-2. "Characters versus Glyphs".

> Code points are simply unique numbers assigned to specific characters
> in an approved char set. To better understand which character is
> represented a representative Glyph is used -- what else would we use,

Right. I should have phrased that differently.

> a chicken?

U+9e21 ? U+540D ?


/Nisse
From: tedd on
At 10:20 PM +0200 5/29/10, Nisse Engström wrote:
>On Sat, 29 May 2010 10:16:39 -0400, tedd wrote:
>
>> At 7:15 AM +0200 5/29/10, Nisse Engström wrote:
>>>
>>>No. There are no glyphs in Unicode. This is spelled out for
>>>you in chapter 2, figure 2-2. "Characters versus Glyphs".
>
>> Code points are simply unique numbers assigned to specific characters
>> in an approved char set. To better understand which character is
>> represented a representative Glyph is used -- what else would we use,
>
>Right. I should have phrased that differently.
>
>> a chicken?
>
>U+9e21 ? U+540D ?

LOL

I forgot that the word chicken appears in several other languages as
a single character. Interesting to note that in the Chinese
Dictionary, the character "U+9e21" Chicken (ji) is interchangeable
with prostitution.

Cheers,

tedd

--
-------
http://sperling.com http://ancientstones.com http://earthstones.com
From: "Angus Mann" on
Dear Sir/Madam

Please unsubscribe Angus Mann angusmann(a)pobox.com from your database. My
husband passed away 6 May 2010.

Thank you
Sonya Mann

