From: Jeff Johnson on
"Arne Vajh�j" <arne(a)vajhoej.dk> wrote in message
news:4ba2d0c2$0$279$14726298(a)news.sunsite.dk...

> I would not consider Unicode an encoding.

Uh, why? An encoding is simply a means of associating a set of bytes with
the characters they represent. That's what Unicode does.


From: Harlan Messinger on
Jeff Johnson wrote:
> "Arne Vajh�j" <arne(a)vajhoej.dk> wrote in message
> news:4ba2d0c2$0$279$14726298(a)news.sunsite.dk...
>
>> I would not consider Unicode an encoding.
>
> Uh, why? An encoding is simply a means of associating a set of bytes with
> the characters they represent. That's what Unicode does.

It isn't an encoding in the binary sense because it only assigns
characters to numbers; it doesn't specify a representation. It doesn't
specify, for example, whether "A" should be represented as 41 or 0041 or
00000041 (or something else), or whether an em-dash would be 2014 or
002014 or 00002014 (or something else).
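
A minimal C# sketch of Harlan's point (a plain .NET console program, assumed here only because the thread is about .NET; the class name is just a placeholder): a character's code point is a number taken from the Unicode charts, and nothing below says anything about bytes.

using System;

class CodePoints
{
    static void Main()
    {
        // The code point comes from the Unicode character database,
        // independent of any byte representation.
        Console.WriteLine("U+{0:X4}", (int)'A');      // U+0041
        Console.WriteLine("U+{0:X4}", (int)'\u2014'); // U+2014 (em-dash)
    }
}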
From: Peter Duniho on
Jeff Johnson wrote:
> "Arne Vajh�j" <arne(a)vajhoej.dk> wrote in message
> news:4ba2d0c2$0$279$14726298(a)news.sunsite.dk...
>
>> I would not consider Unicode an encoding.
>
> Uh, why? An encoding is simply a means of associating a set of bytes with
> the characters they represent. That's what Unicode does.

I believe Arne's point is that "Unicode" by itself does not describe a
way to encode characters as bytes. There are specific encodings within
Unicode (as part of the standard): UTF-8, UTF-16, and UTF-32. But
Unicode by itself describes a collection of valid characters, not how
they are encoded as bytes.

Pete
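
As a rough illustration of Pete's distinction, using .NET's System.Text.Encoding classes (the class name below is just a placeholder): the same two-character string becomes three different byte sequences under UTF-8, UTF-16 (little-endian), and UTF-32.

using System;
using System.Text;

class UtfVariants
{
    static void Main()
    {
        string s = "A\u2014"; // 'A' (U+0041) followed by an em-dash (U+2014)

        // Same characters, different bytes, depending on the encoding chosen.
        Console.WriteLine(BitConverter.ToString(Encoding.UTF8.GetBytes(s)));    // 41-E2-80-94
        Console.WriteLine(BitConverter.ToString(Encoding.Unicode.GetBytes(s))); // 41-00-14-20
        Console.WriteLine(BitConverter.ToString(Encoding.UTF32.GetBytes(s)));   // 41-00-00-00-14-20-00-00
    }
}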
From: Jeff Johnson on
"Peter Duniho" <no.peted.spam(a)no.nwlink.spam.com> wrote in message
news:uIZGlx3xKHA.5364(a)TK2MSFTNGP05.phx.gbl...

>>> I would not consider Unicode an encoding.
>>
>> Uh, why? An encoding is simply a means of associating a set of bytes with
>> the characters they represent. That's what Unicode does.
>
> I believe Arne's point is that "Unicode" by itself does not describe a way
> to encode characters as bytes. There are specific encodings within
> Unicode (as part of the standard): UTF-8, UTF-16, and UTF-32. But Unicode
> by itself describes a collection of valid characters, not how they are
> encoded as bytes.

Ah. I just go with the convention that "Unicode" by itself, at least in the
.NET world, means UTF-16LE.
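
A small sketch of that convention, again with System.Text.Encoding (class name is a placeholder): in .NET, Encoding.Unicode is specifically UTF-16 little-endian, and Encoding.BigEndianUnicode is the other byte order.

using System;
using System.Text;

class DotNetUnicode
{
    static void Main()
    {
        // "Unicode" in the .NET class library means UTF-16LE.
        Console.WriteLine(BitConverter.ToString(Encoding.Unicode.GetBytes("A")));          // 41-00
        Console.WriteLine(BitConverter.ToString(Encoding.BigEndianUnicode.GetBytes("A"))); // 00-41
        Console.WriteLine(Encoding.Unicode.WebName);                                       // utf-16
    }
}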


From: Arne Vajhøj on
On 19-03-2010 09:16, Jeff Johnson wrote:
> "Arne Vajh�j"<arne(a)vajhoej.dk> wrote in message
> news:4ba2d0c2$0$279$14726298(a)news.sunsite.dk...
>> I would not consider Unicode an encoding.
>
> Uh, why? An encoding is simply a means of associating a set of bytes with
> the characters they represent. That's what Unicode does.

No.

Unicode is a mapping between the various symbols and numbers.

An encoding is the mapping between such a number and 1-many bytes.

Arne
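
A last sketch of the "1-many bytes" part of that description, taking UTF-8 as the example encoding (class and variable names are placeholders): each code point is a single number, but the encoding may spend one to four bytes on it.

using System;
using System.Text;

class OneToManyBytes
{
    static void Main()
    {
        // 'A', 'é', an em-dash, and an emoji: one code point each.
        string[] samples = { "A", "\u00E9", "\u2014", "\U0001F600" };

        foreach (string s in samples)
        {
            int codePoint = char.ConvertToUtf32(s, 0);  // step 1: symbol -> number (Unicode)
            byte[] bytes  = Encoding.UTF8.GetBytes(s);  // step 2: number -> bytes (encoding)
            Console.WriteLine("U+{0:X4} -> {1} byte(s) in UTF-8", codePoint, bytes.Length);
        }
    }
}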