From: Eugene Gershnik on
David J. Littleboy wrote:
>> How
>> the Unicode sequence in its editor is converted to this encoding is
>> up to the application but a reasonable user expectation is that what
>> looks like ?
>> should be transmitted as ?.
>
> One man's "reasonable user expectation" is another's unacceptable
> abomination. Just because you can't see a reason for transmitting as
> two characters doesn't mean there isn't one. In particular, there are
> a lot of combining characters in Unicode, most of which can't be
> encoded in "Western European".

And how is it relevant here? We are talking about a character that *is*
present in ISO 8859-1, not any arbitrary character. It has two possible
encodings in Unicode, which both map to the same one in ISO 8859-1. The
normalization forms of Unicode are equivalent, and encoding conversion
should not depend on which one was used as a source. To provide C++
context, correct encoding conversion should look something like

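basic_string<uchar> source = ...;
// normalize first so the result does not depend on the source form
basic_string<uchar> form_c = normalize_unicode(source);
// convert the normalized string, not the raw source
string result = convert(get_encoding("ISO 8859-1"), form_c);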

where uchar is a UTF code unit type of your favorite size, or even
wchar_t if it stores UTF on your platform.
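
To make the normalize-then-convert idea concrete, here is a minimal, self-contained sketch. It is a toy under stated assumptions, not a real implementation: compose_toy and to_latin1 are hypothetical names, the composition table covers only a handful of letters (real Form C needs the full Unicode data, e.g. from a library such as ICU), and it assumes C++17 for std::optional. It relies only on the fact that ISO 8859-1 maps code points U+0000 to U+00FF to the byte of the same value.

```cpp
#include <optional>
#include <string>

// Toy composition table: (base, combining mark) -> precomposed code point.
// ASSUMPTION: only a few Latin-1 letters are covered, for illustration only.
static std::optional<char32_t> compose_pair(char32_t base, char32_t mark) {
    struct Entry { char32_t base, mark, composed; };
    static const Entry table[] = {
        {U'e', 0x0301, 0x00E9},  // e + combining acute     -> U+00E9
        {U'E', 0x0301, 0x00C9},  // E + combining acute     -> U+00C9
        {U'a', 0x0300, 0x00E0},  // a + combining grave     -> U+00E0
        {U'u', 0x0308, 0x00FC},  // u + combining diaeresis -> U+00FC
        {U'n', 0x0303, 0x00F1},  // n + combining tilde     -> U+00F1
    };
    for (const Entry& e : table)
        if (e.base == base && e.mark == mark) return e.composed;
    return std::nullopt;
}

// Hypothetical stand-in for normalize_unicode: merges a base character with
// a directly following combining mark when the toy table knows the pair.
std::u32string compose_toy(const std::u32string& in) {
    std::u32string out;
    for (char32_t c : in) {
        if (!out.empty()) {
            if (auto composed = compose_pair(out.back(), c)) {
                out.back() = *composed;
                continue;
            }
        }
        out.push_back(c);
    }
    return out;
}

// Hypothetical stand-in for convert: composes first, then maps each code
// point to one ISO 8859-1 byte. Returns nullopt if a code point above
// U+00FF remains, i.e. the text is not representable in the target.
std::optional<std::string> to_latin1(const std::u32string& in) {
    std::string out;
    for (char32_t c : compose_toy(in)) {
        if (c > 0xFF) return std::nullopt;
        out.push_back(static_cast<char>(static_cast<unsigned char>(c)));
    }
    return out;
}
```

With this sketch, "cafe" followed by U+0301 and "caf" followed by U+00E9 both produce the same four bytes, while text containing a character outside ISO 8859-1 is reported as unconvertible rather than silently mangled.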


> So there simply isn't any general
> solution to the problem.

There is. See above.

> Again, that's _your_ desire. There are a lot of other users out
> there. Some of us speak an Oriental language or two, and realize that
> all bets are off if you change encodings.

Some of us speak multiple languages too and realize that the above is
wrong. If my Unicode text contains _only characters compatible with the
target encoding_, all bets are not off.

--
Eugene



[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]

From: loufoque on
Pete Becker wrote :

> WCHAR has to be 2 bytes and store UTF-16 in little-endian format,

I fail to see how the Win32 API can handle UTF-16.
It looks like it can only do UCS-2.
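
For context on the UTF-16 versus UCS-2 distinction, here is a small sketch of how UTF-16 represents a code point above U+FFFF as a surrogate pair, which a pure UCS-2 consumer would see as two separate 16-bit values. It says nothing about what the Win32 API actually does; encode_utf16 is a hypothetical helper name, and it assumes the input is a valid Unicode scalar value.

```cpp
#include <string>

// Encode one code point as UTF-16 code units.
// ASSUMPTION: cp is a valid scalar value (not a surrogate, <= U+10FFFF).
std::u16string encode_utf16(char32_t cp) {
    std::u16string out;
    if (cp < 0x10000) {
        // Basic Multilingual Plane: one code unit, same as UCS-2.
        out.push_back(static_cast<char16_t>(cp));
    } else {
        // Supplementary planes: split the 20-bit offset into two halves.
        cp -= 0x10000;
        out.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));    // high surrogate
        out.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));  // low surrogate
    }
    return out;
}
```

Code that treats every 16-bit unit as one character handles the first branch correctly and only goes wrong on the second.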


