From: Timothy Madden on
tf wrote:
> Timothy Madden wrote:
>> I need to write some wrapper classes around a library that my client has,
>> and the error messages (and all the other strings in the library) are in
>> UTF-8. Can I somehow create an exception class derived from
>> std::exception
>> (std::runtime_error) that could carry such messages ?
>
> Don't worry too much about the what() message. It's nice to have a
> message that a programmer stands a chance of figuring out, but you're
> very unlikely to be able to compose a relevant and user-comprehensible
> error message at the point an exception is thrown. Certainly,
> internationalization is beyond the scope of the exception class
> author. Peter Dimov makes an excellent argument that the proper use
> of what() string is to serve as a key into a table of error message
> formatters. Now if only we could get standardized what() strings for
> exceptions thrown by the standard library...
>
> -- http://www.boost.org/community/error_handling.html
>

The what() message is the means to carry the error string from my
library to any outer context that contains the catch() clause. And
without the outer context having to know my specific exception class.

So I just find it necessary.

Thank you,
Timothy Madden

--
[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]

From: Timothy Madden on
Goran wrote:
> On Jul 18, 12:28 am, Timothy Madden <terminato...(a)gmail.com> wrote:
>> Hello
>>
>> I need to write some wrapper classes around a library that my client has,
>> and the error messages (and all the other strings in the library) are in
>> UTF-8. Can I somehow create an exception class derived from std::exception
>> (std::runtime_error) that could carry such messages ?
>>
>> I mean the message returned by std::exception::what() is assumed to be in the
>> application locale, and I cannot just set the application locale to UTF-8.
>
> If the standard library and other libraries you use aren't localized,
> then they are most likely in English, and that's OK for plain UTF-8.
> So when you output what() to something UTF-8 aware, it's OK.
>
> If they are localized, and are using specific locale (not UTF-8),
> whoops! How about some simple mix-in derivation, e.g.:
>
> class utf8_error
> {
> public:
>     virtual ~utf8_error() {}
>     // Returns the message encoded as UTF-8.
>     virtual const char* what_utf8() const = 0;
> };
>
> then,
>
> class my_error : public runtime_error, public utf8_error
> {
> // Implement what and what_utf8
> };
>
> and finally, in your catch handlers, use:
>
> string utf8_ed_what(const exception& e)
> {
> const utf8_error* utf8 = dynamic_cast<const utf8_error*>(&e);
> if (utf8)
> return utf8->what_utf8();
> else
> return locale_text_to_utf8(e.what());
> }
>
> BTW, application locale is assumed? How? (Honest question).


Yes, maybe this would currently be the only practical work-around to
this rather theoretical problem.

The thing is that I, like other programmers, am not too fond of
dynamic_cast and run-time type identification.

So what I did was to just put the UTF-8 string in the std::exception,
and have my error reporting function, invoked from catch(), always
decode the string as UTF-8. Essentially I am just hoping that the
standard library and other libraries use only 7-bit ASCII what()
messages in exceptions, which are compatible with UTF-8.
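
A minimal sketch of how that hope could be checked at the catch site rather than taken on trust. The helper names is_ascii and looks_like_utf8 are made up for illustration, and the UTF-8 check is only structural: it does not reject overlong encodings or surrogates.

```cpp
#include <cstddef>
#include <string>

// True when every byte is 7-bit ASCII, so the text reads the same in
// UTF-8 and in any ASCII-compatible locale charset.
bool is_ascii(const std::string& s)
{
    for (std::size_t i = 0; i < s.size(); ++i)
        if (static_cast<unsigned char>(s[i]) > 0x7F)
            return false;
    return true;
}

// True when s is structurally well-formed UTF-8: each lead byte is
// followed by the right number of continuation bytes (10xxxxxx).
bool looks_like_utf8(const std::string& s)
{
    std::size_t i = 0;
    while (i < s.size()) {
        unsigned char c = static_cast<unsigned char>(s[i]);
        std::size_t extra;
        if (c < 0x80)                extra = 0;
        else if ((c & 0xE0) == 0xC0) extra = 1;
        else if ((c & 0xF0) == 0xE0) extra = 2;
        else if ((c & 0xF8) == 0xF0) extra = 3;
        else return false;                      // stray continuation or invalid lead byte
        if (s.size() - i <= extra) return false; // sequence cut off at end of string
        for (std::size_t k = 1; k <= extra; ++k)
            if ((static_cast<unsigned char>(s[i + k]) & 0xC0) != 0x80)
                return false;
        i += extra + 1;
    }
    return true;
}
```

A reporting function could decode as UTF-8 only when looks_like_utf8(e.what()) holds and fall back to the locale charset otherwise. That is a heuristic, not a guarantee: some byte sequences are valid in both encodings.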

About assuming the application locale for what() strings, the idea is
the string would be human-readable, so it would be possible to output it
to stdout, which implies the string would have the charset from the
current locale.

However what the standard says (18.6.1.8) is:

virtual const char* what() const throw();

Returns: An implementation-defined NTBS.
Notes: The message may be a null-terminated multibyte string
(17.3.2.1.3.2), suitable for conversion and display as a wstring (21.2,
22.2.1.5).

Where NTBS stands for null-terminated byte string. The last reference
(22.2.1.5) is for codecvt<internT,externT,stateT> class template, and
the only codecvt<> instantiation required by the standard, that performs
a conversion, "convert(s) the implementation-defined native character
set" between wchar_t and char.

I am unsure what the "native character set" would be, but I guess the
current locale would match it.

Thank you,
Timothy Madden

From: Mathias Gaunard on
On Jul 22, 1:38 am, Timothy Madden <terminato...(a)gmail.com> wrote:

> My problem is the "right charset", as you call it, would be the
> /execution wide-character set/

That's not what I called the right charset in my message.
It was a reference to the execution character set, the narrow non-wide
one. I thought you wanted to convert from UTF-8 to that.


> whereas std::exception::what() only
> returns a narrow string. Converting my UTF-8 messages to narrow strings
> in the application locale would lose extended characters (messages are
> in Korean, and the application locale might be different than Korean).
>
> So the point is I would need to carry and output the message as a wide
> string, which std::exception lacks.

Then convert the result of what() from UTF-8 to the execution wide
character set when you want to output it as a wide string...
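
A rough sketch of such a conversion, written by hand so it does not depend on any particular library. The name utf8_to_wide is made up for illustration, and the code assumes wchar_t holds either full Unicode code points (32-bit) or UTF-16 code units (16-bit); the standard does not promise either.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Decodes a UTF-8 byte string into a wide string.
// Assumption: wchar_t is UTF-32 (>= 4 bytes) or UTF-16 (2 bytes).
std::wstring utf8_to_wide(const std::string& in)
{
    std::wstring out;
    std::size_t i = 0;
    while (i < in.size()) {
        unsigned char c = static_cast<unsigned char>(in[i]);
        unsigned long cp;    // code point being assembled
        std::size_t extra;   // number of continuation bytes expected
        if (c < 0x80)                { cp = c;        extra = 0; }
        else if ((c & 0xE0) == 0xC0) { cp = c & 0x1F; extra = 1; }
        else if ((c & 0xF0) == 0xE0) { cp = c & 0x0F; extra = 2; }
        else if ((c & 0xF8) == 0xF0) { cp = c & 0x07; extra = 3; }
        else throw std::runtime_error("invalid UTF-8 lead byte");
        if (in.size() - i <= extra)
            throw std::runtime_error("truncated UTF-8 sequence");
        for (std::size_t k = 1; k <= extra; ++k) {
            unsigned char cc = static_cast<unsigned char>(in[i + k]);
            if ((cc & 0xC0) != 0x80)
                throw std::runtime_error("invalid UTF-8 continuation byte");
            cp = (cp << 6) | (cc & 0x3F);
        }
        i += extra + 1;
        if (sizeof(wchar_t) >= 4 || cp < 0x10000) {
            out += static_cast<wchar_t>(cp);
        } else {
            // 16-bit wchar_t: split into a UTF-16 surrogate pair.
            cp -= 0x10000;
            out += static_cast<wchar_t>(0xD800 + (cp >> 10));
            out += static_cast<wchar_t>(0xDC00 + (cp & 0x3FF));
        }
    }
    return out;
}
```

Inside a catch handler one would probably prefer substituting a replacement character over throwing on malformed input; throwing just keeps the sketch short.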



From: Timothy Madden on
Mathias Gaunard wrote:
[...]
>
> Then convert the result of what() from UTF-8 to the execution wide
> character set when you want to output it as a wide string...

The problem is the catch site can get exceptions from many other
places (libraries) with messages in the current charset (not UTF-8).
This is why simply applying a UTF-8 decode on the what() string in
the catch clause would be inappropriate.

Thank you,
Timothy Madden


From: Mathias Gaunard on
On Jul 25, 4:18 pm, Timothy Madden <terminato...(a)gmail.com> wrote:
> Mathias Gaunard wrote:
>
> [...]
>
>
>
> > Then convert the result of what() from UTF-8 to the execution wide
> > character set when you want to output it as a wide string...
>
> The problem is the catch site can get exceptions from many other
> places (libraries) with messages in the current charset (not UTF-8).
> This is why simply applying a UTF-8 decode on the what() string in
> the catch clause would be inappropriate.

Catch those exceptions and translate them to UTF-8 as they go out of
their library boundaries?
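
Something along these lines, as a sketch only. It assumes, purely for illustration, that the other library reports messages in ISO-8859-1, where each byte maps directly to the code point of the same value; latin1_to_utf8, call_at_boundary and legacy_call are made-up names, and a real program would need a converter for whatever charset that library actually uses.

```cpp
#include <exception>
#include <stdexcept>
#include <string>

// Assumption for this sketch: input is ISO-8859-1, so bytes 0x80..0xFF
// become the two-byte UTF-8 encodings of U+0080..U+00FF.
std::string latin1_to_utf8(const std::string& in)
{
    std::string out;
    for (std::string::size_type i = 0; i < in.size(); ++i) {
        unsigned char c = static_cast<unsigned char>(in[i]);
        if (c < 0x80) {
            out += static_cast<char>(c);
        } else {
            out += static_cast<char>(0xC0 | (c >> 6));
            out += static_cast<char>(0x80 | (c & 0x3F));
        }
    }
    return out;
}

// Calls f() and rethrows any std::exception as a std::runtime_error
// whose what() is UTF-8.
template <typename F>
void call_at_boundary(F f)
{
    try {
        f();
    } catch (const std::exception& e) {
        throw std::runtime_error(latin1_to_utf8(e.what()));
    }
}

// Hypothetical stand-in for a call into the other library.
void legacy_call()
{
    throw std::runtime_error("caf\xE9");
}
```

Calling call_at_boundary(legacy_call) then surfaces a message that is UTF-8 no matter which side threw. The cost is that the original exception type is flattened to std::runtime_error, so handlers that dispatch on the type would need a richer wrapper.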

