From: AK on 10 Apr 2008 09:46

Hello,

Does the standard C string library support the UTF-8 charset, basically operations like string length, string compare, string copy, and string search?

Can I use the C conversion functions like wcstombs and mbstowcs to convert strings between wide and UTF-8 format?

I require this information for an application that I am developing for Windows CE using Visual Studio 2005.

Thanks & Regards,
Ajith
From: Carl Daniel [VC++ MVP] on 10 Apr 2008 10:03

AK wrote:
> Hello,
>
> Does the standard C string library support utf-8 charset
> basically operations like string length, string compare, string copy,
> string search?

No, the Standard C library knows nothing at all about UTF-8.

The VC++ CRT has extensions that are "MBCS-aware": see _mbslen, etc., which are prototyped in <mbstring.h>.

> Can I use the C conversion functions like wcstombs and
> mbstowcs to convert string between wide and utf-8 format?

These are not standard C functions either, but VC++ extensions. Yes, you can use them to convert between UTF-8 and UCS-2 by using an appropriate code page (CP_UTF8 or 65001) on the UTF-8 side.

> I requires the information for an application that I am
> developing for Windows CE using Visual Studio 2005.

You'll have to check the documentation to see which of the above works on Win CE.

-cd
From: Igor Tandetnik on 10 Apr 2008 11:07

Carl Daniel [VC++ MVP] <cpdaniel_remove_this_and_nospam(a)mvps.org.nospam> wrote:
>> Does the standard C string library support utf-8 charset
>> basically operations like string length, string compare, string copy,
>> string search?
>
> No, the Standard C library knows nothing at all about UTF-8.
>
> The VC++ CRT has extensions that are "MBCS-aware": See _mbslen, etc.
> that are prototyped in <mbstring.h>

... none of which, unfortunately, support UTF-8 either. They all assume no more than two bytes per character.

>> Can I use the C conversion functions like wcstombs and
>> mbstowcs to convert string between wide and utf-8 format?
>
> These are not standard C functions either, but VC++ extensions. Yes,
> you can use them to convert between UTF-8 and UCS-2 by using an
> appropriate code page (CP_UTF8 or 65001) on the utf-8 side.

... except that neither wcstombs nor mbstowcs takes a code page as a parameter. You may be thinking of the MultiByteToWideChar and WideCharToMultiByte APIs.

--
With best wishes,
Igor Tandetnik

With sufficient thrust, pigs fly just fine. However, this is not necessarily a good idea. It is hard to be sure where they are going to land, and it could be dangerous sitting under them as they fly overhead. -- RFC 1925
From: AK on 10 Apr 2008 11:07

Hello Carl,

So are you saying that the standard C string functions like strlen, strstr, strcmp, strcpy, etc. wouldn't work with UTF-8?

Also, how do I tell the MBCS functions which charset encoding they have to operate on? I see MBCS functions with an _l suffix that take locale info. Could you tell me how to use the _locale_t structure to say that the encoding is UTF-8 (CP_UTF8)? I couldn't find enough description in MSDN of how _locale_t can be used. I would appreciate it if you could give a code sample.

Thanks & Regards,
AK
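On the strlen/strstr/strcmp/strcpy question, here is a minimal sketch (not from the thread) of how the byte-oriented functions behave on UTF-8 data. They treat the text as plain bytes: strlen returns the byte count rather than the character count, strcpy copies the bytes unchanged, strcmp orders strings by byte value, and strstr finds a valid UTF-8 substring at the right position because a complete UTF-8 sequence cannot match starting in the middle of another character. `utf8_length` is a hypothetical helper, and it assumes the input is already valid, NUL-terminated UTF-8.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical helper: counts UTF-8 code points in a NUL-terminated string.
// Continuation bytes have the bit pattern 10xxxxxx, so every byte that does
// NOT match that pattern starts a new code point.
// Assumption: s is valid UTF-8; no validation is performed here.
std::size_t utf8_length(const char* s) {
    std::size_t count = 0;
    for (; *s != '\0'; ++s) {
        if ((static_cast<unsigned char>(*s) & 0xC0) != 0x80) {
            ++count;
        }
    }
    return count;
}

// Byte-wise substring search; returns the byte offset of the first match,
// or -1 if needle does not occur. Both arguments are assumed to be valid UTF-8.
long utf8_find(const char* haystack, const char* needle) {
    const char* hit = std::strstr(haystack, needle);
    return hit ? static_cast<long>(hit - haystack) : -1;
}
```

For the string "na\xC3\xAFve" (the word "naïve" encoded as UTF-8), std::strlen reports 6 bytes while utf8_length reports 5 code points, and utf8_find locates "\xC3\xAF" at byte offset 2.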
From: AK on 10 Apr 2008 11:29

Hello Igor,

Yes, the Win32 equivalents MultiByteToWideChar and WideCharToMultiByte do properly convert between UTF-8 and wide char. But I mostly try to use the standard C/C++ libraries to keep the code portable.

My application receives a lot of text payload in UTF-8 over the network. I need to do a lot of manipulation and substitution before I send it back over the network again. Currently I do this by converting it to wide char first and then using the wcs*** functions or the STL-based std::wstring to perform the necessary operations.

I want to do all the text manipulation without converting from UTF-8 to wide char first. Is this possible? Also, does the STL std::string support UTF-8?

Regards,
AK
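On the std::string question, a hedged sketch (not from the thread): std::string is not aware of any encoding, but it can hold UTF-8 bytes, and byte-wise find/replace gives correct results as long as both the text and the search string are valid UTF-8. What does not carry over is anything counted in characters: size() returns bytes, and operator[] indexes bytes. `replace_all` below is a hypothetical helper name.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: replaces every occurrence of `from` with `to`.
// The search is byte-wise. Assumption: text, from, and to are all valid
// UTF-8, so a match can only begin on a character boundary.
std::string replace_all(std::string text,
                        const std::string& from,
                        const std::string& to) {
    if (from.empty()) {
        return text;  // nothing to search for; avoids an infinite loop
    }
    std::size_t pos = 0;
    while ((pos = text.find(from, pos)) != std::string::npos) {
        text.replace(pos, from.size(), to);
        pos += to.size();  // continue after the inserted text
    }
    return text;
}
```

For example, replace_all("caf\xC3\xA9", "\xC3\xA9", "e") yields "cafe"; the two-byte sequence for "é" is swapped for a single byte without any conversion to wide characters.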