From: Yoavo on
Hi,

We are trying to read an XML file as a text file (in C).
The problem is that when the text is Unicode, we get garbage.
(For an ASCII file it works OK.)

Here is our "C" code:
----------------------------

FILE *fp;
TCHAR buff[1000];

fp = _tfopen(_T("C:\\IniTest\\test.xml"), _T("r, ccs=UNICODE"));
if (fp != NULL)
{
_fgetts(buff, 1000, fp) ;
fclose(fp);
}


Thanks,

Yoav

From: r_z_aret on
On Mon, 25 Jan 2010 12:03:06 +0200, "Yoavo" <yoav(a)cimatron.co.il>
wrote:

>Hi,
>
>We are trying to read an XML file as a text file (in C).
>The problem is that when the text is Unicode, we get garbage.
>(For an ASCII file it works OK.)
>
>Here is our "C" code:
>----------------------------
>
> FILE *fp;
> TCHAR buff[1000];
>
> fp = _tfopen(_T("C:\\IniTest\\test.xml"), _T("r, ccs=UNICODE"));
> if (fp != NULL)
> {
> _fgetts(buff, 1000, fp) ;
> fclose(fp);
> }
>


If UNICODE is defined when you compile (along with _UNICODE, which is
the macro the tchar.h routines such as _tfopen and _fgetts key on),
then TCHAR will be WCHAR (UNICODE), and your program as written will
read UNICODE files, but not ASCII files.

If UNICODE is not defined when you compile, then TCHAR (and thus buff)
will be char (ASCII), and your program as written will read ASCII
files, but not UNICODE files.

You have two choices:
1) Explicitly declare a char array that you use for ASCII files _and_
a WCHAR array that you use for UNICODE files.
2) Declare only one, and then use MultiByteToWideChar or
WideCharToMultiByte to translate as needed.
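
For choice 2, the translation on Windows would normally be a call to
WideCharToMultiByte (or MultiByteToWideChar in the other direction).
To show roughly what that translation involves, here is a portable,
hand-written sketch rather than the Windows call itself. It assumes
the file was read in binary mode, that its contents are UTF-16
little-endian, and that any BOM has already been skipped. The name
utf16le_to_utf8 is hypothetical; it is not a Windows or CRT function.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: convert UTF-16LE bytes to a NUL-terminated UTF-8
 * string. Returns the number of UTF-8 bytes written (not counting the
 * NUL), or (size_t)-1 if the input is malformed or out is too small. */
size_t utf16le_to_utf8(const unsigned char *in, size_t in_len,
                       char *out, size_t out_cap)
{
    size_t i = 0, o = 0;

    if (out_cap == 0)
        return (size_t)-1;

    while (i + 1 < in_len) {
        /* Assemble one 16-bit code unit, low byte first. */
        uint32_t cp = (uint32_t)in[i] | ((uint32_t)in[i + 1] << 8);
        i += 2;

        if (cp >= 0xD800 && cp <= 0xDBFF) {
            /* High surrogate: a low surrogate must follow. */
            if (i + 1 >= in_len)
                return (size_t)-1;
            uint32_t lo = (uint32_t)in[i] | ((uint32_t)in[i + 1] << 8);
            if (lo < 0xDC00 || lo > 0xDFFF)
                return (size_t)-1;
            i += 2;
            cp = 0x10000 + ((cp - 0xD800) << 10) + (lo - 0xDC00);
        } else if (cp >= 0xDC00 && cp <= 0xDFFF) {
            return (size_t)-1;  /* low surrogate with no high surrogate */
        }

        size_t need = cp < 0x80 ? 1 : cp < 0x800 ? 2 : cp < 0x10000 ? 3 : 4;
        if (o + need >= out_cap)  /* keep one byte for the NUL */
            return (size_t)-1;

        if (need == 1) {
            out[o++] = (char)cp;
        } else if (need == 2) {
            out[o++] = (char)(0xC0 | (cp >> 6));
            out[o++] = (char)(0x80 | (cp & 0x3F));
        } else if (need == 3) {
            out[o++] = (char)(0xE0 | (cp >> 12));
            out[o++] = (char)(0x80 | ((cp >> 6) & 0x3F));
            out[o++] = (char)(0x80 | (cp & 0x3F));
        } else {
            out[o++] = (char)(0xF0 | (cp >> 18));
            out[o++] = (char)(0x80 | ((cp >> 12) & 0x3F));
            out[o++] = (char)(0x80 | ((cp >> 6) & 0x3F));
            out[o++] = (char)(0x80 | (cp & 0x3F));
        }
    }

    if (i != in_len)
        return (size_t)-1;  /* odd trailing byte */

    out[o] = '\0';
    return o;
}
```

The caller supplies the output buffer, so it can reuse one char array
for both kinds of file, which is the point of choice 2.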

Determining whether a file is ASCII or UNICODE may be tricky. If the
file is UNICODE and follows convention, it will start with a BOM (byte
order mark).
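
As a minimal sketch of the BOM check, assuming you open the file in
binary mode and pass in its first few bytes: the function below
recognizes the standard UTF-8 and UTF-16 marks. The names bom_kind and
detect_bom are made up for this example. A file with no BOM may still
be UNICODE, so BOM_NONE only means "no mark found".

```c
#include <stddef.h>

/* Encodings that a leading byte order mark (BOM) can identify. */
typedef enum {
    BOM_NONE,      /* no BOM found: ASCII/ANSI, or Unicode saved without one */
    BOM_UTF8,      /* EF BB BF */
    BOM_UTF16_LE,  /* FF FE */
    BOM_UTF16_BE   /* FE FF */
} bom_kind;

/* Inspect the first n bytes of a file (read in binary mode).
 * Note: a UTF-32 LE BOM (FF FE 00 00) also starts with FF FE, so this
 * sketch would report it as UTF-16 LE. */
bom_kind detect_bom(const unsigned char *p, size_t n)
{
    if (n >= 3 && p[0] == 0xEF && p[1] == 0xBB && p[2] == 0xBF)
        return BOM_UTF8;
    if (n >= 2 && p[0] == 0xFF && p[1] == 0xFE)
        return BOM_UTF16_LE;
    if (n >= 2 && p[0] == 0xFE && p[1] == 0xFF)
        return BOM_UTF16_BE;
    return BOM_NONE;
}
```

You could read the first three bytes with fread, call detect_bom, and
then decide which buffer type (char or WCHAR) to use for the rest of
the file.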

If you don't understand UNICODE, TCHAR, etc., I _strongly_ recommend
taking time to learn. Otherwise you will waste a lot of your time
tracking down strange symptoms. You can start by using Google
(http://groups.google.com/advanced_search?q=&) to look up
byte order mark
in this newsgroup. I just did, and got 14 hits, at least some of which
look useful.


>
>Thanks,
>
>Yoav

-----------------------------------------
To reply to me, remove the underscores (_) from my email address (and please indicate which newsgroup and message).

Robert E. Zaret, eMVP
PenFact, Inc.
20 Park Plaza, Suite 400
Boston, MA 02116
www.penfact.com
Useful reading (be sure to read its disclaimer first):
http://catb.org/~esr/faqs/smart-questions.html