From: ab` on
I'm working on an app that (among other things) writes batch files.
Presently, the batch files are UTF-8 encoded, with BOM. The batch files
work properly unless they include non-ANSI characters.

For example, this doesn't work:
"MyProg.exe" "的翻.data"

Nor does this (a crude attempt to force UTF-8):
chcp 65001
"MyProg.exe" "的翻.data"

I also tried forcing the parameter into 8-bit, which results in output
like this (also doesn't work):
"MyProg.exe" "çš„ç¿».data"

Any ideas on how to best approach this problem?
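
For reference, the garbled name above is exactly what the UTF-8 bytes of the file name look like when they are read as Windows-1252 text. A minimal Python sketch of that round trip, assuming Windows-1252 is the "ANSI" code page involved (the thread does not say which one was used):

```python
# Assumption: the "8-bit" conversion reinterpreted UTF-8 bytes as Windows-1252.
name = "的翻.data"

utf8_bytes = name.encode("utf-8")      # bytes as stored in a UTF-8 batch file
garbled = utf8_bytes.decode("cp1252")  # the same bytes read as ANSI text

print(garbled)

# Every byte here maps to a cp1252 character, so the name can be recovered.
restored = garbled.encode("cp1252").decode("utf-8")
```

So the 8-bit form carries the same bytes as the UTF-8 file; it changes how the text is displayed, not what cmd receives.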
From: David Ching on
"ab`" <ab(a)absent.com> wrote in message
news:uZa98Wa2KHA.5212(a)TK2MSFTNGP04.phx.gbl...
> I'm working on an app that (among other things) writes batch files.
> Presently, the batch files are UTF-8 encoded, with BOM. The batch files
> work properly unless they include non-ANSI characters.
>

Load the .bat file into Notepad and save it as UTF-16. Windows is natively
UTF-16, so it might work with batch files, but no guarantees.

-- David


From: ab` on
I tried using UTF-16LE and UTF-16BE, but unfortunately got the same results.
From: Liviu on
"ab`" <ab(a)absent.com> wrote...
> I'm working on an app that (among other things) writes batch files.

And this is related to MFC exactly how?

> Presently, the batch files are UTF-8 encoded, with BOM. The batch
> files work properly unless they include non-ANSI characters.

Cmd doesn't like Unicode batch files. I am in fact a bit surprised that
it ignored the UTF-8 BOM even if the rest of the file was pure ASCII.

> For example, this doesn't work:
> "MyProg.exe" "的翻.data"

Try one of the following...

(a) Pass the Unicode name on the command line when calling the batch.
Inside the batch use "%~1" or similar to pass it along to "MyProg.exe".

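(b) Retrieve the name in a 'for' loop inside the batch, instead of
hardcoding it, something like 'for %%f in (*.data) do "MyProg.exe" "%%f"'
(inside a batch file the variable is written '%%f'; plain '%f' only works
when typed at the prompt).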

(c) Find an alternative to batch files.
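
Option (a) can be sketched as follows: the generating app writes a batch file that contains only ASCII and takes the file name as its first argument. This is a rough Python sketch rather than the original app's code. The helper names (build_batch_text, write_batch, run_batch) are hypothetical, and only "MyProg.exe" and the "%~1" idiom come from the thread.

```python
import subprocess


def build_batch_text() -> str:
    """Return an ASCII-only batch script that forwards its first argument."""
    lines = [
        "@echo off",
        "rem Forward the first argument to the program, re-quoted.",
        '"MyProg.exe" "%~1"',
    ]
    return "\r\n".join(lines) + "\r\n"


def write_batch(path: str) -> None:
    # newline="" keeps the explicit CRLF line endings untouched.
    with open(path, "w", encoding="ascii", newline="") as f:
        f.write(build_batch_text())


def run_batch(path: str, data_name: str) -> int:
    """Windows only: run the batch file with the file name as an argument."""
    # Assumption: cmd is on PATH; the name travels on the command line,
    # so it never has to be encoded inside the batch file itself.
    completed = subprocess.run(["cmd", "/c", path, data_name])
    return completed.returncode


batch_text = build_batch_text()
```

On Windows this would be used as write_batch("run.bat") followed by run_batch("run.bat", "的翻.data"). Whether the name reaches MyProg.exe intact should still be checked on the target system.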

Liviu



From: Mihai N. on
> I'm working on an app that (among other things) writes batch files.
> Presently, the batch files are UTF-8 encoded, with BOM. The batch files
> work properly unless they include non-ANSI characters.

The command line does not properly support Unicode batch files.
And most of the command line tools don't understand Unicode either.
The only solution is to try something other than the regular cmd,
for instance PowerShell, or VB/JavaScript scripts.



--
Mihai Nita [Microsoft MVP, Visual C++]
http://www.mihai-nita.net
------------------------------------------
Replace _year_ with _ to get the real email