From: Greg Lovern on
If I save an Excel file as "Unicode Text" (xlUnicodeText), I get a tab-
delimited UTF-16 (also known as UTF-7) unicode file.

How can I convert it to UTF-8?


I've been doing the conversion by automating Word from Excel:

Const WORD_TEXT_FORMAT As Long = 2 'FileFormat:=wdFormatText
Const WORD_UTF8_ENCODING As Long = 65001 'Encoding:=65001

ObjWordDoc.SaveAs _
    Filename:="save as UTF-8.txt", _
    FileFormat:=WORD_TEXT_FORMAT, _
    Encoding:=WORD_UTF8_ENCODING


However, now I need to convert from UTF-16 to UTF-8 on computers that
may not have Word installed. Any suggestions?


Thanks,

Greg
From: joel on

I've done this type of thing before. It may be a little tricky because
the byte order of the 16-bit data may be reversed (little-endian versus
big-endian). I don't understand how UTF-16 = UTF-7; they are different
encodings. UTF-16 stores each character as one or two 16-bit units with
no parity. UTF-7 is a separate encoding designed for 7-bit channels.
UTF-8 is a variable-length encoding that uses one to four 8-bit bytes
per character, so it can represent every Unicode character.
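
To check the byte order, you can look at the first two bytes of the
file. The sketch below is only an illustration: the function name
DetectUtf16ByteOrder is made up for this post, and it assumes the file
starts with a byte order mark (BOM). I believe Excel's "Unicode Text"
output does, but verify that on your own files.

```vba
' Hypothetical helper: reports the UTF-16 byte order based on the BOM.
' Returns "unknown" when the file is too short or has no UTF-16 BOM.
Function DetectUtf16ByteOrder(ByVal filePath As String) As String
    Dim fileNum As Integer
    Dim bom(0 To 1) As Byte

    DetectUtf16ByteOrder = "unknown"
    If FileLen(filePath) < 2 Then Exit Function

    ' Read the first two bytes in binary mode.
    fileNum = FreeFile
    Open filePath For Binary Access Read As #fileNum
    Get #fileNum, 1, bom
    Close #fileNum

    If bom(0) = &HFF And bom(1) = &HFE Then
        DetectUtf16ByteOrder = "little-endian"   ' FF FE
    ElseIf bom(0) = &HFE And bom(1) = &HFF Then
        DetectUtf16ByteOrder = "big-endian"      ' FE FF
    End If
End Function
```

For example, Debug.Print DetectUtf16ByteOrder("C:\temp\export.txt")
prints the result to the Immediate window (the path is a placeholder).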

I don't understand what tab-delimited UTF-16 means. I would need to
see a sample of the data. I've only seen tab-delimited files as 7- or
8-bit data. What happens to the data if you open it in Notepad? Notepad
detects UTF-16 from the byte order mark at the start of the file, and
its Save As dialog lets you choose UTF-8 as the encoding (the result
may look a little funny).

Usually UTF-16 problems are solved by opening the data file as binary
in VBA and then writing some code to change the format. I usually open
two files from VBA (one binary for input and one non-binary for output)
if I don't need the final results in a spreadsheet. Otherwise, I open
just one file as binary input and read the data into an Excel worksheet.
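
If you would rather not handle the bytes by hand, one alternative that
does not need Word is the ADODB.Stream object. This is a different
technique from the binary-file approach described above, and it is only
a rough sketch. It assumes ADO is available on the machine (it normally
ships with Windows), that the input is little-endian UTF-16 as Excel
writes it, and the Sub name ConvertUtf16ToUtf8 is made up for this
post. As far as I know, the stream puts a UTF-8 byte order mark at the
start of the output, so check whether the program reading the file
accepts that.

```vba
' Hypothetical helper: re-encodes a UTF-16 text file as UTF-8
' using late-bound ADODB.Stream objects (no reference needed).
Sub ConvertUtf16ToUtf8(ByVal srcPath As String, ByVal dstPath As String)
    Const adTypeText As Long = 2
    Const adSaveCreateOverWrite As Long = 2

    Dim inStream As Object
    Dim outStream As Object
    Dim content As String

    ' Read the whole source file, decoding it as UTF-16.
    Set inStream = CreateObject("ADODB.Stream")
    inStream.Type = adTypeText
    inStream.Charset = "unicode"
    inStream.Open
    inStream.LoadFromFile srcPath
    content = inStream.ReadText
    inStream.Close

    ' Write the same text back out, encoding it as UTF-8.
    Set outStream = CreateObject("ADODB.Stream")
    outStream.Type = adTypeText
    outStream.Charset = "utf-8"
    outStream.Open
    outStream.WriteText content
    outStream.SaveToFile dstPath, adSaveCreateOverWrite
    outStream.Close
End Sub
```

You would call it after saving the workbook as xlUnicodeText, for
example ConvertUtf16ToUtf8 "C:\temp\export.txt", "C:\temp\export-utf8.txt"
(both paths are placeholders).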


--
joel
