From: byang on 16 Apr 2008 00:09

Hi,

I am wondering how to read and write UTF-8 files with C++. Say I know that a file is encoded in UTF-8, and I now want to change some character in the file. How can I achieve this? Could anybody here help by explaining the encoding issues involved?

Thanks in advance!

Regards!
Bo

From: thomas.mertes on 16 Apr 2008 06:46
On 16 Apr., 06:09, byang <techr...(a)eyou.com> wrote:
> Hi,
> I am wondering how to read and write UTF-8 files with C++. Say I know
> that a file is encoded in UTF-8, and I now want to change some
> character in the file. How can I achieve this? Could anybody here help
> by explaining the encoding issues involved?

While UTF-8 has a lot of advantages, it also has a disadvantage: the relationship between byte position and character position is not a simple one like char_position * 4 = byte_position (which would hold for a fixed-width encoding such as UTF-32). In UTF-8 a character occupies between one and four bytes, so the byte offset of the n-th character depends on every character before it, and a replacement character may need a different number of bytes than the one it replaces. In most cases it is therefore not possible to go to a file position (with fseek) and write the new character there. In the general case there is no other possibility than to read the whole file and write it back with the change.

BTW: Seed7 has a function to open a UTF-8 file. After opening a UTF-8 file with 'open_utf8' it can be used like a normal file. When reading from the UTF-8 encoded file, the characters are converted to the UTF-32 encoding. Internally only UTF-32 characters and strings are used. A write operation to a file opened with 'open_utf8' converts the UTF-32 characters back to UTF-8. In Seed7 the seek-and-change method is also not usable, since seek uses byte positions and not character positions. But at least the read + change + write solution is simple.

Greetings
Thomas Mertes

Seed7 Homepage: http://seed7.sourceforge.net
Seed7 - The extensible programming language: User defined statements and operators, abstract data types, templates without special syntax, OO with interfaces and multiple dispatch, statically typed, interpreted or compiled, portable, runs under linux/unix/windows.
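
For the C++ side of the original question, the read + change + write approach from the reply above might look roughly like the sketch below. It is only an illustration under some assumptions: it needs C++11 or later, the function names (decode_utf8, encode_utf8, replace_char_in_file) are made up for this example and do not come from any library, the whole file is assumed to fit in memory, and the decoder rejects only truncated sequences and bad lead or continuation bytes (it does not check for overlong encodings or surrogates). A leading byte order mark, if present, is treated as an ordinary character at index 0.

```cpp
#include <cassert>
#include <cstddef>
#include <fstream>
#include <iterator>
#include <stdexcept>
#include <string>

// Decode UTF-8 bytes into UTF-32 code points (hypothetical helper).
std::u32string decode_utf8(const std::string& bytes) {
    std::u32string out;
    std::size_t i = 0;
    while (i < bytes.size()) {
        const unsigned char b = static_cast<unsigned char>(bytes[i]);
        char32_t cp = 0;
        std::size_t len = 0;
        // The lead byte tells how many bytes the character occupies.
        if (b < 0x80)                { cp = b;        len = 1; }
        else if ((b & 0xE0) == 0xC0) { cp = b & 0x1F; len = 2; }
        else if ((b & 0xF0) == 0xE0) { cp = b & 0x0F; len = 3; }
        else if ((b & 0xF8) == 0xF0) { cp = b & 0x07; len = 4; }
        else throw std::runtime_error("invalid UTF-8 lead byte");
        if (i + len > bytes.size())
            throw std::runtime_error("truncated UTF-8 sequence");
        // Each continuation byte contributes its low 6 bits.
        for (std::size_t k = 1; k < len; ++k) {
            const unsigned char c = static_cast<unsigned char>(bytes[i + k]);
            if ((c & 0xC0) != 0x80)
                throw std::runtime_error("invalid UTF-8 continuation byte");
            cp = (cp << 6) | (c & 0x3F);
        }
        out.push_back(cp);
        i += len;
    }
    return out;
}

// Encode UTF-32 code points back into UTF-8 bytes (hypothetical helper).
std::string encode_utf8(const std::u32string& text) {
    std::string out;
    for (char32_t cp : text) {
        assert(cp <= 0x10FFFF);  // outside this range is not Unicode
        if (cp < 0x80) {
            out += static_cast<char>(cp);
        } else if (cp < 0x800) {
            out += static_cast<char>(0xC0 | (cp >> 6));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else if (cp < 0x10000) {
            out += static_cast<char>(0xE0 | (cp >> 12));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else {
            out += static_cast<char>(0xF0 | (cp >> 18));
            out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        }
    }
    return out;
}

// Read the whole file, replace the character at char_index (counted in
// characters, not bytes), and write the whole file back.
void replace_char_in_file(const std::string& path, std::size_t char_index,
                          char32_t replacement) {
    std::ifstream in(path, std::ios::binary);
    if (!in) throw std::runtime_error("cannot open file for reading");
    const std::string bytes((std::istreambuf_iterator<char>(in)),
                            std::istreambuf_iterator<char>());
    in.close();

    std::u32string text = decode_utf8(bytes);
    if (char_index >= text.size())
        throw std::out_of_range("character index past end of file");
    text[char_index] = replacement;

    std::ofstream out(path, std::ios::binary | std::ios::trunc);
    if (!out) throw std::runtime_error("cannot open file for writing");
    const std::string encoded = encode_utf8(text);
    out.write(encoded.data(), static_cast<std::streamsize>(encoded.size()));
}
```

A call such as replace_char_in_file("notes.txt", 1, U'\u20AC') would replace the second character of the (hypothetical) file notes.txt with a euro sign. Because the file is rewritten completely, it does not matter that the new character may take more or fewer bytes than the old one. Opening the streams in binary mode keeps the runtime from translating line endings, so all other bytes of the file pass through unchanged.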