From: Harlan Messinger on 31 Mar 2010 07:55
Tony Johansson wrote:
> Here I encode the Spanish character "ñ" to UTF-8, which is encoded as two
> bytes with the values 195 and 177, which is understandable.
> As we know a char is a Unicode which is a signed 16-bits integer.
> Now to my question when I run this program and use the debugger and hover
> over this ch variable that is of type char
> it shows 241.
> I mean because a char is Unicode(UTF-16) and this value is using two bytes
> when UTF-8 is used how can the debugger show 241 when I hover over this ch
> variable ?
Since the character is represented in memory as UTF-16, why would the
debugger show you what it would be in UTF-8?
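The UTF-16 representation of every Unicode character with a value less
than 65536 is simply the 16-bit integer representation of that value,
so "ñ" (U+00F1) is stored as 241. This isn't the case in UTF-8, where
only values below 128 are stored as a single byte equal to the value.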
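
To make the difference concrete, here is a small sketch that encodes "ñ" both ways. It is written in Python purely for illustration (the original program is not shown in this thread, so none of these variable names come from it); the byte values themselves are fixed by the encodings, not by the language.

```python
# "ñ" is the Unicode code point U+00F1, i.e. 241 in decimal.
ch = "\u00f1"

# The code point itself: this is the number a debugger shows for a UTF-16 char.
code_point = ord(ch)

# UTF-8 needs two bytes for any code point from 128 to 2047.
utf8_bytes = list(ch.encode("utf-8"))

# UTF-16 (little-endian, no byte-order mark) uses one 16-bit unit here.
utf16_bytes = list(ch.encode("utf-16-le"))
utf16_unit = int.from_bytes(ch.encode("utf-16-le"), "little")

print(code_point)   # 241
print(utf8_bytes)   # [195, 177]
print(utf16_bytes)  # [241, 0]
print(utf16_unit)   # 241
```

The single 16-bit UTF-16 unit equals the code point, 241, while the UTF-8 form is the two-byte sequence 195, 177. Both describe the same character; they are just different serializations of it.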