From: Jeff Caton on
I have to write an import function for a different program whose string
encoding/decoding I don't really understand yet.
Some parts of the encoded string are ASCII, but some are not.
For example the German character "�" is encoded by 5C (I looked it up in
a hex editor), which would be 92. I don't have any idea which encoding
they could have used to get the value 92 for this character.
Any ideas?
From: Jim Mack on
Jeff Caton wrote:
> I have to make an import function for a different program whose
> string encoding/ decoding I don't really understand yet.
> Some parts of the encoded string is ASCII, but some are not.
> For example the German character "�" is encoded by 5C (I looked it
> up in a hex editor), which would be 92. I don't have any idea which
> encoding they could have used to get the value 92 for this
> character. Any ideas?

There's no encoding scheme I'm aware of that reuses the ASCII
codepoints for non-ASCII characters. What is the source of these
strings -- what OS, etc?

Have you tried decoding these as UTF-8? That's the most common scheme
you'll encounter in the wild.
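
A quick way to test the UTF-8 suggestion is simply to attempt the decode and see whether it fails. The sketch below is only an illustration: the helper name `try_utf8` and the sample bytes are made up for this example and are not taken from the program in question. Note that UTF-8 encodes every non-ASCII character as bytes of 0x80 and above, so a lone 5C byte always decodes to a plain backslash there.

```python
def try_utf8(raw: bytes):
    """Return the decoded text, or None if raw is not valid UTF-8.

    Hypothetical helper for illustration only.
    """
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        return None


# "Ö" in UTF-8 is the two-byte sequence C3 96, never a single ASCII byte.
utf8_umlaut = try_utf8(b"\xc3\x96")

# A lone 5C byte is valid UTF-8, but it is just the backslash.
lone_5c = try_utf8(b"\x5c")

# D6 is "Ö" in Latin-1 / Windows-1252, but on its own it is not valid UTF-8.
latin1_umlaut = try_utf8(b"\xd6")
```

If the decode succeeds on real data and the umlauts come out right, the file is very likely UTF-8; if it returns None, some other 8-bit or 7-bit scheme is in play.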

--
Jim Mack
Twisted tees at http://www.cafepress.com/2050inc
"We sew confusion"

From: Jeff Caton on
Sorry, I made a mistake, the code was a different one...
From: Helmut Meukel on
"Jeff Caton" <j.caton(a)gmailnotspam.com> schrieb im Newsbeitrag
news:ekmJFvl%23KHA.4308(a)TK2MSFTNGP04.phx.gbl...
>I have to make an import function for a different program whose string
>encoding/ decoding I don't really understand yet.
> Some parts of the encoded string is ASCII, but some are not.
> For example the German character "�" is encoded by 5C (I looked it up in a hex
> editor), which would be 92. I don't have any idea which encoding they could
> have used to get the value 92 for this character.
> Any ideas?


Jeff,

looks like the old pre-DOS 7-bit ASCII in its German version.
With 7 bits there was no other option for foreign-language characters than to
reuse codes already defined in ASCII for square brackets, backslash, ...
This was standardized by ISO.
IIRC, 10 characters from the original ASCII were reserved for national
characters: codes 5B to 5F and 7B to 7F.
The German ISO set used only 8 (for ÄÖÜ§äöüß), thus { [ \ | ] } were
unavailable on printers using the German ISO set. Can't recall the other 2
characters. Some printers could use 2 pre-defined character sets and you
could switch between both using the control codes SI and SO.
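
If the data really is in the German 7-bit set, decoding it is a plain table lookup. The sketch below assumes the DIN 66003 assignments (the German national variant of ISO 646) as they are usually documented: 40, 5B, 5C, 5D, 7B, 7C, 7D and 7E stand for § Ä Ö Ü ä ö ü ß. Under that table the two replaced characters not named above would be @ and ~. The function name `decode_german_7bit` is made up for this example, and the table should be checked against the actual files before relying on it.

```python
# Assumed DIN 66003 replacements for the ASCII characters @ [ \ ] { | } ~
DIN_66003 = {
    0x40: "§",
    0x5B: "Ä",
    0x5C: "Ö",
    0x5D: "Ü",
    0x7B: "ä",
    0x7C: "ö",
    0x7D: "ü",
    0x7E: "ß",
}


def decode_german_7bit(raw: bytes) -> str:
    """Decode 7-bit bytes, swapping the national positions for German letters."""
    if any(b > 0x7F for b in raw):
        raise ValueError("input contains 8-bit bytes, so it is not a 7-bit set")
    # Every other code keeps its ordinary ASCII meaning.
    return "".join(DIN_66003.get(b, chr(b)) for b in raw)


city = decode_german_7bit(b"K\x7cln")      # 7C is read as "ö"
street = decode_german_7bit(b"Stra\x7ee")  # 7E is read as "ß"
```

The same table, inverted, would give the encoder for writing such files back out.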

Helmut.


From: Dee Earley on
On 23/05/2010 10:35, Jeff Caton wrote:
> I have to make an import function for a different program whose string
> encoding/ decoding I don't really understand yet.
> Some parts of the encoded string is ASCII, but some are not.
> For example the German character "�" is encoded by 5C (I looked it up in
> a hex editor), which would be 92. I don't have any idea which encoding
> they could have used to get the value 92 for this character.
> Any ideas?

92 is decimal; 5C is the same value in hex.
Either way, in ASCII that is the \ character.
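
The equivalence is easy to confirm in an interpreter. This is just a sanity check in Python (the variable names are arbitrary), not something taken from the program being imported:

```python
code = int("5C", 16)  # parse the hex digits seen in the hex editor
char = chr(code)      # the character at that code point in ASCII
```

Here `code` comes out as 92 and `char` as a single backslash.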

--
Dee Earley (dee.earley(a)icode.co.uk)
i-Catcher Development Team

iCode Systems

(Replies direct to my email address will be ignored.
Please reply to the group.)