From: Lew Pitcher on
On June 16, 2010 18:39, in comp.unix.shell, harryooopotter(a)hotmail.com
wrote:

> On Jun 16, 1:13 pm, Lew Pitcher <lpitc...(a)teksavvy.com> wrote:
> [...]
>> Also, Janis' suggestion of using dd(1) is good. However, dd(1) seems to
>> only recognize two EBCDIC variants, and doesn't specify /which/ variants
>> they are. My guess is EBCDIC-INT and EBCDIC-US, so if you find that your
>> input.txt contains some offbeat EBCDIC (like EBCDIC-JP-KANA, for
>> instance), you are likely going to be out of luck with dd(1).
>
>> Lew Pitcher
>
> This is what the first line looks like in hex.
>
> $ head -1 input.txt | od -x
> 0000000 2e2e 5fcc 2e25 c1ce cbca 3fd1 2e3e 2e2e
> 0000020 2e2e 2e2e 2e2e cb2e 2f3f c1f8 ce3e e12e
> 0000040 ce3e 25c1 f83f 2ec1 5fcc 3e25 2ecb 3fcb
> 0000060 f82f 3ec1 2ece c72e c8c8 2ef8 2e2e c4cb
> 0000100 c1c7 2f5f 2ecb 5fcc 0a0d
> 0000112

OK, I can't find any EBCDIC that makes sense from that hex dump. Are you
certain that the file is EBCDIC? How do you know? For that matter, how did
you get the file in the first place?

From the dump, it looks like, perhaps, the data is binary. I see repeating
binary structures there (look for the 5fcc), and the file /might/ be a pure
binary data dump.

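FWIW, on a little-endian machine (x86, for instance), od -x (and hexdump -x)
show each 16-bit data word with its two bytes in reverse order (od -t x1
shows the bytes in file order). This means that your file contains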
2e 2e cc 5f 25 2e ce c1 ca cb d1 3f 3e 2e 2e 2e
2e 2e 2e 2e 2e 2e 2e cb 3f 2f f8 c1 3e ce 2e e1
3e ce c1 25 3f f8 c1 2e cc 5f 25 3e cb 2e cb 3f
2f f8 c1 3e ce 2e 2e c7 c8 c8 f8 2e 2e 2e cb c4
c7 c1 5f 2f cb 2e cc 5f 0d 0a

First off, obviously not ASCII: ASCII only carries characters in the 00 - 7f
range, and all those c* and f* characters are outside the range of ASCII.

Next, if it is an EBCDIC variant, it contains a number of unusual control
characters. 2e is the ACK control character in all EBCDICs, and 0a is an
SS2 ("Single Shift 2") control character (0d is "Carriage Return", as it is
in ASCII, and 25 is "Line Feed").

If EBCDIC, then it isn't EBCDIC-INT (CP038) or EBCDIC-US; the file contains
values that aren't legal characters in either EBCDIC variant (ca, cb, ce).

If it is EBCDIC-CP-US (CP037), then those ca/cb/cc/ce values correspond
to "Soft Hyphen" (ca), "Latin small letter O with circumflex" (cb), "Latin
small letter O with Diaeresis" (cc) and "Latin small letter O with Acute"
(ce). From the patterns of these characters in the data, it doesn't look
like you have an EBCDIC-CP-US text here, either. Hmmmmmm.....

Rather than go through all the variants of EBCDIC, we probably should
examine how the file was produced, how you got it, and how you know that it
is EBCDIC. Perhaps there are clues there.
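
For anyone who wants to repeat the byte-level inspection without swapping
the words by hand, here is a minimal Python sketch. It assumes the od -x
output came from a little-endian host and that the file has an even number
of bytes (od pads an odd-length file with a zero byte). The function and
variable names are mine, and "cp037" is simply Python's codec name for
CCSID 37; none of this says what the data actually is.

```python
# od -x output for the first line of input.txt, as posted in the thread.
OD_DUMP = """\
0000000 2e2e 5fcc 2e25 c1ce cbca 3fd1 2e3e 2e2e
0000020 2e2e 2e2e 2e2e cb2e 2f3f c1f8 ce3e e12e
0000040 ce3e 25c1 f83f 2ec1 5fcc 3e25 2ecb 3fcb
0000060 f82f 3ec1 2ece c72e c8c8 2ef8 2e2e c4cb
0000100 c1c7 2f5f 2ecb 5fcc 0a0d
0000112
"""


def od_words_to_bytes(dump):
    """Undo the per-word byte swap of `od -x` output from a little-endian host."""
    out = bytearray()
    for line in dump.splitlines():
        fields = line.split()
        # fields[0] is the octal offset; the rest are 16-bit words in hex.
        for word in fields[1:]:
            out += int(word, 16).to_bytes(2, "little")
    return bytes(out)


raw = od_words_to_bytes(OD_DUMP)

# Bytes in file order, the same view that `od -t x1` would give.
hex_listing = " ".join(f"{b:02x}" for b in raw)

# Anything above 0x7f cannot be plain ASCII.
high_bytes = sum(1 for b in raw if b > 0x7F)

# Tentative view of the same bytes as CP037 text; control characters and
# accented letters show up as escapes.
as_cp037 = raw.decode("cp037")

print(hex_listing)
print(f"{high_bytes} of {len(raw)} bytes are outside the ASCII range")
print(ascii(as_cp037))
```

Comparing the printed listing against a hand-made one is a cheap check
against transcription slips.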


> And iconv understands neither ebcdic nor cp038 ...
>
> $ iconv -l | grep -i ebcdic
> $ iconv -l | grep -i cp038
> $
>
> So I could not use dd or iconv.
> Does anyone have any other suggestions?
>

--
Lew Pitcher
Master Codewright & JOAT-in-training | Registered Linux User #112576
Me: http://pitcher.digitalfreehold.ca/ | Just Linux: http://justlinux.ca/
---------- Slackware - Because I know what I'm doing. ------


From: Harry on
On Jun 16, 4:16 pm, Lew Pitcher <lpitc...(a)teksavvy.com> wrote:
[...]
> OK, I can't find any EBCDIC that makes sense from that hex dump. Are you
> certain that the file is EBCDIC? How do you know? For that matter, how did
> you get the file in the first place?
[...]

The file content came from an MQ message received on an MQ Manager
sitting on z/OS; the MQ Manager has a setting CCSID(37) (just found
out by checking the Q Manager setting), which is COM EUROPE
EBCDIC according to the IBM Web site.

> From the dump, it looks like, perhaps, the data is binary. I see repeating
> binary structures there (look for the 5fcc), and the file /might/ be a pure
> binary data dump.

Perhaps the message is not EBCDIC.
I am just trying to decode the message.

From: Lew Pitcher on
On June 17, 2010 01:44, in comp.unix.shell, harryooopotter(a)hotmail.com
wrote:

> On Jun 16, 4:16 pm, Lew Pitcher <lpitc...(a)teksavvy.com> wrote:
> [...]
>> OK, I can't find any EBCDIC that makes sense from that hex dump. Are you
>> certain that the file is EBCDIC? How do you know? For that matter, how
>> did you get the file in the first place?
> [...]
>
> The file content came from an MQ message received on an MQ Manager
> sitting on z/OS; the MQ Manager has a setting CCSID(37) (just found
> out by checking the Q Manager setting), which is COM EUROPE
> EBCDIC according to the IBM Web site.
>
>> From the dump, it looks like, perhaps, the data is binary. I see
>> repeating binary structures there (look for the 5fcc), and the file
>> /might/ be a pure binary data dump.
>
> Perhaps the message is not EBCDIC.
> I am just trying to decode the message.
>

It's been a while since I last worked with MQ (about 7 or 8 years). My guess
is that you've got, at least in part, a dump of the MQ messages, including
the headers. I don't have my MQ manuals handy, to interpret the data
fields, so I can't be certain.

/If/ you have such messages, a straight characterset conversion (like iconv
or dd) won't be as much help as you'd like. There will be binary data (at
least in the header, if not in the message itself) that such characterset
conversion tools will not properly handle (they will convert the data as if
it were characters from the source characterset, rather than leave the
binary values alone, as binary values).
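
To make the problem concrete, here is a small Python sketch. The record
layout (a 4-byte big-endian length followed by CP037 text) is a made-up
illustration and NOT a real MQ header format; the point is only what a
blanket conversion does to a binary field compared with a conversion that
knows where the text starts.

```python
import struct

# Hypothetical record, not an actual MQ structure:
# 4-byte big-endian length, then that many bytes of CP037 text.
payload = "HELLO MQ".encode("cp037")
record = struct.pack(">I", len(payload)) + payload

# Blanket conversion, which is roughly what iconv or dd would do: every byte,
# including the binary length field, is treated as a CP037 character.
blanket = record.decode("cp037").encode("latin-1")

# Field-aware conversion: leave the length field alone, convert only the text.
(length,) = struct.unpack(">I", record[:4])
text = record[4:4 + length].decode("cp037")
field_aware = record[:4] + text.encode("latin-1")

print("original   :", record.hex())
print("blanket    :", blanket.hex())      # length field has been altered
print("field-aware:", field_aware.hex())  # length field is intact
```

The text portion comes out the same either way; only the binary field is
damaged by the blanket conversion, which is why the record layout has to be
known before converting anything.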

My suggestion would be to get a bit more info about how the mainframe people
created the file, and how it landed on your unix system (are you using USS
or the zOS "unix" facilities?). Remember, file transfer utilities sometimes
perform a "mass characterset conversion" for you, so if you (for instance)
used ftp with the "ascii" option, your data is no longer in EBCDIC, and no
longer reflective of an MQ data structure.

Luck be with you
--
Lew Pitcher

