All X'0D' lost during reading line sequential file using microfocus se [Cobol]

From: Pete Dashwood on 30 Jul 2008 06:34

<taoxianfeng(a)gmail.com> wrote in message
news:7cd6c4c1-3929-418e-b211-9c7e86abd8f3(a)a3g2000prm.googlegroups.com...
On Jul 30, 2:37 pm, "Pete Dashwood"
<dashw...(a)removethis.enternet.co.nz> wrote:
> <taoxianf...(a)gmail.com> wrote in message
>
> news:51a1467d-8e2d-41ab-926b-6732b313957f(a)w7g2000hsa.googlegroups.com...
> On Jul 29, 2:59 pm, "William M. Klein" <wmkl...(a)nospam.netcom.com>
> wrote:
>
>
>
>
>
> > <taoxianf...(a)gmail.com> wrote in message
>
> >news:71e6da43-8789-4de7-8fc7-54f14fb69dbf(a)34g2000hsh.googlegroups.com...
> > <snip>
>
> > The DB2 codepage is set to IBM-943 (Japanese) so a SQL2754N "cannot be
> > converted" error happens when trying to load data with codepage 1252.
> > Maybe I should change the DB codepage?
>
> > Is the actual mainframe data "NATIONAL" (or DBCS) data (stored in the DB2
> > table)?
> > If so, then it is CERTAINLY possible that the actual DBCS/national data
> > includes X"0D" bytes within double-byte (or Unicode) data.
>
> > "Converting" (or handling) mainframe DBCS/National data via Micro Focus on
> > AIX is a VERY different issue than anything that you have mentioned up to
> > now.
>
> > If the mainframe data is NOT DBCS or National, can you find out WHY it is
> > defined as "IBM-943 (Japanese)"? If it does include SOME actual Japanese
> > data, can you find out if it is ALL "national" - or if it is a mixture of
> > national and alphanumeric data.
>
> > If the mainframe data includes a combination of EBCDIC and DBCS (or
> > Unicode) data, then I think you need to be VERY careful of your
> > "conversion" (export) procedures AND you need to make certain that
> > "conversions" in transfer to AIX "maintain" valid data AND that you are
> > using the proper language (NLS and codepage) settings when processing the
> > data with Micro Focus.
>
> > --
> > Bill Klein
> > wmklein <at> ix.netcom.com
>
> I am starting to despair, since this keeps raising more and more
> questions...
>
> [Pete]
>
> I understand how you feel. I've been watching the thread, but refrained
> from comment.
>
> 1. Don't give up.
> 2. Think about what you have gained.
>
> You have a lot more information than you had when you first posted and you
> have found out things that you didn't know.
>
> Some of the information you received has been misleading, but that's
> normal on Usenet. People here have been trying hard to help, but the
> statement of the problem is not accurate. While it may be true that your
> x'0D's are being "stripped out", to most people here that is normal
> behaviour for a Line Sequential file (that's why it is happening). You
> didn't tell us the file contained Japanese Language characters, which could
> be represented in a number of ways, and can contain x'0D' as a matter of
> course.
>
> Bill's post above is simply addressing this, and he is trying to help you.
> (Trust him, he is wise... :-))
>
> Unfortunately, you still haven't been able to resolve your problem, and
> pressure to do so is mounting.
>
> Rick pointed out the possibility of being able to export the data as
> character format Hex. Very useful.
>
> So now, although it all seems very overwhelming, you are really close to a
> solution. This is not the time to quit or despair... :-)
>
> At the moment it seems that as soon as you can reconstruct the original
> data stream from the Hex, you have solved the problem.
>
> How hard can that be?
>
> Robert suggested using a code page (unfortunately, he was a bit off the
> mark, but the idea was good...)
>
> Personally, I wouldn't even attempt to change the code page for the DB;
> that is likely to upset a number of people :-).
>
> Think some more about the Hex string. Each byte is represented by 2 hex
> symbols. If you had to, you could easily write a little COBOL routine that
> would give you the same byte in binary... Here's an example that is by no
> means definitive, but which I'm sure you can modify for your particular
> environment...
>
> *> interface
> 01  two-bytes-in          pic xx.
> 01  one-byte-out          pic x.
>
> *> reference data
> 01  hc                    pic x(16) value '0123456789ABCDEF'.
> 01  filler redefines hc.
>     12  hexchars          pic x occurs 16
>                           indexed by hc-x1.
> 01  hv usage comp.
>     12  x0                pic s9(4) value zero.
>     12  x1                pic s9(4) value 1.
>     12  x2                pic s9(4) value 2.
>     12  x3                pic s9(4) value 3.
>     12  x4                pic s9(4) value 4.
>     12  x5                pic s9(4) value 5.
>     12  x6                pic s9(4) value 6.
>     12  x7                pic s9(4) value 7.
>     12  x8                pic s9(4) value 8.
>     12  x9                pic s9(4) value 9.
>     12  xA                pic s9(4) value 10.
>     12  xB                pic s9(4) value 11.
>     12  xC                pic s9(4) value 12.
>     12  xD                pic s9(4) value 13.
>     12  xE                pic s9(4) value 14.
>     12  xF                pic s9(4) value 15.
> 01  filler redefines hv.
>     12  hexvalues         pic s9(4) comp occurs 16
>                           indexed by hv-x1.
>
> *> work fields
>
> 01  current-byte          pic x.
> 01  num-x                 pic xx.
> 01  num-b redefines num-x pic s9(4) comp.
> 01  binary-work-fields usage comp.
>     12  bin-work          pic s9(4).
>     12  bin-1             pic s9(4).
>     12  bin-2             pic s9(4).
>
> ....
>
> convert-hex-chars section.
> chc000.
>     move two-bytes-in (1:1) to current-byte
>     perform get-binary
>     move bin-work to bin-1
>     move two-bytes-in (2:1) to current-byte
>     perform get-binary
>     move bin-work to bin-2
>     compute num-b = (bin-1 * 16) + bin-2
>     *> take the low-order byte; assumes COMP is stored high-order byte first
>     move num-x (2:1) to one-byte-out
>     .
> chc999.
>     exit.
> *>--------------------------
> get-binary section.
> gb000.
>     set hc-x1 to 1
>     search hexchars
>         at end
>             *> the HEX, isn't... drastic action needed...not shown here
>             continue
>         when current-byte = hexchars (hc-x1)
>             *> you might need to adjust this on MicroFocus
>             set hv-x1 to hc-x1
>             move hexvalues (hv-x1) to bin-work
>     end-search
>     .
> gb999.
>     exit.
>
> This is necessarily a little unwieldy because MicroFocus COBOL (as far as
> I can ascertain) doesn't support PIC 1 for true binary (we really need to
> address 4 bits here), and that means it is necessary to fudge it in 16 bit
> fields.
>
> If you build a little "machine" (like the above) it isn't too hard to push
> your HEX string through it and so obtain the original binary
> representation, which could be anything, including National Characters,
> Katakana, DBCS, whatever. (Or even just standard ASCII)
>
> Even if you don't go this way but find another solution, never give up
> because people are asking questions. Answer the ones you can, ignore the
> ones you can't or respond with "I don't know"... :-)
>
> You have invested a large amount of time and effort in this.
>
> You are way too close to a solution to despair now :-)
>
> Pete.
> --
> "I used to write COBOL...now I can do anything."

You gave an excellent conclusion.

I'm really a newbie and I gained a lot from this post.

I also think the HEX scalar function is very near to the solution.

I'm busy with some other business so replying a little slowly.

I will keep trying. Thank you very much.

[Pete]

You are very welcome.... :-)
(I was a newbie myself once...)

Pete.
--
"I used to write COBOL...now I can do anything."

From: Pete Dashwood on 30 Jul 2008 11:05

"William M. Klein" <wmklein(a)nospam.netcom.com> wrote in message
news:IfXjk.339886$fz6.206173(a)fe08.news.easynews.com...
>I just want to repeat that if you have a mixture of EBCDIC and National
>(DBCS or Unicode) data in the DB2 table on the mainframe, that you will
>really need a LOT of information to be able to "correctly" migrate this to
>AIX.

That may not be the case, Bill. There has been no suggestion that a MIXTURE
is in use.

Taoxianfeng said they are on a straight Japanese code page.

If the AIX machine can recognise the DBCS, it should all be fine. DBCS is
like Unicode (inasmuch as it comes in standard "flavours"); it's a standard
format for any platform that supports IBM's DBCS.

(I had to use it once many years ago and have never forgotten the
experience...)

Quoting from IBM:

The IBM-932 code page (Japanese) is one example of a DBCS code page in
which:
X'00' to X'7F' are single-byte codes
X'81' to X'9F' are double-byte introducer
X'A1' to X'DF' are single-byte codes
X'E0' to X'FC' are double-byte introducer

(No wonder the stripped out x'0D's are problematic... :-))
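
The ranges quoted above can be applied mechanically to a byte string. What follows is only a sketch under stated assumptions: the program name dbcsscan, the data names and the sample bytes are invented for illustration, and FUNCTION ORD is assumed to return the byte value plus one, which holds for the native single-byte collating sequence. It walks the sample and steps over two bytes whenever it meets a double-byte introducer.

```cobol
       identification division.
       program-id. dbcsscan.
       data division.
       working-storage section.
      *> hypothetical sample: 'A', one double-byte character,
      *> one single-byte code in the A1-DF range, then x'0D'
       01  sample-data       pic x(5) value x'4182A0B10D'.
       01  sample-len        pic 9(4) comp value 5.
       01  byte-pos          pic 9(4) comp.
       01  byte-value        pic 9(4) comp.
       01  single-count      pic 9(4) comp value zero.
       01  double-count      pic 9(4) comp value zero.
       procedure division.
       main-para.
           move 1 to byte-pos
           perform until byte-pos > sample-len
      *>       ORD gives the ordinal position, i.e. byte value + 1
               compute byte-value =
                   function ord (sample-data (byte-pos:1)) - 1
               evaluate true
      *>           x'81'-x'9F' and x'E0'-x'FC' introduce two bytes
                   when byte-value >= 129 and byte-value <= 159
                   when byte-value >= 224 and byte-value <= 252
                       add 1 to double-count
                       add 2 to byte-pos
                   when other
                       add 1 to single-count
                       add 1 to byte-pos
               end-evaluate
           end-perform
           display 'single-byte characters: ' single-count
           display 'double-byte characters: ' double-count
           stop run.
```

For this sample the scan should find three single-byte characters and one double-byte character. A routine that treats every byte as a single-byte character would instead see five characters.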

I think the problem is arising because the CONTAINER for the data is a LINE
SEQUENTIAL file, which happens to use x'0D' for a special purpose.
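
One way to see the difference the container makes is to push the same bytes through a record sequential file, which is assumed here to store fixed-length records without interpreting any byte as a delimiter. This is a minimal sketch, not the original poster's program: the program name recseq, the file name recseq.dat and the four sample bytes are all made up for illustration.

```cobol
       identification division.
       program-id. recseq.
       environment division.
       input-output section.
       file-control.
      *>   ORGANIZATION SEQUENTIAL, not LINE SEQUENTIAL
           select work-file assign to 'recseq.dat'
               organization is sequential
               file status is work-status.
       data division.
       file section.
       fd  work-file.
       01  work-record       pic x(4).
       working-storage section.
       01  work-status       pic xx.
       procedure division.
       main-para.
      *>   write one record that contains x'0D' as its third byte
           open output work-file
           move x'82A00D41' to work-record
           write work-record
           close work-file
      *>   read it back and compare byte for byte
           open input work-file
           move spaces to work-record
           read work-file
               at end display 'unexpected end of file'
           end-read
           if work-record = x'82A00D41'
               display 'x''0D'' survived the round trip'
           else
               display 'record was altered'
           end-if
           close work-file
           stop run.
```

The trade-off is that record sequential files need a known record length, so this only helps if the export can produce fixed-length (or length-prefixed) records.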

The data is there, because when it is HEX encoded it arrives at the AIX
machine correctly.

If the data is converted back to the original, and the AIX machine is also
running the Japanese code page, I believe everything will work correctly.

It isn't the difference in OS that is the problem, it is the transport
layer. (Rick's suggestion to HEX encode it solves this nicely.)
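
The decoding step itself can be sketched as a complete program. It is an illustration under stated assumptions rather than a definitive routine: the names hexdecode, hex-in and bytes-out and the sample value are invented, the input is assumed to use upper-case hex digits, and FUNCTION CHAR is assumed to map ordinal n + 1 to byte value n, which holds for the native single-byte collating sequence. It uses INSPECT ... TALLYING in place of a lookup table: the number of characters before the digit in '0123456789ABCDEF' is the digit's value.

```cobol
       identification division.
       program-id. hexdecode.
       data division.
       working-storage section.
      *> a digit's offset in this string is its numeric value
       01  hex-digits        pic x(16) value '0123456789ABCDEF'.
      *> hypothetical input: four bytes, hex encoded
       01  hex-in            pic x(8)  value '82A00D41'.
       01  hex-len           pic 9(4)  comp value 8.
       01  bytes-out         pic x(4).
       01  in-pos            pic 9(4)  comp.
       01  out-pos           pic 9(4)  comp.
       01  nibble            pic 9(4)  comp.
       01  byte-value        pic 9(4)  comp.
       01  current-digit     pic x.
       procedure division.
       main-para.
           move 1 to out-pos
           perform varying in-pos from 1 by 2
                   until in-pos > hex-len
      *>       high-order digit
               move hex-in (in-pos:1) to current-digit
               perform find-nibble
               compute byte-value = nibble * 16
      *>       low-order digit
               move hex-in (in-pos + 1:1) to current-digit
               perform find-nibble
               add nibble to byte-value
      *>       CHAR takes an ordinal position, hence the + 1
               move function char (byte-value + 1)
                 to bytes-out (out-pos:1)
               add 1 to out-pos
           end-perform
           if bytes-out = x'82A00D41'
               display 'decoded ok'
           else
               display 'decode mismatch'
           end-if
           stop run.
       find-nibble.
      *>   count the characters in front of the digit; a count of
      *>   16 means the digit was not found at all
           move zero to nibble
           inspect hex-digits tallying nibble
               for characters before initial current-digit
           if nibble > 15
               display 'not a hex digit: ' current-digit
               stop run
           end-if.
```

The decoded bytes include x'0D', so they would need to go to a file type that does not treat that byte specially.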

>
> Question 1:
> Do you want to convert the EBCDIC data to ASCII? If so, you may still
> need to find out which EBCDIC code page (there are more than one) is being
> used on the mainframe. You also won't be able to "automatically" convert
> the data - as you will NOT want to use the same routine for converting the
> Japanese data.

That assumes there is EBCDIC data. If it is all Japanese, it is DBCS.

>
> Question 2:
> Is the mainframe "Japanese" data in DBCS or Unicode? Do you want it to
> be in Unicode?

Why would that matter? He just wants it to be readable on the AIX machine.
Once that is working, the niceties of DBCS vs Unicode could be explored as a
separate issue. I think this is just complicating the issue.

> IBM mainframe COBOL (and I think - but am not certain DB2) can handle
> EITHER DBCS or Unicode data. There are differences (some minor and some
> major) between these. You will need to make certain that you know which
> format is used on the mainframe AND which format you are supposed to
> create on AIX.

Surely that is covered by the code page? We have established it is Japanese
(DBCS on the mainframe, and the equivalent Japanese page on the AIX machine,
presumably.)

> Again, conversions of this data will need to happen "field by field" as
> you will NOT want to use the same conversion algorithms for this data as
> the originally EBCDIC data.

I have no idea why you would say this... :-)? Keep it simple.

>
> ***
>
> On Windows (but I am not positive it is true on AIX), Micro Focus *does*
> provide facilities for using mixed EBCDIC and IBM mainframe DBCS data "as
> if it were native/Windows" data. This *might* be the easiest method for
> doing your conversion/migration work. HOWEVER, it is not recommended for
> "production" work on AIX. Therefore, you would still want to convert the
> mainframe-style data to AIX data (i.e. EBCDIC -> ASCII and DBCS ->
> Unicode) for "production" work.
>

Yes, very fair comment and right on ** IF ** we are dealing with "mixed"
encoding AND the AIX system DOESN'T support DBCS...

> P.S. This is NOT the type of migration that is usually given to a
> "Newbie" so I certainly can understand your frustration.

He has had a lot to deal with. It would be really good if this can be solved
without complicating it any more than absolutely necessary... :-)

Given that he is using Japanese code pages (DBCS) on both systems, can you
see any problem with him simply decoding the HEX encoded message?

Pete.
--
"I used to write COBOL... now I can do anything."

From: Robert on 30 Jul 2008 13:27

On Thu, 31 Jul 2008 03:05:12 +1200, "Pete Dashwood" <dashwood(a)removethis.enternet.co.nz>
wrote:

>
>
>"William M. Klein" <wmklein(a)nospam.netcom.com> wrote in message
>news:IfXjk.339886$fz6.206173(a)fe08.news.easynews.com...
>>I just want to repeat that if you have a mixture of EBCDIC and National
>>(DBCS or Unicode) data in the DB2 table on the mainframe, that you will
>>really need a LOT of information to be able to "correctly" migrate this to
>>AIX.
>
>That may not be the case, Bill. There has been no suggestion that a MIXTURE
>is in use.
>
>Taoxianfeng said they are on a straight Japanese code page.
>
>If the AIX machine can recognise the DBCS, it should all be fine. DBCS is
>like Unicode (inasmuch as it comes in standard "flavours"); it's a standard
>format for any platform that supports IBM's DBCS.
>
>(I had to use it once many years ago and have never forgotten the
>experience...)
>
>Quoting from IBM:
>
>The IBM-932 code page (Japanese) is one example of a DBCS code page in
>which:
>X'00' to X'7F' are single-byte codes
>X'81' to X'9F' are double-byte introducer
>X'A1' to X'DF' are single-byte codes
>X'E0' to X'FC' are double-byte introducer
>
>(No wonder the stripped out x'0D's are problematic... :-))
>
>I think the problem is arising because the CONTAINER for the data is a LINE
>SEQUENTIAL file, which happens to use x'0D' for a special purpose.
>
>The data is there, because when it is HEX encoded it arrives at the AIX
>machine correctly.
>
>If the data is converted back to the original, and the AIX machine is also
>running the Japanese code page, I believe everything will work correctly.
>
>It isn't the difference in OS that is the problem, it is the transport
>layer. (Rick's suggestion to HEX encode it solves this nicely.)

I think the problem is that Micro Focus does not know the input file contains DBCS codes.
It is treating the input as single-byte US-ASCII. You tell it about the codeset using the
codecomp program to create a codeset file and the MFCODESET environment variable to tell
it to use that page at execution time.

http://supportline.microfocus.com/supportline/documentation/books/sx22sp1/sx22indx.htm

From: Robert on 30 Jul 2008 19:23

On Wed, 30 Jul 2008 11:31:38 -0600, "Frank Swarbrick" <Frank.Swarbrick(a)efirstbank.com>
wrote:

>One thing I think has been mentioned in passing but perhaps overlooked is
>doing a database to database load, rather than an export followed by an
>import. I have only been partially paying attention, but is the goal to get
>data that currently exists in DB2 for z/OS in to a DB2 AIX database? If you
>have DB2 9.1 or 9.5 on AIX you should be able to do something like the
>following:
>
>
>DECLARE load_curs CURSOR
> DATABASE <sourcedb>
> USER <source_db_user_name>
> USING <source_db_user_password>
> FOR SELECT * FROM <source_table_name>;
>LOAD FROM load_curs OF CURSOR
> REPLACE INTO <dest_table_name>;
>
>I believe this feature (the DATABASE/USER/USING clauses) is available only
>in version 9. With prior versions you have to set up "data federation" on
>your destination db and have nicknames for your source db tables. More
>complicated, but still possible. Either way your AIX database must be able
>to connect to your z/OS database. If that is not an option then this will
>not work.
>
>I have no idea if this actually meets your requirements, but it's one
>possible option. No Cobol needed, and no export files needed.

Oracle is simpler

create table <dest> as select * from <src>@<sourcedb>;

where sourcedb is defined once:

create database link <sourcedb>
connect to <user> identified by <password> using 'service handle';

If you don't have permission to create a db link, you can use the sqlplus copy command:

copy from <user>/<password>@<sourcedb> create <dest> using select * from <src>;

From: Pete Dashwood on 30 Jul 2008 20:22

"Robert" <no(a)e.mail> wrote in message
news:hs8194lv6og5gvrbh58kv2andqrlbneql6(a)4ax.com...
> On Thu, 31 Jul 2008 03:05:12 +1200, "Pete Dashwood"
> <dashwood(a)removethis.enternet.co.nz>
> wrote:
>
>>
>>
>>"William M. Klein" <wmklein(a)nospam.netcom.com> wrote in message
>>news:IfXjk.339886$fz6.206173(a)fe08.news.easynews.com...
>>>I just want to repeat that if you have a mixture of EBCDIC and National
>>>(DBCS or Unicode) data in the DB2 table on the mainframe, that you will
>>>really need a LOT of information to be able to "correctly" migrate this
>>>to
>>>AIX.
>>
>>That may not be the case, Bill. There has been no suggestion that a
>>MIXTURE
>>is in use.
>>
>>Taoxianfeng said they are on a straight Japanese code page.
>>
>>If the AIX machine can recognise the DBCS, it should all be fine. DBCS is
>>like Unicode (inasmuch as it comes in standard "flavours"); it's a
>>standard
>>format for any platform that supports IBM's DBCS.
>>
>>(I had to use it once many years ago and have never forgotten the
>>experience...)
>>
>>Quoting from IBM:
>>
>>The IBM-932 code page (Japanese) is one example of a DBCS code page in
>>which:
>>X'00' to X'7F' are single-byte codes
>>X'81' to X'9F' are double-byte introducer
>>X'A1' to X'DF' are single-byte codes
>>X'E0' to X'FC' are double-byte introducer
>>
>>(No wonder the stripped out x'0D's are problematic... :-))
>>
>>I think the problem is arising because the CONTAINER for the data is a
>>LINE
>>SEQUENTIAL file, which happens to use x'0D' for a special purpose.
>>
>>The data is there, because when it is HEX encoded it arrives at the AIX
>>machine correctly.
>>
>>If the data is converted back to the original, and the AIX machine is also
>>running the Japanese code page, I believe everything will work correctly.
>>
>>It isn't the difference in OS that is the problem, it is the transport
>>layer. (Rick's suggestion to HEX encode it solves this nicely.)
>
> I think the problem is that Micro Focus does not know the input file
> contains DBCS codes.
> It is treating the input as single-byte US-ASCII.

It certainly looks that way.

>You tell it about the codeset using the
> codecomp program to create a codeset file and the MFCODESET environment
> variable to tell
> it to use that page at execution time.
>
> http://supportline.microfocus.com/supportline/documentation/books/sx22sp1/sx22indx.htm

So, are you saying, Robert, that if Taoxianfeng simply sets the MFCODESET EV
to point at the Japanese code page at the time he runs his MF COBOL
program, it should work? Definitely worth a try...

Pete.
--
"I used to write COBOL...now I can do anything."
