All X'0D' lost during reading line sequential file using microfocus se [Cobol]

From: taoxianfeng on 30 Jul 2008 21:43

On Jul 30, 7:20 pm, "William M. Klein" <wmkl...(a)nospam.netcom.com>
wrote:
> I just want to repeat that if you have a mixture of EBCDIC and National (DBCS or
> Unicode) data in the DB2 table on the mainframe, that you will really need a LOT
> of information to be able to "correctly" migrate this to AIX.
>
> Question 1:
> Do you want to convert the EBCDIC data to ASCII? If so, you may still need to
> find out which EBCDIC code page (there are more than one) is being used on the
> mainframe. You also won't be able to "automatically" convert the data - as you
> will NOT want to use the same routine for converting the Japanese data.
>
> Question 2:
> Is the mainframe "Japanese" data in DBCS or Unicode? Do you want it to be in
> Unicode? Shift-Jis, IBM DBCS, or what format on AIX? IBM mainframe COBOL (and
> I think - but am not certain DB2) can handle EITHER DBCS or Unicode data. There
> are differences (some minor and some major) between these. You will need to
> make certain that you know which format is used on the mainframe AND which
> format you are supposed to create on AIX. Again, conversions of this data will
> need to happen "field by field" as you will NOT want to use the same conversion
> algorithms for this data as the originally EBCDIC data.
>
> ***
>
> On Windows (but I am not positive it is true on AIX), Micro Focus *does* provide
> facilities for using mixed EBCDIC and IBM mainframe DBCS data "as if it were
> native/Windows" data. This *might* be the easiest method for doing your
> conversion/migration work. HOWEVER, it is not recommended for "production" work
> on AIX. Therefore, you would still want to convert the mainframe-style data to
> AIX data (i.e. EBCDIC -> ASCII and DBCS -> Unicode) for "production" work.
>
> P.S. This is NOT the type of migration that is usually given to a "Newbie" so I
> certainly can understand your frustration. If this is an "in-house" migration
> project, then you should be able to find in-house expertise to help you. If
> this is something that your organization has "contracted" to do, then I think
> that someone made a commitment that they hadn't properly "sized" before bidding
> on.
>
> --
> Bill Klein
> wmklein <at> ix.netcom.com
>
> <taoxianf...(a)gmail.com> wrote in message
> news:7cd6c4c1-3929-418e-b211-9c7e86abd8f3(a)a3g2000prm.googlegroups.com...
> On Jul 30, 2:37 pm, "Pete Dashwood"
> <dashw...(a)removethis.enternet.co.nz> wrote:
> > <taoxianf...(a)gmail.com> wrote in message
>
> >news:51a1467d-8e2d-41ab-926b-6732b313957f(a)w7g2000hsa.googlegroups.com...
> > On Jul 29, 2:59 pm, "William M. Klein" <wmkl...(a)nospam.netcom.com>
> > wrote:
>
> > > <taoxianf...(a)gmail.com> wrote in message
>
> > >news:71e6da43-8789-4de7-8fc7-54f14fb69dbf(a)34g2000hsh.googlegroups.com...
> > > <snip>
>
> > > The DB2 codepage is set to IBM-943 (Japanese) so a SQL2754N "cannot be
> > > converted" error happens when trying to load data with codepage 1252.
> > > Maybe I should change the DB codepage?
>
> > > > Is the actual mainframe data "NATIONAL" (or DBCS) data (stored in the DB2
> > > > table)?
> > > > If so, then it is CERTAINLY possible that the actual DBCS/national data
> > > > includes
> > > > X"0D" bytes within double-byte (or Unicode) data.
>
> > > "Converting" (or handling) mainframe DBCS/National data via Micro Focus on
> > > AIX
> > > is a VERY different issue than anything that you have mentioned up to now.
>
> > > If the mainframe data is NOT DBCS or National, can you find out WHY it is
> > > defined as "IBM-943 (Japanese)"? If it does include SOME actual Japanese
> > > data,
> > > can you find out if it is ALL "national" - or if it is a mixture of
> > > national and
> > > alphanumeric data.
>
> > > If the mainframe data includes a combination of EBCDIC and DBCS (or
> > > Unicode)
> > > data, then I think you need to be VERY careful of your "conversion"
> > > (export)
> > > procedures AND you need to make certain that "conversions" in transfer to
> > > AIX
> > > "maintains" valid data AND that you are using the proper language (NLS and
> > > codepage) settings when processing the data with Micro Focus.
>
> > > --
> > > Bill Klein
> > > wmklein <at> ix.netcom.com
>
> > > I am starting to despair, since this keeps raising more and more
> > > questions...
>
> > [Pete]
>
> > I understand how you feel. I've been watching the thread, but refrained from
> > comment.
>
> > 1. Don't give up.
> > 2. Think about what you have gained.
>
> > You have a lot more information than you had when you first posted and you
> > have found out things that you didn't know.
>
> > Some of the information you received has been misleading, but that's normal
> > on Usenet. People here have been trying hard to help, but the statement of
> > the problem is not accurate. While it may be true that your x'0D' s are
> > being "stripped out", to most people here that is normal behaviour for a
> > Line Sequential file. (that's why it is happening). You didn't tell us
> > the file contained Japanese Language characters which could be represented
> > in a number of ways, and can contain x'0D' as a matter of course.
>
> > Bill's post above is simply addressing this, and he is trying to help you.
> > (Trust him, he is wise... :-))
>
> > Unfortunately, you still haven't been able to resolve your problem, and
> > pressure to do so is mounting.
>
> > Rick pointed out the possibility of being able to export the data as
> > character format Hex. Very useful.
>
> > So now, although it all seems very overwhelming, you are really close to a
> > solution. This is not the time to quit or despair... :-)
>
> > At the moment it seems that as soon as you can reconstruct the original data
> > stream from the Hex, you have solved the problem.
>
> > How hard can that be?
>
> > Robert suggested using a code page (unfortunately, he was a bit off the
> > mark, but the idea was good...)
>
> > Personally, I wouldn't even attempt to change the code page for the DB; that
> > is likely to upset a number of people :-).
>
> > Think some more about the Hex string. Each byte is represented by 2 hex
> > symbols. If you had to, you could easily write a little COBOL routine that
> > would give you the same byte in binary... Here's an example that is by no
> > means definitive, but which I'm sure you can modify for your particular
> > environment...
>
> > *> interface
> > 01 two-bytes-in pic xx.
> > 01 one-byte-out pic x.
>
> > *> reference data
> > > 01 hc pic x(16) value '0123456789ABCDEF'.
> > 01 filler redefines hc.
> > 12 hexchars pic x occurs 16
> > indexed by hc-x1.
> > 01 hv usage comp.
> > 12 x0 pic s9(4) value zero.
> > 12 x1 pic s9(4) value 1.
> > 12 x2 pic s9(4) value 2.
> > 12 x3 pic s9(4) value 3.
> > 12 x4 pic s9(4) value 4.
> > 12 x5 pic s9(4) value 5.
> > 12 x6 pic s9(4) value 6.
> > 12 x7 pic s9(4) value 7.
> > 12 x8 pic s9(4) value 8.
> > 12 x9 pic s9(4) value 9.
> > 12 xA pic s9(4) value 10.
> > 12 xB pic s9(4) value 11.
> > 12 xC pic s9(4) value 12.
> > 12 xD pic s9(4) value 13.
> > 12 xE pic s9(4) value 14.
> > 12 xF pic s9(4) value 15.
> > 01 filler redefines hv.
> > 12 hexvalues pic s9(4) comp occurs 16
> > indexed by hv-x1.
>
> > *> work fields
>
> > 01 current-byte pic x.
> > 01 num-x pic xx.
> > 01 num-b redefines num-x pic s9(4) comp.
> > 01 binary-work-fields usage comp.
> > 12 bin-work pic s9(4).
> > 12 bin-1 pic s9(4).
> > 12 bin-2 pic s9(4).
>
> > ....
>
> > convert-hex-chars section.
> > chc000.
> > move two-bytes-in (1:1) to current-byte
> > perform get-binary
> > move bin-work to bin-1
> > move two-bytes-in (2:1) to current-byte
> > perform get-binary
> > move bin-work to bin-2
> > compute num-b = (bin-1 * 16) + bin-2
> > move num-x (2:1) to one-byte-out
> > .
> > chc999.
> > exit.
> > *>--------------------------
> > get-binary section.
> > gb000.
> > set hc-x1 to 1
> > search hexchars
> > > at end
> > > *> the HEX, isn't... drastic action needed...not shown here
> > > continue
> > > when current-byte = hexchars (hc-x1)
> > > set hv-x1 to hc-x1 *> you might need to adjust this on MicroFocus
> > > move hexvalues (hv-x1) to bin-work
> > end-search
> > .
> > gb999.
> > exit.
>
> > This is necessarily a little unwieldy because MicroFocus COBOL (as far as I
> > can ascertain) doesn't support PIC 1 for true binary (we really need to
> > address 4 bits here), and that means it is necessary to fudge it in 16 bit
> > fields.
>
> > If you build a little "machine" (like the above) it isn't too hard to push
> > your HEX string through it and so obtain the original binary representation
> > which could be anything, including National Characters, Katakana, DBCS,
> > whatever. (Or even just standard ASCII)
>
> > Even if you don't go this way but find another solution, never give up
> > because people are asking questions. Answer the ones you can, ignore the
> > ones you can't or respond with "I don't know"... :-)
>
> > You have invested a large amount of time and effort in this.
>
> > You are way too close to a solution to despair now :-)
>
> > Pete.
> > --
> > "I used to write COBOL...now I can do anything."
>
> You gave an excellent conclusion.
>
> I'm really a newbie and I gained a lot from this post.
>
> I also think the HEX scalar function is very near to the solution.
>
> I'm busy with some other business so replying a little slowly.
>
> I will keep trying. Thank you very much.

Frankly speaking, I can't answer your questions, since I don't have
much experience with either the mainframe or AIX.

I can only agree that this project was not properly "sized" before it
began. The details are too ridiculous, and too confidential, to discuss
here. Just forget it.

From: taoxianfeng on 30 Jul 2008 22:02

On Jul 31, 2:31 am, "Frank Swarbrick" <Frank.Swarbr...(a)efirstbank.com>
wrote:
> One thing I think has been mentioned in passing but perhaps overlooked is
> doing a database to database load, rather than an export followed by an
> import. I have only been partially paying attention, but is the goal to get
> data that currently exists in DB2 for z/OS in to a DB2 AIX database? If you
> have DB2 9.1 or 9.5 on AIX you should be able to do something like the
> following:
>
> DECLARE load_curs CURSOR
> DATABASE <sourcedb>
> USER <source_db_user_name>
> USING <source_db_user_password>
> FOR SELECT * FROM <source_table_name>;
> LOAD FROM load_curs OF CURSOR
> REPLACE INTO <dest_table_name>;
>
> I believe this feature (the DATABASE/USER/USING clauses) is available only
> in version 9. With prior versions you have to set up "data federation" on
> your destination db and have nicknames for your source db tables. More
> complicated, but still possible. Either way your AIX database must be able
> to connect to your z/OS database. If that is not an option then this will
> not work.
>
> I have no idea if this actually meets your requirements, but it's one
> possible option. No Cobol needed, and no export files needed.
>
> Frank

Do you mean loading the data from the mainframe DB directly into the AIX
DB, provided the two can be connected?
Yes, there would be problems when migrating the data. But what I need
to do right now is this:

export 2 tables (or some of their fields);
sort the exported data with mfsort;
match the sorted files and output only the necessary records;
import the output file back into the table, replacing the old records.

The original sources are JCL and mainframe COBOL, which are to be
migrated to AIX shell scripts and Micro Focus COBOL.
I'm sorry I described the whole picture so late; that may have misled
you all.

It may also be possible to read the 2 tables in the COBOL
program and delete the unmatched rows (almost writing a new program).
But I think that would cost too much. It is probably the last method we
will try.

From: taoxianfeng on 30 Jul 2008 22:46

On Jul 31, 2:27 am, Robert <n...(a)e.mail> wrote:
> On Thu, 31 Jul 2008 03:05:12 +1200, "Pete Dashwood" <dashw...(a)removethis.enternet.co.nz>
> wrote:
>
> >"William M. Klein" <wmkl...(a)nospam.netcom.com> wrote in message
> >news:IfXjk.339886$fz6.206173(a)fe08.news.easynews.com...
> >>I just want to repeat that if you have a mixture of EBCDIC and National
> >>(DBCS or Unicode) data in the DB2 table on the mainframe, that you will
> >>really need a LOT of information to be able to "correctly" migrate this to
> >>AIX.
>
> >That may not be the case, Bill. There has been no suggestion that a MIXTURE
> >is in use.
>
> >Taoxianfeng said they are on a straight Japanese code page.
>
> >If the AIX machine can recognise the DBCS, it should all be fine. DBCS is
> >like Unicode (inasmuch as it comes in standard "flavours"); it's a standard
> >format for any platform that supports IBM's DBCS.
>
> >(I had to use it once many years ago and have never forgotten the
> >experience...)
>
> >Quoting from IBM:
>
> >The IBM-932 code page (Japanese) is one example of a DBCS code page in
> >which:
> >X'00' to X'7F' are single-byte codes
> >X'81' to X'9F' are double-byte introducer
> >X'A1' to X'DF' are single-byte codes
> >X'E0' to X'FC' are double-byte introducer
>
> >(No wonder the stripped out x'0D's are problematic... :-))
>
> >I think the problem is arising because the CONTAINER for the data is a LINE
> >SEQUENTIAL file, which happens to use x'0D' for a special purpose.
>
> >The data is there, because when it is HEX encoded it arrives at the AIX
> >machine correctly.
>
> >If the data is converted back to the original, and the AIX machine is also
> >running the Japanese code page, I believe everything will work correctly.
>
> >It isn't the difference in OS that is the problem, it is the transport
> >layer. (Rick's suggestion to HEX encode it solves this nicely.)
>
> I think the problem is that Micro Focus does not know the input file contains DBCS codes.
> It is treating the input as single-byte US-ASCII. You tell it about the codeset using the
> codecomp program to create a codeset file and the MFCODESET environment variable to tell
> it to use that page at execution time.
>
> http://supportline.microfocus.com/supportline/documentation/books/sx2...

Thank you Robert.

I tried exporting the MFCODESET environment variable directly, then
recompiled and ran the program, but nothing changed.

(Taking the codeset numbers from http://www.microfocus.com/000/20031001_003_tcm21-6159.pdf,
I tried 0081, 0939 and 9122.)

So now I'm trying to use the codecomp utility to build a new mapping.

However, I suspect it only deals with ASCII and EBCDIC, since the
document says "The Codecomp utility enables you to reconfigure the
_CODESET program for single-byte characters."

From: Robert on 30 Jul 2008 23:25

On Thu, 31 Jul 2008 12:22:03 +1200, "Pete Dashwood" <dashwood(a)removethis.enternet.co.nz>
wrote:

>
>
>"Robert" <no(a)e.mail> wrote in message
>news:hs8194lv6og5gvrbh58kv2andqrlbneql6(a)4ax.com...
>> On Thu, 31 Jul 2008 03:05:12 +1200, "Pete Dashwood"
>> <dashwood(a)removethis.enternet.co.nz>
>> wrote:
>>
>>>
>>>
>>>"William M. Klein" <wmklein(a)nospam.netcom.com> wrote in message
>>>news:IfXjk.339886$fz6.206173(a)fe08.news.easynews.com...
>>>>I just want to repeat that if you have a mixture of EBCDIC and National
>>>>(DBCS or Unicode) data in the DB2 table on the mainframe, that you will
>>>>really need a LOT of information to be able to "correctly" migrate this
>>>>to
>>>>AIX.
>>>
>>>That may not be the case, Bill. There has been no suggestion that a
>>>MIXTURE
>>>is in use.
>>>
>>>Taoxianfeng said they are on a straight Japanese code page.
>>>
>>>If the AIX machine can recognise the DBCS, it should all be fine. DBCS is
>>>like Unicode (inasmuch as it comes in standard "flavours"); it's a
>>>standard
>>>format for any platform that supports IBM's DBCS.
>>>
>>>(I had to use it once many years ago and have never forgotten the
>>>experience...)
>>>
>>>Quoting from IBM:
>>>
>>>The IBM-932 code page (Japanese) is one example of a DBCS code page in
>>>which:
>>>X'00' to X'7F' are single-byte codes
>>>X'81' to X'9F' are double-byte introducer
>>>X'A1' to X'DF' are single-byte codes
>>>X'E0' to X'FC' are double-byte introducer
>>>
>>>(No wonder the stripped out x'0D's are problematic... :-))
>>>
>>>I think the problem is arising because the CONTAINER for the data is a
>>>LINE
>>>SEQUENTIAL file, which happens to use x'0D' for a special purpose.
>>>
>>>The data is there, because when it is HEX encoded it arrives at the AIX
>>>machine correctly.
>>>
>>>If the data is converted back to the original, and the AIX machine is also
>>>running the Japanese code page, I believe everything will work correctly.
>>>
>>>It isn't the difference in OS that is the problem, it is the transport
>>>layer. (Rick's suggestion to HEX encode it solves this nicely.)
>>
>> I think the problem is that Micro Focus does not know the input file
>> contains DBCS codes.
>> It is treating the input as single-byte US-ASCII.
>
>It certainly looks that way.
>
>>You tell it about the codeset using the
>> codecomp program to create a codeset file and the MFCODESET environment
>> variable to tell
>> it to use that page at execution time.
>>
>> http://supportline.microfocus.com/supportline/documentation/books/sx22sp1/sx22indx.htm
>
>So, are you saying Robert, that if Taoxianfeng simply sets the MFCODESET EV
>to point at the Japanese code page, at the time he runs his MF COBOL
>program, it should work? Definitely worth a try...

Yes. Add compiler option DBCS, and use pic G instead of pic X. There is also an environment
variable LANG, but it appears to be for NLS (Latin alphabet).

The manual talks about calling _CODESET to translate EBCDIC DBCS to ASCII DBCS. I'm
guessing that involves quotes, commas and other punctuation characters. If embedded 0D and
others below x'20' go away, the need for EBCDIC-ASCII translation should be evident from
viewing the output file.
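
To make the IBM-932 byte ranges quoted above concrete, here is a rough sketch (not tested on Micro Focus) that walks a buffer and counts single-byte and double-byte characters. The program name, the data names and the sample bytes are all made up for illustration, and it assumes the native collating sequence compares bytes in plain binary order.

```cobol
       identification division.
       program-id. dbcsscan.
       data division.
       working-storage section.
      *> hypothetical sample: 'A', one double-byte character, 'B'
       01  sample-data     pic x(4) value X'41834142'.
       01  this-byte       pic x.
      *>   introducer ranges as quoted from IBM for IBM-932
           88  dbcs-introducer values X'81' thru X'9F'
                                      X'E0' thru X'FC'.
       01  pos             pic 9(4) comp.
       01  single-count    pic 9(4) value 0.
       01  double-count    pic 9(4) value 0.
       procedure division.
       main-para.
           move 1 to pos
           perform until pos > 4
               move sample-data (pos:1) to this-byte
               if dbcs-introducer
      *>           an introducer takes the following byte with it
                   add 1 to double-count
                   add 2 to pos
               else
                   add 1 to single-count
                   add 1 to pos
               end-if
           end-perform
           display 'single-byte characters: ' single-count
           display 'double-byte characters: ' double-count
           goback.
```

A scan like this only tells you where the character boundaries are; it does not translate anything.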

From: Robert on 31 Jul 2008 00:05

On Wed, 30 Jul 2008 17:37:20 +1200, "Pete Dashwood" <dashwood(a)removethis.enternet.co.nz>
wrote:

>Think some more about the Hex string. Each byte is represented by 2 hex
>symbols. If you had to, you could easily write a little COBOL routine that
>would give you the same byte in binary... Here's an example that is by no
>means definitive, but which I'm sure you can modify for your particular
>environment...
>
>*> interface
>01 two-bytes-in pic xx.
>01 one-byte-out pic x.
>
>*> reference data
>01 hc pic x(16) value '0123456789ABCDEF'.
>01 filler redefines hc.
> 12 hexchars pic x occurs 16
> indexed by hc-x1.
>01 hv usage comp.
> 12 x0 pic s9(4) value zero.
> 12 x1 pic s9(4) value 1.
> 12 x2 pic s9(4) value 2.
> 12 x3 pic s9(4) value 3.
> 12 x4 pic s9(4) value 4.
> 12 x5 pic s9(4) value 5.
> 12 x6 pic s9(4) value 6.
> 12 x7 pic s9(4) value 7.
> 12 x8 pic s9(4) value 8.
> 12 x9 pic s9(4) value 9.
> 12 xA pic s9(4) value 10.
> 12 xB pic s9(4) value 11.
> 12 xC pic s9(4) value 12.
> 12 xD pic s9(4) value 13.
> 12 xE pic s9(4) value 14.
> 12 xF pic s9(4) value 15.
>01 filler redefines hv.
> 12 hexvalues pic s9(4) comp occurs 16
> indexed by hv-x1.
>

>get-binary section.
>gb000.
> set hc-x1 to 1
> search hexchars
> at end
> *> the HEX, isn't... drastic action needed...not shown here
> continue
> when current-byte = hexchars (hc-x1)
> set hv-x1 to hc-x1 *> you might need to adjust this on MicroFocus
> move hexvalues (hv-x1) to bin-work
> end-search

There is a much simpler way to convert hex digits:

INSPECT current-byte CONVERTING
'0123456789ABCDEF' TO
X'000102030405060708090A0B0C0D0E0F'
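
As a minimal sketch of how that INSPECT could sit inside a complete hex-to-byte loop: the program name, the data names and the sample value are hypothetical, and ORD and CHAR are the standard intrinsic functions, which I am assuming your compiler supports. Note that the range check after the INSPECT cannot catch an input character that was already in the range X'00' to X'0F'.

```cobol
       identification division.
       program-id. hex2bin.
       data division.
       working-storage section.
      *> hypothetical names; hex-in stands for one exported HEX field
       01  hex-in          pic x(8) value '8341000D'.
       01  bytes-out       pic x(4).
       01  nibbles         pic x(2).
       01  byte-value      pic 9(3).
       01  in-pos          pic 9(4) comp.
       01  out-pos         pic 9(4) comp.
       procedure division.
       main-para.
           move 1 to out-pos
           perform varying in-pos from 1 by 2 until in-pos > 8
      *>       each hex digit becomes a byte holding 0 through 15
               move hex-in (in-pos:2) to nibbles
               inspect nibbles converting '0123456789ABCDEF'
                   to X'000102030405060708090A0B0C0D0E0F'
               if nibbles (1:1) > X'0F' or nibbles (2:1) > X'0F'
                   display 'not a hex digit near position ' in-pos
                   stop run
               end-if
      *>       ORD and CHAR are 1-based, hence the -1 and +1
               compute byte-value =
                   (function ord (nibbles (1:1)) - 1) * 16
                   + function ord (nibbles (2:1)) - 1
               move function char (byte-value + 1)
                   to bytes-out (out-pos:1)
               add 1 to out-pos
           end-perform
           if bytes-out = X'8341000D'
               display 'bytes rebuilt correctly'
           else
               display 'unexpected result'
           end-if
           goback.
```

The rebuilt bytes, including any X'0D', then need to be written to a file type that does not treat X'0D' specially, which is the line sequential problem discussed earlier in the thread.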
