From: Robert on
On Thu, 24 Jul 2008 07:43:07 -0600, Howard Brazee <howard(a)brazee.net> wrote:

>On Wed, 23 Jul 2008 21:45:21 -0500, Robert <no(a)e.mail> wrote:
>
>>Most professionally developed Cobol systems abandoned indexed files and switched to
>>databases, starting in the 1980s. The ones still using indexed files tend to be amateurs
>>and small shops, who stayed with indexed because databases were expensive. That reason
>>disappeared in the late 1990s when full-featured free databases such as PostgreSQL and
>>MySQL became available.
>
>I'd like to see stats if you have them. But I want some detail on
>what a "CoBOL system" is. It's really hard to compare a shop with a
>CoBOL program running on a few dozen PCs, and a mainframe shop with
>thousands of programs being run.
>
>The last shop I worked at that wasn't primarily database centered was
>in 1980, but I haven't yet worked in a shop that has abandoned every
>one of its indexed files.

I worked at 13 Unix shops in the last 10 years. Only one had any indexed files, and it was
a tiny shop with one manager/programmer.

>>>They are also considered to be Closed systems. Accessing
>>>them by external clients or customers would require creating
>>>new programs to get into the data with the desired formats.
>>
>>There are ODBC drivers for proprietary Cobol file systems that allow any SQL tool to
>>access the data. The drivers are not free.
>>http://www.datamystic.com/datapipe/odbc_vendors.html
>>
>>There are free programs to translate a Cobol indexed file to CSV (comma delimited), which
>>can easily be loaded to a database or spreadsheet. I posted the source for one here. It
>>uses a copybook defining the Cobol file.
>
>It's very easy to write such a program, but I've only written them for
>files to be downloaded into spreadsheets - whenever I extract for
>databases, I've either used undelimited data or gone direct to the
>database.

You're talking about programs custom written to convert one file. The program I posted
converts any/all files. It interprets the copybook at execution time.
From: Robert on
On Thu, 24 Jul 2008 15:17:26 -0700 (PDT), softWare design <sabraham(a)baxglobal.com> wrote:

>On Jul 23, 6:28 pm, "Pete Dashwood"
><dashw...(a)removethis.enternet.co.nz> wrote:
>>
>> I agree with you that the COBOL file system is "closed", inasmuch as you
>> need to write a COBOL (or other) program which knows the structure of it, in
>> order to access it. Certainly, some systems have ODBC drivers that permit
>> access but the basic problem is that, unlike a DBMS, the structure of the
>> data is not held in the file... (there is no metadata).
>>
>> This means that unless you can write a program or get one written, you can't
>> access it. (Compare this with an RDBMS, for example, where anyone who knows
>> SQL and has the right permissions can access the data.)
>>
>
>
>
>I understand that RDBMs offers greater visibility to the file
>contents, but how about performance considerations?
>
>Can a relational database file system slow down the Cobol
>application?

They can speed it up, because they can use parallel processing on the server side (even if
the server is the same box).

>Are there any performance bottlenecks when Cobol application
>handles large volume of data?

The bottleneck is that Cobol applications are single-threaded. To load large volumes with
Cobol we partition the database and run a Cobol process for each partition. I've run as
many as 1,000 parallel processes loading a single table.
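The mechanics, roughly, in Oracle-style syntax with invented table and partition names
(the real partitioning key depends on the data):

  CREATE TABLE sales_fact (
      sale_key    NUMBER,
      region_id   NUMBER,
      amount      NUMBER
  )
  PARTITION BY RANGE (region_id) (
      PARTITION p_east VALUES LESS THAN (100),
      PARTITION p_west VALUES LESS THAN (200),
      PARTITION p_rest VALUES LESS THAN (MAXVALUE)
  );

  -- Each Cobol loader process is handed one key range (region_id < 100,
  -- 100-199, and so on) and inserts only rows in that range, so the
  -- parallel processes never contend for the same partition.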

>Will Cobol application take much longer to run if I/O operations
>access the RDBMs file system as if they were ISAM files?

An RDBMS running set operations can be DRAMATICALLY faster than a Cobol or other
procedural language running individual row operations. I recently rewrote a Cobol-like
PL/SQL program into non-procedural straight SQL. Run time went from eight days running 80
parallel processes to a half hour running one (client) process.

Non-trivial database operations don't run on a single table, they join tables. Joining
with SQL is MUCH faster than reading with cursors and joining with procedural logic.
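To sketch the difference (table and column names are invented here, not the actual
program): instead of a cursor loop that fetches each order, looks up its customer and
inserts one summary row at a time, a single statement does the join, the aggregation and
the insert on the server:

  INSERT INTO order_summary (customer_id, customer_name, total_amount)
  SELECT c.customer_id,
         c.customer_name,
         SUM(o.amount)
  FROM   customers c
  JOIN   orders    o ON o.customer_id = c.customer_id
  GROUP  BY c.customer_id, c.customer_name;

The server is free to parallelize that join and aggregate; a row-at-a-time loop never
gives it the chance.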

If you MUST handle individual rows in Cobol, for instance loading a table, you want a SQL
command that operates on a large array of rows, say 1,000, because issuing an INSERT for
each row kills performance. Array operations do complicate error handling, though: if one
row fails to insert, say due to a foreign key constraint, the whole array fails and the
server doesn't tell you which row(s) are in error. In that case, you must fall back and
insert the rows individually. You can, and should, also fetch arrays of rows. If you think
you need a cursor, you're still thinking in Cobol.
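For the array insert described above, where the SQL dialect allows multi-row VALUES (with
embedded SQL the same effect comes from host arrays), a batch looks something like this,
names invented:

  INSERT INTO order_lines (order_id, line_no, qty)
  VALUES (1001, 1, 5),
         (1001, 2, 3),
         (1002, 1, 7);   -- ...one statement per batch of ~1,000 rows

  -- If the batch is rejected, say by a foreign key violation, re-issue
  -- those rows one INSERT at a time to find the offender(s):
  INSERT INTO order_lines (order_id, line_no, qty) VALUES (1001, 1, 5);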
From: Robert on
On Thu, 24 Jul 2008 12:37:33 -0700 (PDT), softWare design <sabraham(a)baxglobal.com> wrote:

>On Jul 24, 11:17 am, Howard Brazee <how...(a)brazee.net> wrote:
>>
>> This leads me to ask:
>> 1. What do you mean by columns in an index file?
>> 2. What is the relationship between being open or closed and being
>> able to add columns?
>>
>
>
>Suppose I want to change my data model by adding a column
>to a table (field to a file record). With a relational database
>I add the column and the job is done. Any program that needs
>to work with that new column can do so on the fly, and any existing
>programs that do not care about the change stay untouched.

You can get the same flexibility with indexed files by adding a 'data layer' program
between the application and the file(s). The application asks the layer for a 'logical
view' matching the columns, or the layout version, it was written for.

Such a layer allows you to redesign the file system or even port it to database without
touching the applications. You can, for example, split what used to be one horribly
unnormalized file into several.
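Once the data does end up in a database, the usual way to express that layer is a view:
programs written against 'version 1' keep selecting from the view while the underlying
tables get renormalized. A sketch, with invented names:

  CREATE VIEW customer_v1 AS
  SELECT c.customer_id,
         c.customer_name,
         a.street,
         a.city
  FROM   customers          c
  JOIN   customer_addresses a ON a.customer_id = c.customer_id;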

>The Cobol index file system is considered to be Closed simply
>because it requires knowledge of the file definition/structure,
>in order to access the stored data. An open file system should
>allow privileged users to access the data and generate queries
>on-the-fly in the desired format, and facilitate any
>column insertions without the need to write special programs.

Alternatively, you can use an ODBC driver to access the indexed file, which provides
everything on your wish list without touching the files.


From: Robert on
On Thu, 24 Jul 2008 12:37:33 -0700 (PDT), softWare design <sabraham(a)baxglobal.com> wrote:

>On Jul 24, 11:17 am, Howard Brazee <how...(a)brazee.net> wrote:
>>
>> This leads me to ask:
>> 1. What do you mean by columns in an index file?
>> 2. What is the relationship between being open or closed and being
>> able to add columns?
>>
>
>
>Suppose I want to change my data model by adding a column
>to a table (field to a file record). With a relational database
>I add the column and the job is done. Any program that needs
>to work with that new column can do so on the fly, and any existing
>programs that do not care about the change stay untouched.

Alternatively, leave the original file unchanged and put the new column(s) in another
file. Programs interested in the new data will have to join the two files, while old
programs need not be changed.
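Where the data is reachable through SQL (a database, or an ODBC driver over the indexed
files), the same pattern reads like this, with invented names; the extension table carries
the new column and is keyed the same way as the original:

  SELECT o.order_id,
         o.order_date,
         x.promo_code              -- the newly added "column"
  FROM   orders           o
  JOIN   orders_extension x ON x.order_id = o.order_id;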
From: Robert on
On Thu, 24 Jul 2008 12:17:37 -0600, Howard Brazee <howard(a)brazee.net> wrote:

>I'd argue that databases are, by design, more closed than files -
>because security and privacy needs have been integrated with their
>designs. Files (not "CoBOL files"), on the other hand, are just
>hunks of data open for anybody to use.

That's not true. Modern file systems have elaborate security controlling who can read,
update and even see the existence of files. What they don't have, and databases do, is
column-level security. With files you have to put protected columns in a separate file (or
files).
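In SQL that is a column-level grant. PostgreSQL and MySQL, for example, support the form
below (role, table and column names invented):

  GRANT SELECT (customer_id, customer_name) ON customers TO reporting_role;
  -- reporting_role can read those two columns but not, say, credit_card_no.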