From: Richard Maine on
Bart Vandewoestyne <MyFirstName.MyLastName(a)telenet.be> wrote:

> On 2008-04-09, Richard Maine <nospam(a)see.signature> wrote:

> > What you want is a reopen on the same unit, changing the recl value.
....
> I'm afraid that I don't understand what you mean by 'a reopen on
> the same unit'. I tried this
>
> open(unit=5, action="read", status="old", position="rewind", recl=10000)
....
> Attempt to open a file that is already connected (open)

Hmm. That should be about right. But I never actually use the reopen
stuff myself. Changing things for preconnected units (like standard
input) is just about the only context where I could imagine doing so.
Let's see...(reads part of f2003 standard)...

Ah. I see a few potential problems here.

The smallest one is that (from f2003) "If the POSITION= specifier
appears in such an OPEN statement, the value shall not disagree with the
current position of the file." That makes your position="rewind"
potentially problematic. Best just to omit it as it doesn't actually do
anything useful. In a reopen, it doesn't cause the file to rewind; the
file had just better already be rewound or it makes the code illegal -
and not in a way that the compiler is required to diagnose, so darned if
I can see any use in specifying it.

I might omit the action='read' for the same kind of reason.

But the bigger problem is that "only the specifiers for changeable modes
(9.4.1) may have values different from those currently in effect." /me
mutters to self "I knew that", but I forgot that recl isn't a changeable
mode. So you can use the reopen trick for other things (such as comma
and blank interpretation modes), but not for recl. Sorry for the bum
steer.
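
For illustration, here is a minimal sketch of a reopen that changes only a changeable mode on the preconnected input unit. It assumes an f2003 compiler that provides input_unit from iso_fortran_env and that accepts a reopen of standard input at all; the program name and the choice of decimal="comma" are mine, not anything from the posts above.

```fortran
program reopen_changeable_mode
  use, intrinsic :: iso_fortran_env, only: input_unit
  implicit none
  real :: x
  integer :: ios

  ! Reopen of the preconnected unit: no FILE=, POSITION=, ACTION= or RECL=,
  ! only a changeable mode (here DECIMAL=).
  open(unit=input_unit, decimal="comma", iostat=ios)
  if (ios /= 0) then
     ! The processor is allowed to refuse; report and give up.
     write(*, *) "reopen refused, iostat =", ios
     stop
  end if

  ! With decimal="comma", input such as 3,14 is read as 3.14.
  read(input_unit, *, iostat=ios) x
  if (ios == 0) write(*, *) x
end program reopen_changeable_mode
```

Adding recl= to that open statement is exactly what the standard does not allow, since recl is not a changeable mode.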

> I also tried adding
>
> close(unit=5)
>
> before the open statement, but that also didn't help.

Yep. That would be one I might try next, as you then aren't doing a
reopen. But that gets deeper into problem areas. First, you would then
have to specify (with file=) what file you are connecting to unit 5. For
the reopen, you don't have to do that (and are better off if you don't),
as you are just keeping the same file already connected. But if it isn't
a reopen (because the unit isn't connected), then you need to specify
standard input by file name. The appropriate file name (if any) is
compiler and system dependent.
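
A hedged sketch of the close-then-open route follows. The file name "/dev/stdin" is an assumption that holds on many Unix-like systems only, and the compiler may refuse either the close or the open, so every step checks iostat. The array size and recl value are the ones from the original question.

```fortran
program reconnect_stdin
  implicit none
  integer, dimension(1000) :: a
  integer :: ios

  ! The processor may not allow unit 5 to be closed at all.
  close(unit=5, iostat=ios)
  if (ios /= 0) stop "could not close unit 5"

  ! ASSUMPTION: "/dev/stdin" names standard input on this system.
  ! This is not a reopen, so FILE= is required and RECL= is permitted.
  open(unit=5, file="/dev/stdin", action="read", status="old", &
       recl=10000, iostat=ios)
  if (ios /= 0) stop "could not reconnect unit 5"

  read(unit=5, fmt=*, iostat=ios) a
  if (ios /= 0) stop "read failed"
  write(*, *) a(1), a(1000)
end program reconnect_stdin
```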

Second, it is plausible that the compiler might not allow you to close
unit 5 at all. That is also compiler dependent. The standard allows such
restrictions. (In fact, the standard allows compilers to have pretty
much any kind of restrictions on I/O; there's a huge "loophole" to allow
pretty much anything in the way of restrictions.)

--
Richard Maine | Good judgement comes from experience;
email: last name at domain . net | experience comes from bad judgement.
domain: summertriangle | -- Mark Twain
From: Ron Shepard on
Is there some standard-conforming way to read a long standard input
record in blocks into a character string with advance='no'? You
would have to piece things together if a value gets broken into two
different reads. But I'm wondering if this is a way to allow you to
read arbitrarily long records in smaller pieces.

$.02 -Ron Shepard
From: Richard Maine on
Ron Shepard <ron-shepard(a)NOSPAM.comcast.net> wrote:

> Is there some standard-conforming way to read a long standard input
> record in blocks into a character string with advance='no'? You
> would have to piece things together if a value gets broken into two
> different reads. But I'm wondering if this is a way to allow you to
> read arbitrarily long records in smaller pieces.

No. Advance='no' has nothing to do with record length limits, which was
the subject at hand. Nonadvancing I/O is still record-based I/O, so any
limits on record sizes still apply. It is actually important to
understand that just because nonadvancing I/O lets you read a partial
record, that doesn't mean that it isn't record based. Advance='no' can
address the issue of not knowing how large to dimension an input array,
or how long to make an input string. But it has nothing to do with
record size limits.
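
To make concrete what advance='no' does address, here is a sketch that reads one record of unknown length into a deferred-length string, 256 characters at a time. The chunk size and all names are arbitrary choices for illustration, and it needs f2003 features (allocatable character length, is_iostat_eor). It does nothing about any record length limit the processor imposes.

```fortran
program read_line_in_chunks
  use, intrinsic :: iso_fortran_env, only: input_unit
  implicit none
  character(len=:), allocatable :: line
  character(len=256) :: chunk
  integer :: n, ios

  line = ""
  do
     read(input_unit, "(a)", advance="no", size=n, iostat=ios) chunk
     if (ios == 0) then
        ! The whole chunk was filled; more of the record may follow.
        line = line // chunk
     else if (is_iostat_eor(ios)) then
        ! End of record: only the first n characters are data.
        line = line // chunk(:n)
        exit
     else
        ! End of file or an error condition.
        exit
     end if
  end do

  write(*, "(a, i0)") "characters read: ", len(line)
end program read_line_in_chunks
```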

Now stream I/O avoids record length issues; it isn't record based. But
standard input isn't a stream file... at least not by default. You might
possibly be able to close it and then open it as a stream file, but
that's back to the issue of closing and opening standard input, with all
the possible limitations, plus the extra one of whether the system
allows it to be connected as a stream file.

By the way, is someone running into such record length limits with a
current compiler? There certainly used to be smallish default record
lengths with several compilers, but I was under the impression that
many/most/well-at-least-some of the vendors got tired of user complaints
on the subject and made the default large enough to not be much of an
issue (I'm thinking limits like 2 billionish bytes).

--
Richard Maine | Good judgement comes from experience;
email: last name at domain . net | experience comes from bad judgement.
domain: summertriangle | -- Mark Twain
From: James Giles on
Richard Maine wrote:
> Ron Shepard <ron-shepard(a)NOSPAM.comcast.net> wrote:
>
>> Is there some standard-conforming way to read a long standard input
>> record in blocks into a character string with advance='no'? You
>> would have to piece things together if a value gets broken into two
>> different reads. But I'm wondering if this is a way to allow you to
>> read arbitrarily long records in smaller pieces.
....
> Now stream I/O avoids record length issues; [...

Not really. The RECL specified on the OPEN statement is not
permitted for stream I/O, but that doesn't mean there's no limit.
The reason that formatted I/O even needs limits is because
Fortran's formatted I/O permits tab specifiers that tab backwards.
Hence the data may be processed a second time. That means the
I/O support library has to buffer the data. The limit is to allow
such buffering without consuming all your machine's memory.
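
A small sketch of the backward tab in question, using an internal file so it runs anywhere; the variable names are made up for the example. The TL3 edit descriptor moves back three positions, so the same three characters are processed twice, which is why the library has to keep them around.

```fortran
program tab_left_demo
  implicit none
  character(len=6) :: record = "123456"
  character(len=3) :: s
  integer :: i

  ! I3 consumes "123"; TL3 tabs back three positions; A3 reads "123" again.
  read(record, "(i3, tl3, a3)") i, s

  ! Prints: 123 123
  write(*, "(i0, 1x, a)") i, s
end program tab_left_demo
```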

Note that stream I/O also permits tabs. The library probably still
imposes a limit on how large the buffer can be. And therefore it
*may* still have a limit on record size.

Now, for neither stream nor non-stream I/O is there really a
reason for a limit to the whole record's size. The only real
limit is how much data can be processed by each I/O statement.
In a non-advancing I/O sequence, you aren't allowed to tab to
the left of the record's position at the beginning of each I/O
statement's execution. So, the buffer size only limits such partial
records. There is no practical reason both stream and non-stream
I/O can't do arbitrary record lengths by using non-advancing I/O.
And, as I've shown here, there *is* a practical reason that even
stream I/O will apply a limit to *partial* record lengths.

If a given implementation fails to allow arbitrary record lengths
for either stream or non-stream, that's an arbitrary choice by the
implementation.

Of course, for non-stream I/O you still have to open the file with
a large enough RECL to accommodate the longest record you expect
since that limit applies to the whole record and isn't really the
buffer size anyway. The standard should have been altered to
state that RECL applies to partial record lengths if non-advancing
I/O is used. It should have said so clear back when non-advancing
I/O was first introduced. :-(

> ... stream I/O ...] it isn't record based. [...

Actually it is. It applies the same record structure to files as
non-stream formatted I/O does. On input it passes the information
about where records end differently. On output it decides where
to end the output records differently. But it still treats formatted
files as sequences of records.

*Unformatted* stream I/O isn't record based. But *formatted*
stream I/O still is. Would that it weren't: you could use stream
I/O to directly process text files created on other systems than
you're running on. And, you could directly create files for use
on those other systems. (You can still do that, of course. You
just have to use *unformatted* streams to actually transfer the
data and use internal I/O to do the format conversion. Instead
of doing both in the same I/O statement.)
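
A rough sketch of that two-step approach, assuming the test.txt and the 1000-element array from the original question, and assuming inquire(size=) reports the size in characters (file storage units are usually bytes, but that is processor dependent). Line terminators are blanked out before the internal read so that the conversion does not depend on how a given library treats them.

```fortran
program stream_then_internal
  implicit none
  integer, dimension(1000) :: a
  character(len=:), allocatable :: buf
  integer :: nbytes, ios, k

  ! Step 1: transfer the raw data with unformatted stream access.
  open(unit=10, file="test.txt", access="stream", form="unformatted", &
       action="read", status="old", iostat=ios)
  if (ios /= 0) stop "cannot open test.txt"

  inquire(unit=10, size=nbytes)
  if (nbytes <= 0) stop "file size unknown or empty"

  allocate(character(len=nbytes) :: buf)
  read(unit=10, iostat=ios) buf
  close(unit=10)
  if (ios /= 0) stop "stream read failed"

  ! Replace LF and CR with blanks so they act as plain separators.
  do k = 1, nbytes
     if (buf(k:k) == achar(10) .or. buf(k:k) == achar(13)) buf(k:k) = " "
  end do

  ! Step 2: do the format conversion with internal list-directed I/O.
  read(buf, *, iostat=ios) a
  if (ios /= 0) stop "conversion failed (fewer than 1000 values?)"
  write(*, *) a(1), a(1000)
end program stream_then_internal
```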

--
J. Giles

"I conclude that there are two ways of constructing a software
design: One way is to make it so simple that there are obviously
no deficiencies and the other way is to make it so complicated
that there are no obvious deficiencies." -- C. A. R. Hoare


From: feenberg on
On Apr 9, 11:45 am, Bart Vandewoestyne
<MyFirstName.MyLastN...(a)telenet.be> wrote:
> A student of mine asked an interesting question today... and I
> didn't really know how to answer it, so here we go...
>
> Suppose you have a file test.txt containing lines of integer or
> floating point numbers. Suppose the number of columns of that file is
> larger than the default that your compiler can handle. Then the RECL
> specifier can help, something like
>
> integer, dimension(1000) :: A
>
> open(unit=10, file="test.txt", action="read", &
> status="old", position="rewind", recl=10000)
> read(unit=10, fmt=*) A
>
> But what if you want to read this same file using standard input? Say
> you want to do something like
>
> ./program < test.txt
>
> with the same test.txt that has more columns than allowed by the default
> of the compiler.
>
> Apparently, something like
>
> read(unit=*, fmt=*) A
>
> does not work here, and the RECL does not seem to exist for a read
> statement...
>
> What's the solution here?
>
> Thanks,
> Bart

Perhaps not all compilers handle this, but the f77 compiler
that comes standard in most Linux distributions does. I
just created a file of integers from 1 to 100000 and removed the
newlines with the "tr" command. If I run "wc" it confirms the lack of
line delimiters:

wc longline2.txt
0 100000 588895 longline2.txt

Now I run an F77 program to read that from standard in:

      dimension a(100000)
      read(*,*) a
      write(*,*) a(1),a(100000)
      stop
      end

And here is the result:

nber4.nber.org%> ./a.out <longline2.txt
1. 100000.

So it seems that with list-directed (*) input, it is possible
to read a very long line into a vector with at least one compiler. I
also think that the SUN and f2c compilers have the same result, but
haven't tested them just now. Or is the default maximum longer than
588,895 characters?

Daniel Feenberg