Another class of "DO10I=" bug [Fortran]


From: glen herrmannsfeldt on 28 Jan 2010 15:16

Carlie Coats <carlie(a)jyarborough.com> wrote:
> Arjen Markus wrote:
> [snip...]
>> That said, yes, array assignments can be visually misleading.
>> I prefer to emphasize them by putting empty lines around
>> them, making them stand out a bit. (But I am not sure I do
>> that consistently ;)).

> And what I'm saying is: that's not good enough. One can not
> visually tell the following apart; only (4) and mismatched-shape
> versions of (2) are syntax errors, and (3) is a dangerous implicit
> type conversion which requires (probably off-screen) declaration
> info to realize it's happening:

> (1) <scalar> = <scalar>
> (2) <array> = <array>
> (3) <array> = <scalar>
> (4) <scalar> = <array>

It is, at least, consistent with mathematical notation.

Well, mathematics often uses a different type font for arrays
(or at least matrices) such that it is visually identifiable.

There are a large number of cases in Fortran where you can make
mistakes that aren't visually identifiable. This isn't likely
at the top of the list. Also, it isn't type conversion but
rank conversion, though implicit type conversion (also called
mixed mode arithmetic) has been part of Fortran since before it
was standardized.
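
For concreteness, here is a minimal sketch of the four forms from the list above. The names (s, t, x, a, b) are made up for illustration and are not from the code under discussion. Form (3) changes rank, not type; the real-from-integer assignment is the mixed-mode (type) conversion.

```fortran
program assignment_forms
  implicit none
  integer :: s, t
  real    :: x
  integer :: a(2,3), b(2,3)

  t = 7
  s = t        ! (1) scalar = scalar
  b = 1        ! (3) again: every element of b becomes 1
  a = b        ! (2) array = array, shapes must conform
  a = s        ! (3) array = scalar, s is broadcast to all 6 elements
  ! s = a      ! (4) scalar = array, rejected at compile time
  x = s        ! implicit type conversion (integer to real), not rank

  ! Self-check: the broadcast filled the whole array.
  if (any(a /= 7)) error stop "broadcast did not fill the array"
  print *, a
  print *, x
end program assignment_forms
```

Nothing on the line "a = s" shows that a is an array; only the declaration does.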

In any case, it seems to be too late to change. (Like some
others that I have suggested.)

-- glen

From: Ian Harvey on 28 Jan 2010 16:36

On 29/01/2010 7:13 AM, Carlie Coats wrote:
....

> The most recent one looked somewhat like the following:
>
> MODULE M
> ...
> INTEGER, ALLOCATABLE :: KQ( :,: )
> ...
> SUBROUTINE S( M, N, <etc> ) ! 1500 lines later
> ...
> INTEGER K
> ...
> DO R = 1, M
> DO C = 1, N
> ...
> KQ = K !! should have been "KQ(C,R) = K"
> ...

I appreciate that this is only a discussion about possibilities lost,
but obviously "shape" is already a Fortran intrinsic that goes "the
other way" from your suggested role. Overloading it to go both ways
would almost certainly result in confusion. I'll use `TILE' below for
clarity; I guess you could nest SPREAD calls to do this as well.

I don't have any issue with the existing syntax. The lack of the
subscript list is reasonably obvious, and easy enough to search for once
you suspect that this type of problem exists. Having scalars be
conformable to any array also seems like a "natural" rule to me.

The ability to distribute scalars across arrays without additional (and
redundant) syntax is pretty useful, especially in the context of things
like arguments to elemental functions.

CALL elemental_op(array_r2, &
TILE(scalar,SIZE(array_r2,1),SIZE(array_r2,2)))

vs now:

CALL elemental_op(array_r2,scalar)

No thanks. Consider the cases when the shape of the destination array
is not known at compile time, and then one day you type this:

CALL elemental_op(array_rank2, &
TILE(scalar,SIZE(array_r2,1),SIZE(array_r2,1)))

I could see that causing some hard to diagnose runtime grief!
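
A small sketch of the comparison, under stated assumptions: elemental_op here is a hypothetical stand-in that just adds its second argument to its first, and the nested SPREAD calls play the role of the imaginary TILE. SPREAD(SOURCE, DIM, NCOPIES) is the standard intrinsic.

```fortran
module elemental_demo
  implicit none
contains
  ! Hypothetical stand-in for elemental_op: x = x + y, element by element.
  elemental subroutine elemental_op(x, y)
    integer, intent(inout) :: x
    integer, intent(in)    :: y
    x = x + y
  end subroutine elemental_op
end module elemental_demo

program scalar_conformance
  use elemental_demo
  implicit none
  integer :: array_r2(2,3), other(2,3), scalar

  scalar   = 5
  array_r2 = 0
  other    = 0

  ! Existing rule: a scalar is conformable with any array.
  call elemental_op(array_r2, scalar)

  ! Explicit tiling: build a 2x3 array of copies of scalar by hand.
  call elemental_op(other, &
       spread(spread(scalar, 1, size(other,1)), 2, size(other,2)))

  ! Self-check: both calls must give the same result.
  if (any(array_r2 /= other)) error stop "results differ"
  print *, array_r2
end program scalar_conformance
```

The explicit version only works while the two SIZE arguments stay in step with the destination array, which is exactly the kind of thing that gets mistyped.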

From: Paul van Delst on 28 Jan 2010 17:08

Carlie Coats wrote:
>
> The most recent one looked somewhat like the following:
>
> MODULE M
> ...
> INTEGER, ALLOCATABLE :: KQ( :,: )
> ...
> SUBROUTINE S( M, N, <etc> ) ! 1500 lines later
> ...
> INTEGER K
> ...
> DO R = 1, M
> DO C = 1, N
> ...
> KQ = K !! should have been "KQ(C,R) = K"
> ...
>
> Sloppy omission/error, just like "DO10I=1.10" was...

True - it is a sloppy error - but if the above snippet is really representative of the
code in question, the module source itself is pretty sloppy. E.g.
+ For a module-wide variable (in a module that contains 1000's of lines of code), "KQ" is a
terrible, terrible choice of variable name.
+ If the module variable is allocatable, why not make it like an "object", i.e. wrap it
inside a derived type (defined in its own module) and access it via its own methods.
etc..

Just based on the snippet above, I would expect the module in question to be full of bugs.
It wouldn't pass a review, and thus wouldn't be accepted for implementation until the
programmer who wrote it fixed it.
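
One possible shape for the "object" suggestion, as a sketch only: the type name index_table and its create/set/get procedures are invented for illustration, and it assumes a compiler with Fortran 2003 type-bound procedures. With the array hidden behind a private component, a bare "kq = k" no longer compiles, because no assignment from an integer to the derived type is defined.

```fortran
module index_table_mod
  implicit none
  private
  public :: index_table

  type :: index_table
     private
     integer, allocatable :: values(:,:)   ! hidden from users of the type
   contains
     procedure :: create
     procedure :: set
     procedure :: get
  end type index_table

contains

  subroutine create(self, ncols, nrows)
    class(index_table), intent(inout) :: self
    integer, intent(in) :: ncols, nrows
    if (allocated(self%values)) deallocate(self%values)
    allocate(self%values(ncols, nrows))
    self%values = 0
  end subroutine create

  subroutine set(self, c, r, k)
    class(index_table), intent(inout) :: self
    integer, intent(in) :: c, r, k
    self%values(c, r) = k          ! subscripts are mandatory here
  end subroutine set

  integer function get(self, c, r)
    class(index_table), intent(in) :: self
    integer, intent(in) :: c, r
    get = self%values(c, r)
  end function get

end module index_table_mod

program wrapper_demo
  use index_table_mod
  implicit none
  type(index_table) :: kq
  integer :: r, c

  call kq%create(3, 2)
  do r = 1, 2
     do c = 1, 3
        call kq%set(c, r, 10*r + c)
        ! kq = 10*r + c   ! would not compile: no such assignment exists
     end do
  end do

  ! Self-check on one element.
  if (kq%get(2, 1) /= 12) error stop "unexpected value"
  print *, kq%get(3, 2)
end program wrapper_demo
```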

cheers,

paulv

>
> -- Carlie
>
>
>

From: stevenb on 28 Jan 2010 19:02

On Jan 28, 5:07 pm, Gordon Sande <g.sa...(a)worldnet.att.net> wrote:
> Isn't this exactly what FTNCHEK type tools are intended to do?
> The problem is that FTNCHEK is for F77 and if it were for F90
> you would have to wade through a lot of advisories for OK things.

One could probably implement a FTNCHEK-like tool as a plugin for
gfortran. Just a thought. I don't think anyone has tried this already.

Ciao!
Steven

From: Carlie Coats on 31 Jan 2010 09:05

Paul van Delst wrote:
[snip...]
> I have to agree with you here. In my world, these sorts of
> problems are more due to lack of experience/know-how on
> the part of the programmer. What % of scientists/engineers/etc
> that write Fortran90/95/2003 code today have had formal
> training in a) Fortran90/95/2003 or b) software design/
> construction? I bet the magnitude of the latter category
> is larger than the first.
>
> One way I have tried to combat these errors from occurring
> is to encourage people to write short procedures rather than
> the more usual monolithic ones in which it is very easy to
> lose the context. But, it's pretty hard to break the
> "everything and the kitchen sink" type of programming
> habit (me included).

Gordon Sande wrote:
[snip...]
> Or being distracted after one has figured which array but
> before one has figured out exactly what subscript to apply...

Training and mis-training for scientists and engineers
*is* a big problem. Among meteorologists, for example,
it seems that many of them have been encouraged to get
loop-order vs subscript order exactly backwards. And many
other evil practices.
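
To spell out the loop-order point with a sketch (array name and sizes are arbitrary): Fortran stores arrays with the first subscript varying fastest, so the innermost loop should run over the first subscript.

```fortran
program loop_order
  implicit none
  integer, parameter :: ncols = 4, nrows = 3
  integer :: kq(ncols, nrows)
  integer :: c, r

  do r = 1, nrows          ! outer loop: last subscript
     do c = 1, ncols       ! inner loop: first subscript, contiguous in memory
        kq(c, r) = c + ncols*(r - 1)
     end do
  end do

  ! Self-check: in storage order the elements are 1, 2, ..., ncols*nrows,
  ! i.e. the loops above walked memory sequentially.
  if (any(reshape(kq, [ncols*nrows]) /= [(c, c = 1, ncols*nrows)])) &
       error stop "unexpected storage order"
  print *, kq
end program loop_order
```

Swapping the two DO statements gives the same values but strides through memory ncols elements at a time, which is the "backwards" habit.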

Distraction is also a major problem. Getting time uninterrupted
can be almost impossible for some people at some offices (I think
that *is* what happened in this last case). And their work shows it.
But the PHBs don't believe you when you complain.

So is complexity, and managing it properly.

For environmental models, the problem frequently is manipulating
the programmatic representation of a complex state. The way
I prefer to do this is to implement each sub-model as a large
MODULE containing its state as PRIVATE variables, CONTAINing
only PUBLIC routines necessary to input, manipulate, and output
that state (which may further call upon other PRIVATE routines).
For a land surface model, this can lead to 3000-line modules
mostly consisting of a number of subroutines.
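
A toy-sized sketch of that layout, with everything in it hypothetical (module name, state variable, and routines are invented; a real sub-model would be far larger):

```fortran
module land_surface
  implicit none
  private                                   ! everything hidden by default
  public :: init_state, step_state, mean_moisture

  real, allocatable :: soil_moisture(:,:)   ! the sub-model's private state

contains

  subroutine init_state(ncols, nrows, w0)
    integer, intent(in) :: ncols, nrows
    real,    intent(in) :: w0
    if (allocated(soil_moisture)) deallocate(soil_moisture)
    allocate(soil_moisture(ncols, nrows))
    soil_moisture = w0
  end subroutine init_state

  subroutine step_state(rain)
    real, intent(in) :: rain
    call apply_rain(rain)                   ! public routine calls private helper
  end subroutine step_state

  real function mean_moisture()
    mean_moisture = sum(soil_moisture) / real(size(soil_moisture))
  end function mean_moisture

  ! Private helper: not visible outside the module.
  subroutine apply_rain(rain)
    real, intent(in) :: rain
    soil_moisture = soil_moisture + rain
  end subroutine apply_rain

end module land_surface

program driver
  use land_surface
  implicit none

  call init_state(3, 2, 0.25)
  call step_state(0.5)

  ! Self-check: every cell is now 0.75, so the mean is too.
  if (abs(mean_moisture() - 0.75) > 1.0e-6) error stop "unexpected mean"
  print *, mean_moisture()
  ! soil_moisture = 0.0   ! would not compile here: the state is PRIVATE
end program driver
```

The driver can only touch the state through the three PUBLIC routines.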

This seems to me far better than the competing ideas:

(1) the whole model is a vast conspiracy to manipulate
that state, which is contained either in a set of COMMONs
or as PUBLIC variables in modules whose only purpose is
to host those variables (this is the most-common strategy,
as in MM5 or SMOKE);

(2) Everything is a subroutine argument (this holds second
place) (frankly, I don't find subroutines with 50 or more
arguments maintainable, as in the NOAH LSM);

(3) The entire model state is a derived-type variable,
with hundreds of fields, and which is INTENT(INOUT) for
all the subroutines (as in the solver part of WRF);

(4) The subroutines are small, but the call trees are
*very* deep: in order to understand the I/O structure
of WRF, you need to understand at one time 14 levels of
a 17-level call tree, for which two of the calls are
indirect, to a subroutine passed to its caller as an
argument: the entire I/O structure is hard-coded into
this call-tree.

Then again, the software-engineering literature suggests
that "moderate"-sized routines of typically 200-500 lines
minimize the bug-count.

FWIW -- Carlie
