From: Richard Maine on
monir <monirg(a)mondenet.com> wrote:

> Do I have to specify the stack size in the command; if it is the
> problem ??

Nobody has even vaguely suggested that stack size might be related to
the problem. The reported symptoms don't hint at anything of the sort.
Glen did mention the possibility of things being "wrong on the stack." I
suppose you might have misinterpreted that; it has nothing to do with
stack size. As they say, "size isn't everything." :-)

See baf's comments. I occasionally claim limited clairvoyance, but not
as much as would be required to work with the data provided here.

--
Richard Maine | Good judgment comes from experience;
email: last name at domain . net | experience comes from bad judgment.
domain: summertriangle | -- Mark Twain
From: aerogeek on
Yes, as everyone else said, the declarations are important. Show them
and we might have some idea what the problem is.

In the meantime, can you just remove that pause statement and put a
dummy print statement on that line?

A while back I had a similar problem with the same curious effect: a
statement unrelated to the computation was changing the answers. If I
remember correctly, the culprit was a wrong declaration!!

Cheers
From: mecej4 on
On 3/27/2010 11:53 PM, baf wrote:
> monir wrote:
>> On Mar 27, 10:17 pm, baf <b...(a)nowhere.net> wrote:
>>> monir wrote:
>>>> On Mar 27, 9:28 pm, steve <kar...(a)comcast.net> wrote:
>>>>> On Mar 27, 6:22 pm, monir <mon...(a)mondenet.com> wrote:
>>>>>> On Mar 27, 8:43 pm, nos...(a)see.signature (Richard Maine) wrote:
>>>>>>> glen herrmannsfeldt <g...(a)ugcs.caltech.edu> wrote:
>>>>>>>> monir <mon...(a)mondenet.com> wrote:
>>>>>> There're no optional arguments in the abbreviated sample code.
>>>>>> Please replace the representative 3 call polin2() with the actual
>>>>>> calls:
>>>>> steve wrote:
>>>>> I think you need to change line 12 in polin2().
>>
>> The program is successfully compiled with the command:
>> ..>g95 -fbounds-check -ftrace=full -o Test1 Test1.for
>>
> Successful compilation means nothing since these options only provide a
> clue of problems at run-time. In addition, these options only find a few
> of the zillion possible problems.

Indeed. The following program is also "successfully compiled".

program test
integer i,j
j=0
i=j
print *,i/j
end

It does not follow that it should run "successfully".

In fact, language standards should not be expected to stipulate what may
happen when incorrect/non-compliant code is compiled and executed.

One of the most difficult classes of bugs to locate and fix is the kind
that surfaces only when debugging and/or error-checking are turned
_off_, and that disappears when running under a source debugger.

>> Do I have to specify the stack size in the command; if it is the
>> problem ??

That could be the problem; so could a number of other errors.

>>
>> Regards.
>> Monir
>
> I guess you are not willing to completely read my earlier post. Only a
> few of the contributors to this NG are clairvoyant, and even they would
> have trouble guessing what your code looks like. Without declarations of
> the procedures involved, nobody can help you.

This thread reminds one of the patient who refuses to talk to the
doctor yet expects a cure.

-- mecej4
From: kiwanuka on
On Mar 28, 10:27 am, aerogeek <sukhbinder.si...(a)gmail.com> wrote:
> Yes as everyone else said declaration is important. Show that and we
> might have some idea what the problem might be.
>
> In the meantime, can you just remove that pause statement and add a
> dummy print statement on that line.
>
> Long back I too had a similar problem and it has this similar curious
> effect. A statement not related to the code, was causing the wrong
> answers. And if I remember it correctly, the culprit was a wrong
> declaration!!
>
> Cheers

In the past I've replaced pause statements with write statements (to
avoid warnings at compile time about pause statements and also to get
information about possible problems in a log file) without a problem.
I'm sure that doesn't add to the discussion but it might help the OP
to consider the possibility that the pause statement is not the
problem here.

Robert
From: monir on
On Mar 28, 5:27 am, aerogeek <sukhbinder.si...(a)gmail.com> wrote:
> Yes as everyone else said declaration is important. Show that and we
> might have some idea what the problem might be.
>
> In the meantime, can you just remove that pause statement and add a
> dummy print statement on that line.
>
> Long back I too had a similar problem and it has this similar curious
> effect. A statement not related to the code, was causing the wrong
> answers. And if I remember it correctly, the culprit was a wrong
> declaration!!
>

Good news and bad news!

1) Adding a dummy Print statement and removing the Pause statement
didn't solve the problem. Remember the problem is not the Pause
statement but its absence.
Many other attempts have failed to identify the cause of the
problem. A trial-and-error approach has not worked in this case.

2) With:
- no command-line error messages
- no compiler/compilation error messages
- no warning error messages
- no run-time error messages
- no stack (overflow?) error message
- no math error messages
(there might be some overlapping in the above, but you get the point!)
or simply: no syntax errors, no compiler errors, no logical errors,
and no run-time error messages (assuming NaN doesn't necessarily
belong to any of the above):
Obviously the program is NOT error-free!!
But where would one look to identify the cause of the problem in a
~22,000-line code with over 80 routines ??
If the size of the exe file is any indication, it is ~ 1.5 MB.
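
One way to narrow the search in a code of this size, sketched here under
assumptions, is to test intermediate results for NaN right after each
suspect call and stop at the first hit. The module name nan_check, the
routine check_nan, and the default REAL type are all hypothetical; none of
them come from the actual program. The test relies on NaN being the only
value that compares unequal to itself, which aggressive floating-point
optimization options may defeat.

```fortran
! Hypothetical helper; not part of the original program.
! Free-form source (e.g. a .f90 file).
module nan_check
  implicit none
contains
  ! Stop at the first NaN seen, tagged with a caller-supplied label.
  subroutine check_nan(label, v)
    character(len=*), intent(in) :: label
    real, intent(in) :: v   ! assumption: default REAL
    ! A NaN is the only value that compares unequal to itself.
    if (v /= v) then
      print *, 'NaN detected at: ', label
      stop 1
    end if
  end subroutine check_nan
end module nan_check

program demo
  use nan_check
  implicit none
  real :: x
  x = 1.5                       ! stand-in for a value returned by a call
  call check_nan('after first Polin2 call', x)
  print *, ' x = ', x
end program demo
```

Placing such a call after each of the three Polin2() calls, and then deeper
inside the called routines, would show which routine first produces the NaN.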

The question remains:
What makes a program work fine when it is temporarily suspended by a
Pause statement, but return NaN without such an unrelated statement ??

3) Here's again the abbreviated sample code for easy reference:
(F77, g95)

PROGRAM main
  .....................
  call dCpZeros()
  .......................
End Program main
------------------------------------
SUBROUTINE dCpZeros()
  ......................
  do i=1, 9
    do j=1, 10
      do k=1, 30
        ......................
        call Polin2(a,b,c,d,x)
        call Polin2(b,c,d,e,y)
        call Polin2(c,d,e,f,z)
        .....................
        pause 'In Sub dCpZeros() 103'
        .....................
        print*,' x = ', x
        print*,' y = ', y
        print*,' z = ', z
      end do
    end do
  end do
  ...................
  Return
End Subroutine dCpZeros
------------------------------------
SUBROUTINE Polin2(w1, w2, w3, w4, val)
  implicit none !see item 5
  .....................
  .....................
  call UnStdy_Terms()
  .....................
  .....................
  Return
End Subroutine Polin2
------------------------------------

4) Having thoroughly re-checked all the declarations and argument
lists throughout {incl. routine Polin2()}, couldn't find any
inconsistencies or obscure code violations. All variables appear to
be correctly declared/dimensioned.

5) Subroutine Polin2() had ALL its variables declared, but had NO
"Implicit None" statement.
Not expecting much at this point, but for consistency with the other
routines I added the "Implicit None" statement, with NO ADDITIONAL
declarations or any changes. None. Just simply inserted "Implicit
None" at the top of Polin2().
Deleted the Pause statement from the calling routine, saved, re-
compiled and ran.
Program works fine!!!!!!!!

6) Just to make sure that I wasn't seeing things, I deleted (not just
commented out!) the "Implicit None" in Sub Polin2(), still NO Pause
statement in dCpZeros(), saved, re-compiled and ran.
Program fails, returning NaN.
That was the good news!

7) The bad news is that the above fix treats the symptoms and doesn't
identify the problem.
It makes absolutely no sense that having either of the two unrelated
statements ("Pause" in the calling routine dCpZeros(), and/or
"Implicit None" in the called routine polin2()) would produce the
correct results, while omitting both would return NaN.
In other words, one or both of these statements MUST be in for the
program to work correctly.
It is quite likely that the problem will pop up again at some point.

8) Based on my rather limited knowledge of Fortran, here's a thought
for you experts to critique.
As indicated earlier, the code (work-in-progress, ~22,000 lines and
~80 routines) is mostly in F77, but with some limited patches of F90,
e.g., unlabeled loops, vector, matrix and array operations, some of
the new intrinsic functions, one Contains and one explicit Interface,
but no modules, no dynamic arrays, no user-defined data types,
no Pointers, no ....
I've always had some suspicions about such programming practices, even
though the g95 compiler never complained. But it seems reasonable to
expect at some point (depending on the complexity of the code and the
extent of the mix) that there would be a conflict that wouldn't be
detected/resolved by the compiler, leading to possible confusion or
misinterpretation or memory disruption or whatever.

The "g95" compiler, or any other comparable compiler for that matter,
can't possibly detect and resolve each and every conflict that might
arise from mixed F77+F90 programming. Correct ??
Just a thought! ... you don't have to take it seriously if you don't
want to!
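
For what it's worth, one F90 feature that does let the compiler catch
argument mismatches is a module: a procedure placed in a module gets an
explicit interface, so every call is checked against the declared
arguments at compile time. A minimal sketch follows, assuming all five
arguments of Polin2() are default REAL scalars (the thread does not show
the real declarations). The module name interp and the routine body are
placeholders only.

```fortran
! Hypothetical sketch; argument types and the body are assumptions.
! Free-form source (e.g. a .f90 file).
module interp
  implicit none
contains
  subroutine polin2(w1, w2, w3, w4, val)
    real, intent(in)  :: w1, w2, w3, w4
    real, intent(out) :: val
    ! Placeholder computation: the real routine is not shown in the thread.
    val = 0.25 * (w1 + w2 + w3 + w4)
  end subroutine polin2
end module interp

program main
  use interp          ! makes the interface of polin2 visible here
  implicit none
  real :: a, b, c, d, x
  a = 1.0; b = 2.0; c = 3.0; d = 4.0
  ! A call with the wrong number, type, or rank of arguments would now
  ! be rejected at compile time instead of misbehaving at run time.
  call polin2(a, b, c, d, x)
  print *, ' x = ', x
end program main
```

With plain external F77-style routines the compiler generally sees each
call in isolation, which is why mismatched declarations can go unreported.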

9) Meanwhile, will continue testing the program (with "Implicit None"
in Sub polin2() and no Pause in Sub dCpZeros()).
Have already tested 14 runs so far, each run under a different
scenario invoking different sets of routines. A run takes ~45 min
on a 3.16 GHz machine.
So far so good!
(Sorry for the lengthy post)

Regards.
Monir