From: Steve Lionel on
On 4/29/2010 11:46 PM, steve wrote:

> In the end, I agree with you that Om needs to talk
> to Intel developers.

He is - in our user forum - but he has given us only a paraphrase of
the code and not a testable application, so thus far he has gotten much
the same advice as has been given here.

However, I took the paraphrase and managed to construct a testable
program with it and I do see a problem. The key seems to be the use of
array elements - scalar variables appear to work. I'll take this up with
the development team.

--
Steve Lionel
Developer Products Division
Intel Corporation
Nashua, NH

For email address, replace "invalid" with "com"

User communities for Intel Software Development Products
http://software.intel.com/en-us/forums/
Intel Software Development Products Support
http://software.intel.com/sites/support/
My Fortran blog
http://www.intel.com/software/drfortran
From: Ron Shepard on
In article <hregk7$aqi$1(a)news.eternal-september.org>,
mecej4 <mecej4_no_spam(a)operamail.com> wrote:

> Some months back, I had a program which did something of this sort:
>
> real, volatile :: t1,t2
> ...
> call cpu_time(t1)
> ... do some heavy processing in subroutines...
> call cpu_time(t2)
> ...
> write(*,*)t2-t1
>
> and found one compiler's optimizer being smart enough to observe that t1
> was not referenced in the intermediate code, and my program always ran
> in ZERO time, never mind the "volatile" attribute !

The t1 and t2 variables are local, so I don't think volatile would
have any effect on this. Maybe I'm wrong, but usually volatile
variables are in special common blocks or in shared memory or
something, not local variables.

I don't know why you would get ZERO time in this program. Perhaps
the code was so fast that it occurred within the same clock tick,
but you say "heavy processing" so that's probably not the case.

$.02 -Ron Shepard
From: glen herrmannsfeldt on
Jim Xia <jimxia(a)hotmail.com> wrote:

>> It maybe never says "clearly" what volatile means, but it does say

>> "NOTE 5.21
>> The Fortran processor should use the most recent definition of a
>> volatile object when a value is required. Likewise, it should make the
>> most recent Fortran definition available...."

> A note is just a note. You should know better :-)

>> and in normative text it says

>> "The VOLATILE attribute specifies that an object may be referenced,
>> defined, or become undefined, by means not specified by the program."

> Right. But what exactly does that sentence mean?

Another thought, given that this is regarding a loop: optimizing
compilers like to optimize loops and, especially, move code out of
loops when possible. Other parts of the loop with variables that are
not VOLATILE may get optimized in non-obvious ways, such that the
VOLATILE parts don't do what one would expect. It might be that all
variables in such loops should be VOLATILE to guard against such
effects.

>> I thought everybody understood that the processor should do the right
>> thing and never keep a volatile variable in a register for "very
>> long." Common value preserving optimizations are basically forbidden
>> in a "good faith bug free" implementation of volatile.

> I doubt it. I haven't done any research on this, but I suspect there
> are many so-called "bugs" in C or C++ compilers' handling of
> volatile. Anyway, Om should take this up with the Intel compiler
> developers to see what they say.

Well, one could look at the generated assembly code to see if
there is anything obvious.

>> If V is a volatile variable, then an expression like
>> V + 3.14 + V
>> should have 2 loads for V. What isn't specified is how many
>> fracto-seconds there are between the loads.

OK, but say it is:

W = V + 3.14 + V

and W is not VOLATILE. Is there any reason the compiler can't do
the two loads for V, store the result in W, and then use W many
times? Even move the whole statement outside a loop?

> That expression is completely different from "do while (i(2) == 0))".

>> There's no way to specify in syntax or semantics what volatile
>> is supposed to do. The standard doesn't even require a processor
>> to have memory; it can't specify the timing of memory references.
>> But, what could the intent be other than to require the processor
>> to refresh the values whenever they are referenced?

> Yes and no. Can you tell me exactly how many times i(2) is
> referenced in "do while(i(2) == 0))"?

More specifically, the task switch points are not specified.

This reminds me of a program that I was working with (but didn't
write) some years ago. It used three threads, which were mostly:

1) Read in data and do some preprocessing.
2) Process data through special purpose hardware.
3) Post process results and write them out.

The system did all that, and used buffers in between such that
the threads could run at different rates. In my case, I would pipe
the output through compress before writing it to disk.

The program would run out of memory (exceed 2GB) while processing
input data and generating results that were much smaller than 2GB,
so the first suspect was a memory leak. It turned out that there
was no memory leak.

The first thread would read in and preprocess data, buffering it
in memory as fast as it could. The second thread would buffer its
results in memory to be processed by the third thread.

In my case, but not in the cases used for testing, it would do
that fast enough to fill up memory with processed data.

Relying on the OS to schedule threads can result in the case where
each thread does exactly what is expected, but the result is not
what is expected.

>> I think it's like I/O. That also is loosely specified (a processor
>> doesn't have to do any, and if it does it's all processor dependent;
>> what could be less specified?) Yet I/O works pretty portably because
>> everybody understands it and wants it to work.

> That's an interesting analogy I hadn't thought of.
> I'll think about it :-)

Especially since I/O often runs interrupt driven in its own thread/task.

-- glen
From: mecej4 on
On 4/30/2010 12:06 PM, Ron Shepard wrote:
> In article<hregk7$aqi$1(a)news.eternal-september.org>,
> mecej4<mecej4_no_spam(a)operamail.com> wrote:
>
>> Some months back, I had a program which did something of this sort:
>>
>> real, volatile :: t1,t2
>> ...
>> call cpu_time(t1)
>> ... do some heavy processing in subroutines...
>> call cpu_time(t2)
>> ...
>> write(*,*)t2-t1
>>
>> and found one compiler's optimizer being smart enough to observe that t1
>> was not referenced in the intermediate code, and my program always ran
>> in ZERO time, never mind the "volatile" attribute !
>
> The t1 and t2 variables are local, so I don't think volatile would
> have any effect on this. Maybe I'm wrong, but usually volatile
> variables are in special common blocks or in shared memory or
> something, not local variables.

Why restrict "volatile" to changes made to the variable by other
routines or processes?

"Volatile" should cover every cause of a variable's value changing in
ways that the processor cannot always foresee, such as the ticking of
a clock, the calling of an RNG, I/O, unplugging the computer, etc.

In other words, "volatile" is one way for the programmer to tell the
compiler, "don't try to be smart; just run the code in the sequence
written".

>
> I don't know why you would get ZERO time in this program. Perhaps
> the code was so fast that it occurred within the same clock tick,
> but you say "heavy processing" so that's probably not the case.

Zero is the predictable result if the optimizer is used and "volatile"
is not, unless the compiler has the special knowledge that CPU_TIME is
a non-deterministic subroutine.

In the case that I cited, the compiler accepted VOLATILE without
complaint but treated it as a no-op. They fixed the problem in a later
version. Until they did, I kept the timing calls in a separate
subroutine which I compiled with -O0 .

> $.02 -Ron Shepard

-- mecej4
From: glen herrmannsfeldt on
mecej4 <mecej4_no_spam(a)operamail.com> wrote:
(big snip)

> Now, that's where a problem could arise, even though I may be
> doing some volatile nit-picking here.

> "Most recent" by whose clock? Some underground atomic clock,
> the system clock, or the individual process clock?

> Or a syllogistic sequence based on a "reasonable" interpretation
> of the source code?
(snip)

> Some months back, I had a program which did something of this sort:

> real, volatile :: t1,t2
> ...
> call cpu_time(t1)
> ... do some heavy processing in subroutines...
> call cpu_time(t2)
> ...
> write(*,*)t2-t1

> and found one compiler's optimizer being smart enough to
> observe that t1 was not referenced in the intermediate code,
> and my program always ran in ZERO time, never mind the
> "volatile" attribute !

I think this agrees with what I wrote in a previous post. Note
that the compiler (presumably) did separate loads of t1 and t2,
as instructed. If the variables in (do some heavy processing)
were not VOLATILE, then the compiler is free to move them,
and any access to them, around.

For similar reasons, I often see zero time reported by ftp for
transferring small files, even when the network and/or disk is slow
enough, relative to the timer resolution, that the transfer couldn't
actually have been that fast. At some point the data is moved to the
TCP buffers, and the time points are such that they don't include the
open/close processing.

-- glen