From: glen herrmannsfeldt on
Richard Maine <nospam(a)see.signature> wrote:
(snip)

> I repeat my recommendation to use the facilities defined in the
> standard. If CPU time can be precisely defined and measured on a system,
> and the vendor's implementation of the CPU_TIME intrinsic doesn't get
> it, then I suggest submitting a bug report to the vendor. If you can get
> something adequate in some other way, then there is no reason the
> compiler vendor couldn't do the same.

In testing with gfortran/Scientific Linux, it seems that the increment
of CPU_TIME is 4 ms. That is a little better than the timing for some
Olympic ski events, but I wouldn't call it "precisely measured"
for computer time.
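
A quick sketch of one way to probe the granularity: spin until
CPU_TIME ticks over and record the smallest step (an illustration,
not a rigorous benchmark):

program cpu_step
  implicit none
  real :: t0, t1, step
  integer :: i

  step = huge(step)
  call cpu_time(t0)
  do i = 1, 100
     do
        call cpu_time(t1)
        if (t1 > t0) exit        ! spin until the clock ticks over
     end do
     step = min(step, t1 - t0)   ! smallest observed increment so far
     t0 = t1
  end do
  print *, 'smallest CPU_TIME increment (s):', step
end program cpu_step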

-- glen
From: Gordon Sande on
On 2010-03-10 13:17:59 -0400, Dave Allured <nospom(a)nospom.com> said:

> Gordon Sande wrote:
>>
>> On 2010-03-09 15:13:48 -0400, Arjan <arjan.van.dijk(a)rivm.nl> said:
>>
>>> Until now I have monitored the performance of my application by
>>> measuring the real time spent by my program and subtracting the
>>> value from the previous iteration from the latest estimate. This
>>> gives me the number of seconds per iteration of my process. I have
>>> only 1 CPU, so the available time is distributed over all
>>> processes. My current application uses a lot of CPU and produces
>>> only a tiny bit of output, so I/O time is not restrictive. How can
>>> I measure the net cpu-time spent by my program per iteration of my
>>> calculation, i.e. corrected for the fraction of CPU assigned to
>>> the process?
>>>
>>> A.
>>
>> Isn't "cpu_time" intended to give you the cpu time for your process?
>> This assumes that the system is capable of keeping track of the time used
>> by each process. Noted as new so must be F95.
>>
>> Real time is usually understood to be wall clock as given by "date_and_time".
>> There is also "system_clock" which gives the processor clock in processor units
>> but is still a real time clock. Both part of F90.
>
> My reading of the F95 standard for cpu_time is (1) it probably returns
> values related to real time, i.e. wall clock; and (2) the definition is
> so vague that you can't be sure what is really being measured. Arjan is
> seeking a measure of process time, not elapsed time or real time; so
> cpu_time is not adequate, especially on a busy computer. By "process
> time" I mean processor time spent in the user's actual code, not in
> unrelated code on a multiprocessing system. This is a very reasonable
> request.

Within the usual vagueness of standards, they sure go a long way out of
their way to make it sound like cpu time as measured by a reasonable
operating system for that process. It is hard to believe that a competent
vendor would get it wrong. If it were as bad as you seem to think, I
expect it would have gone through the interpretation process to make it
more useful.
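
For the question as originally asked, something along these lines
should do. This is only a sketch using the F95 CPU_TIME intrinsic;
DO_WORK is a stand-in for one iteration of the real calculation:

program per_iter
  implicit none
  real :: t_prev, t_now
  integer :: iter

  call cpu_time(t_prev)
  do iter = 1, 10
     call do_work()                ! stand-in for one real iteration
     call cpu_time(t_now)
     print *, 'iteration', iter, ': cpu seconds =', t_now - t_prev
     t_prev = t_now
  end do
contains
  subroutine do_work()
    integer :: i
    real :: x
    x = 0.0
    do i = 1, 5000000
       x = x + sqrt(real(i))
    end do
    if (x < 0.0) print *, x        ! keeps the loop from being optimized away
  end subroutine do_work
end program per_iter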

> You might look at the nonstandard etime function in gfortran. It
> advertises returning time in two components, "user time" and "system
> time", with neither term defined. If you are lucky then "user time" may
> have some relationship to process time on your system.
>
> http://gcc.gnu.org/onlinedocs/gfortran/ETIME.html
>
> Apparently other compilers also have etime, but with varying definitions
> of what is being measured. For more on this topic, including some
> ranting about the vagueness of cpu_time, see this:
>
> http://www.megasolutions.net/fortran/How-to-measure-the-time-a-subroutine-takes_-49933.aspx

> --Dave

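For what it's worth, a minimal sketch of ETIME in its gfortran
subroutine form (nonstandard, so check your compiler's docs; what
counts as "user" versus "system" time is whatever the OS reports):

program etime_demo
  implicit none
  real :: values(2), total
  real :: x
  integer :: i

  x = 0.0
  do i = 1, 10000000
     x = x + sin(real(i))          ! CPU-bound busywork to measure
  end do
  call etime(values, total)        ! gfortran extension
  print *, 'user   (s):', values(1)
  print *, 'system (s):', values(2)
  print *, 'total  (s):', total
  if (x < 0.0) print *, x          ! defeat dead-code elimination
end program etime_demo
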
From: Ron Shepard on
In article <hn8s12$7qs$1(a)naig.caltech.edu>,
glen herrmannsfeldt <gah(a)ugcs.caltech.edu> wrote:

> In testing with gfortran/Scientific Linux, it seems that the increment
> of CPU_TIME is 4 ms. That is a little better than the timing for some
> Olympic ski events, but I wouldn't call it "precisely measured"
> for computer time.

There is usually a tradeoff between resolution and the time it takes for
the counter to wrap around. On computers with a 32-bit clock register and
microsecond resolution, the wraparound occurs roughly every 36 minutes
(2**31 microseconds for a signed counter), which means that there is a
good chance for it to occur while timing a sequence of events.

I've tried to devise my own timer that looks at date_and_time along with
a high-resolution timer to return accurate times for arbitrarily long
events, but it is difficult because then the timer itself begins to take
a significant, and sometimes widely varying, number of cycles. The only
real solution is for the hardware to catch up to the demands and to use
a 64-bit clock counter. Any other solution seems to be problematic.
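
To show the flavor of the bookkeeping, here is a sketch of a
wraparound-tolerant wrapper over SYSTEM_CLOCK. ELAPSED is a made-up
helper name, and it assumes the counter wraps at most once between
successive calls (and that a clock exists, i.e. COUNT_RATE > 0):

module wrap_timer
  implicit none
  integer, parameter :: i8 = selected_int_kind(18)
  integer, save :: last = -1, rate = 0, cmax = 0
contains
  double precision function elapsed()
    ! Seconds since the previous call, folding in at most one wrap.
    integer :: now
    integer(i8) :: delta
    if (last < 0) call system_clock(last, rate, cmax)  ! first call: init
    call system_clock(now)
    delta = int(now, i8) - int(last, i8)
    if (delta < 0) delta = delta + int(cmax, i8) + 1_i8  ! wrapped once
    last = now
    elapsed = dble(delta) / dble(rate)
  end function elapsed
end module wrap_timer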

There is always some arbitrariness associated with CPU time. Should the
CPU time include the time it takes to swap your job in and out to disk?
There are good arguments both pro and con. In a parallel environment,
should the CPU time include idle time for one node that is waiting to
synchronize with another node? Again, there are good arguments both pro
and con.

$.02 -Ron Shepard
From: glen herrmannsfeldt on
In article <ron-shepard-C5CA23.14572810032010(a)news60.forteinc.com> Ron wrote:
> In article <hn8s12$7qs$1(a)naig.caltech.edu>, I wrote:

>> In testing with gfortran/Scientific Linux, it seems that the increment
>> of CPU_TIME is 4 ms. That is a little better than the timing for some
>> Olympic ski events, but I wouldn't call it "precisely measured"
>> for computer time.

> There is usually a tradeoff between resolution and the time it takes for
> the counter to wrap around. On computers with a 32-bit clock register and
> microsecond resolution, the wraparound occurs roughly every 36 minutes
> (2**31 microseconds for a signed counter), which means that there is a
> good chance for it to occur while timing a sequence of events.

Even worse, CPU_TIME returns a REAL, so about 24 bits on most systems.
I don't see anything about different KINDs, but maybe.

> I've tried to devise my own timer that looks at date_and_time along with
> a high-resolution timer to return accurate times for arbitrarily long
> events, but it is difficult because then the timer itself begins to take
> a significant, and sometimes widely varying, number of cycles. The only
> real solution is for the hardware to catch up to the demands and to use
> a 64-bit clock counter. Any other solution seems to be problematic.

IBM has used 64 bits since S/370, but it doesn't seem to have
caught on, other than for things like RDTSC. The S/370 clock
increments bit 51 at 1 MHz, leaving some bits for faster clocks
in the future. They also use the low bits to distinguish
processors on multi-processor systems.

> There is always some arbitrariness associated with CPU time. Should the
> CPU time include the time it takes to swap your job in and out to disk?
> There are good arguments both pro and con. In a parallel environment,
> should the CPU time include idle time for one node that is waiting to
> synchronize with another node? Again, there are good arguments both pro
> and con.

For profiling, usually I wouldn't want to include those, but for
other purposes, I might.

-- glen
From: Richard Maine on
glen herrmannsfeldt <gah(a)ugcs.caltech.edu> wrote:

> Even worse, CPU_TIME returns a REAL, so about 24 bits on most systems.
> I don't see anything about different KINDs, but maybe.

No "maybe" about it. You are right that it doesn't say anything about
kinds. But that means exactly the opposite of what you seem to think.
That means it is required to work with all real kinds rather than that
it is restricted to one of them.

That's the way that *ALL* the intrinsics are described. Were you under
the impression that TAN, for one example, only supported single
precision because it says nothing about kinds?
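
For instance, both of these calls are standard-conforming (a trivial
sketch):

program cpu_kinds
  implicit none
  real :: t4
  double precision :: t8

  call cpu_time(t4)   ! default real argument
  call cpu_time(t8)   ! double precision argument, equally legal
  print *, 't4 =', t4, '   t8 =', t8
end program cpu_kinds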

--
Richard Maine | Good judgment comes from experience;
email: last name at domain . net | experience comes from bad judgment.
domain: summertriangle | -- Mark Twain