From: Gordon Sande on
On 2010-03-08 12:33:20 -0400, nospam(a)see.signature (Richard Maine) said:

> Gordon Sande <g.sande(a)worldnet.att.net> wrote:
>
>> On 2010-03-08 11:18:19 -0400, Ron Shepard
>> <ron-shepard(a)NOSPAM.comcast.net> said:
>
>>> Most compilers put the initialized values in static storage in the
>>> executable file. For a large array this means that the exe file is
>>> large, and it will take a long time to load from disk. The
>>> executable statement will execute many thousands of times faster
>>> than the equivalent disk i/o, and without the static storage the exe
>>> file will be smaller and will load faster.
>>
>> The issue of load time is both technically true and good practice. But
>> the question is the load time cost compared to the time to type a "return"
>> or "enter". The comparison in 1970 for a heavily loaded mini may not be
>> the same as that of multicore desktop with a 7200 RPM disk connected with
>> a SATA controller. One can enter into long technical discussions about the
>> effects of RAM caching that will confuse everyone including the discussers!
>>
>> My concern is not that it is useful commentary between experts over a beer
>> but that it is misguided and misleading for beginners to treat it as a
>> priority concern.
>
> I dunno. I suspect it is still a real issue that can well hit even
> beginners - perhaps even more so than the experienced users. I have seen
> static initialization actually crash what was then a fair-sized server
> machine during compilation by running it out of disk space. That was
> circa mid 90's. Even when it didn't crash, there was a huge performance
> hit during compilation - as in differences between several tens of
> minutes versus probably about the same number of seconds. I'm not at all
> convinced that all that stuff is now irrelevant. I wonder about things
> like limits on exe file sizes in windows.

It depends on the definition of beginner. ;-)

My mid 90s big disk was 540MB (it replaced a 40MB one!) which is the size of
my current video buffer and much smaller than the last disk I bought (1000GB
on an iMac). I do not know how to classify the guy who dimensions his first
array at 400GB as I would use a variety of adjectives with other than positive
connotations. I think that when paging becomes an issue I would drop the
beginner description. Some of such folks may be experienced in other things
so "beginner" is a tough term to justify. In the case of the OP of this
thread I expect beginner is suitable although from other newsgroups I doubt
if the OP is a freshman doing an advanced topic.



From: glen herrmannsfeldt on
Ron Shepard <ron-shepard(a)nospam.comcast.net> wrote:
(snip)

> Most compilers put the initialized values in static storage in the
> executable file. For a large array this means that the exe file is
> large, and it will take a long time to load from disk. The
> executable statement will execute many thousands of times faster
> than the equivalent disk i/o, and without the static storage the exe
> file will be smaller and will load faster.

It seems that many compilers now don't put arrays of zeros into the file,
but they do put anything else there. I have gfortran on Scientific Linux 5.3.

      integer x(1000000)
      data x/1000000*0/
      print *,x(12345)
      end

Results in a small object file and small executable file.
As far as I can tell, the system initializes static storage
to zero, and the compiler knows that.

      integer x(1000000)
      data x(12346)/1/
      print *,x(12345)
      end

In this case, the object file isn't big, as it can initialize many
zeros in one line, but the executable (a.out) file is big.
The standard does not require it, but the other array elements are zero.
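
For comparison, here is a minimal sketch of the executable-statement
alternative Ron describes, applied to the same array. The program name and
the parameter n are made up for illustration. Since the standard leaves
elements not covered by a DATA statement undefined, assigning the whole
array at run time also avoids relying on the system zeroing static storage.

```fortran
program runtime_init
  implicit none
  integer, parameter :: n = 1000000   ! same size as the example above
  integer :: x(n)                     ! no DATA statement for this array

  x = 0                               ! executable whole-array assignment
  x(12346) = 1                        ! then set the one nonzero element
  print *, x(12345), x(12346)
end program runtime_init
```

With no DATA statement there should be no initialized image of the array
to store in a.out, though it is worth checking the file size with your own
compiler rather than taking that on faith.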

On the other hand, many systems use the executable file as backing
store for the virtual memory, with copy on write status. If it
isn't modified, the data is fetched from the file itself, instead
of from the swap file. On systems that do this, a running program
will crash if you recompile it (and write over the executable file)
while it is running.

-- glen

From: glen herrmannsfeldt on
Gordon Sande <g.sande(a)worldnet.att.net> wrote:
(snip)

> The issue of load time is both technically true and good practice. But
> the question is the load time cost compared to the time to type a "return"
> or "enter". The comparison in 1970 for a heavily loaded mini may not be
> the same as that of multicore desktop with a 7200 RPM disk connected with
> a SATA controller. One can enter into long technical discussions about the
> effects of RAM caching that will confuse everyone including the discussers!

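      integer x(100000000)
! Only one element is initialized explicitly.
      data x(12346)/1/
      j=0
! Sample the array with a stride of 3097 elements so that it is actually read.
      do i=1,100000000,3097
         j=j+x(i)
      enddo
      print *,x(12345),j
      end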

First, time how long it takes to compile, then to run.

On my multicore system with SATA controller it takes about 12 seconds
to compile (mostly the write time for the a.out file), and 0.983s
to run. (If you run it more than once, it comes out of the
system file cache, and runs faster.) Without the DO loop,
it doesn't actually read the array off disk, so it will run fast.
On gfortran/Scientific Linux the array is initialized to zero
except for one element.
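
A rough size check helps explain the compile time, on the assumption that
the whole initialized array is written into a.out. The sketch below (the
names are illustrative, not from the post) computes the array size from the
storage size of a default integer.

```fortran
program array_bytes
  use iso_fortran_env, only: int64
  implicit none
  integer(int64), parameter :: n = 100000000_int64  ! elements in x
  integer(int64) :: bytes

  ! storage_size returns the size in bits of its argument's type
  bytes = n * int(storage_size(0) / 8, int64)
  print *, 'bytes:', bytes
end program array_bytes
```

With 4-byte default integers, as is usual with gfortran, that comes to
400,000,000 bytes, so roughly 400 MB has to be written to disk during the
build.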

If instead I use:

data x/12345*0,1,99987654*0/

to initialize the array, then it generates one line of object file
for each array element, resulting in a much longer compile time.
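
To compare with the timings above, here is a sketch that does the same
initialization with executable statements and times it with the
system_clock intrinsic. The program name, the use of an allocatable array,
and the variable names are assumptions for illustration. It is not the
program that was timed above.

```fortran
program time_runtime_init
  use iso_fortran_env, only: int64
  implicit none
  integer, parameter :: n = 100000000
  integer, allocatable :: x(:)
  integer :: i, j
  integer(int64) :: t0, t1, rate

  call system_clock(count=t0, count_rate=rate)
  allocate(x(n))
  x = 0                     ! run-time initialization, nothing stored in a.out
  x(12346) = 1
  j = 0
  do i = 1, n, 3097         ! same strided sum as the DATA version
     j = j + x(i)
  end do
  call system_clock(count=t1)

  print *, x(12345), j
  print *, 'seconds:', real(t1 - t0) / real(rate)
end program time_runtime_init
```

Timing the compile as well (for instance with the shell's time command)
shows whether the build cost of the DATA version goes away.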

-- glen

From: glen herrmannsfeldt on
Gordon Sande <g.sande(a)worldnet.att.net> wrote:
(snip)

> It depends on the definition of beginner. ;-)

> My mid 90s big disk was 540MB (it replaced a 40MB one!)
> which is the size of my current video buffer and much smaller
> than the last disk I bought (1000GB on an iMac). I do not know
> how to classify the guy who dimensions his first array at 400GB
> as I would use a variety of adjectives with other than positive
> connotations. I think that when paging becomes an issue I would
> drop the beginner description. Some of such folks may be
> experienced in other things so "beginner" is a tough term
> to justify. In the case of the OP of this thread I expect beginner
> is suitable although from other newsgroups I doubt if the OP is
> a freshman doing an advanced topic.

There are many programs that can be written in the form

(read a line)
(process line)
(write out results)
(repeat above until done)

that beginners tend to write in the form

(read in whole file)
(process data)
(write out whole file)
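
The first form can be sketched in Fortran roughly as follows. This is only
an illustration: the fixed 1024-character buffer and the "processing"
(numbering each line) are assumptions, not anything from the thread.

```fortran
program line_by_line
  implicit none
  character(len=1024) :: line   ! assumed maximum line length
  integer :: ios, nlines

  nlines = 0
  do
     read (*, '(A)', iostat=ios) line              ! read a line
     if (ios /= 0) exit                            ! stop at end of file or error
     nlines = nlines + 1                           ! process line
     write (*, '(I0,": ",A)') nlines, trim(line)   ! write out results
  end do
end program line_by_line
```

Memory use stays the same however large the input is, which is the point
of writing it this way.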

As far as I remember, I didn't start out that way, but it
does seem to be way too common. Well, one of my earlier programs
printed really big letters on fanfold paper, banner style. It
filled up a 120 by 60 array for each letter, and then printed that,
instead of doing it line by line. That array had a fixed size,
not dependent on the size of the input file, though.

-- glen

From: Ron Shepard on
In article <1jf10rs.17m0giq1kc31qoN%nospam(a)see.signature>,
nospam(a)see.signature (Richard Maine) wrote:

> I dunno. I suspect it is still a real issue that can well hit even
> beginners - perhaps even more so than the experienced users. I have seen
> static initialization actually crash what was then a fair-sized server
> machine during compilation by running it out of disk space. That was
> circa mid 90's. Even when it didn't crash, there was a huge performance
> hit during compilation - as in differences between several tens of
> minutes versus probably about the same number of seconds. I'm not at all
> convinced that all that stuff is now irrelevant. I wonder about things
> like limits on exe file sizes in windows.

CPU and memory speeds have increased by about a factor of 1000 from the
70's (or even the early 80's), while disk I/O speed has increased by only a
factor of about 10 to 50. And if the file is stored remotely and
accessed over a network, you don't even get the full benefit from faster
disks. So when comparing what can be done in a sequence of executable
statements (using CPU and memory) to what can be done through I/O, I
would say that the tradeoffs are even more important now than they were
30 years ago, not less important.

I don't know when a programmer should begin to worry about these things.
Maybe a beginner should be concerned with other things first. But this,
or at least the general principle involved, is something that I would
think should be learned pretty quickly, particularly when large arrays
are being used.

$.02 -Ron Shepard