From: Anonymous on
In article <uq8jf3pd3rq48eqio0hdtqo172nv2c16is(a)4ax.com>,
Robert <no(a)e.mail> wrote:
>On Tue, 25 Sep 2007 22:45:12 +0000 (UTC), docdwarf(a)panix.com () wrote:
>
>>In article <regif3d0b34nreavsckap09omqjhptnik8(a)4ax.com>,
>>Robert <no(a)e.mail> wrote:
>>>On Tue, 25 Sep 2007 09:25:04 +0000 (UTC), docdwarf(a)panix.com () wrote:
>>>
>
>>>>Now, Mr Wagner... is one to expect another dreary series of repetitions
>>>>about how mainframers who said that indices were faster than subscripts
>>>>were, in fact, right about something?
>>>
>>>I expected I-told-you-so from the mainframe camp.
>>
>>It may be interesting to see if you get one; my point - and pardon the
>>obscure manner of its making - was that you made a series of repetitions
>>which a demonstration has disproved and it may be interesting to see if an
>>equally lengthy series of repetitions follows... or if it just Goes Away
>>until you next get an idea about something... and begin another, similar
>>series of repetitions.
>
>We saw that subscript and index run at the same speed on three CPU
>families -- HP PA (SuperDome), DEC Alpha (Cray) and Richard's undisclosed
>machine, possibly Intel.

I can barely speak for myself, Mr Wagner, let alone some 'we'... but I
recall seeing post after post where you indicated, rather pointedly, that
the speed superiority of indices over subscripts was something maintained
by mainframers and was, according to your test, an obsolete belief.

Results were then posted, purporting to be from a mainframe run, which
appeared to verify this obsolete belief.

>I am confident we'd see the same on Intel, PowerPC (pseries, iseries, Mac)
>and SPARC, based on tests I ran a few years ago. Thus the generalization.
>I was surprised to see zSeries did not follow the pattern of the others.

A wonderful world it is, Mr Wagner... and perhaps this inconsistency of
performance might work itself into your own consistency of performance.
It might be an interesting exercise, saying, in the future, 'this-and-that
is quite obviously the case... but remember, when I said that-and-this was
the case I was, quite soundly and publicly, shown an example to the
contrary.'

>
>My previous idea, that memory alignment no longer matters, turned out to
>be wrong. It does matter on modern RISC machines.
>
>There's a good chance I'll get another idea.

That's to be hoped for... and even more so that it will be shaped by
one's previous ideas, both proven right *and* proven wrong.

DD

From: Anonymous on
In article <5lu49rFa5hnvU1(a)mid.individual.net>,
Pete Dashwood <dashwood(a)removethis.enternet.co.nz> wrote:

[snip]

>A few days ago I
>was running a test on a P4 notebook that had to create a couple of million
>rows on an ACCESS database.

Why, Mr Dashwood... how interesting! Keep at it, you'll be up to sixty
million and change in no time!

DD
From: Judson McClendon on
"Pete Dashwood" <dashwood(a)removethis.enternet.co.nz> wrote:
>
> It is things like this that make me wonder why we even bother about
> performance and have heated discussions about things like indexes and
> subscripts, when the technology is advancing rapidly enough to simply
> take care of it.

Consider this. If Microsoft had put performance at a premium, Windows
would boot in 1 second, you could start any Office application and it
would be ready for input in the blink of an eye, and your Access test would
have run in a few seconds. How many thousand man-years have been spent
cumulatively all over the planet waiting on these things? :-)
--
Judson McClendon judmc(a)sunvaley0.com (remove zero)
Sun Valley Systems http://sunvaley.com
"For God so loved the world that He gave His only begotten Son, that
whoever believes in Him should not perish but have everlasting life."


From: Roger While on
I really, really tried to keep away from this subject but ...
One of the problems with the speed2 prog is the
attempt to deduce the perform cost.
Now OC produces exactly the C code that reflects
the statements eg.
/* speed2.cob:63: PERFORM */
{
  for (n0 = ((int)COB_BSWAP_32(*(int *)(b_18 + 30))); n0 > 0; n0--)
    {
      {
        /* speed2.cob:64: EXIT */
        {
          goto l_5;
        }
      }

      /* EXIT PERFORM CYCLE 5: */
      l_5:;
    }
}

BUT gcc (in current versions) is far more
clever and deletes the whole thing :-)
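
A minimal C sketch of why that happens, and of one common way to keep a timing loop alive. The function names are made up for illustration and this is not OC output. The first loop has no observable effect, so an optimizing compiler is allowed to drop it; the second writes to a volatile object on every pass, which the compiler has to preserve.

```c
#include <stdint.h>

/* Hypothetical stand-in for the generated PERFORM loop: the body does
   nothing, so an optimizing compiler may remove the whole loop. */
void spin_empty(int32_t n)
{
    for (int32_t i = n; i > 0; i--) {
        /* empty body, like the EXIT-only PERFORM */
    }
}

/* Same loop, but each iteration writes to a volatile object. Volatile
   accesses count as observable behaviour, so they must be performed. */
int32_t spin_observed(int32_t n)
{
    volatile int32_t sink = 0;
    for (int32_t i = n; i > 0; i--) {
        sink = sink + 1;
    }
    return sink;
}
```

The revised test program takes the other route: it puts real work (a MOVE from the table) inside each loop instead of trying to time an empty PERFORM.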

Revised test prog -
(This should be compatible with most compilers)

identification division.
program-id. speed5.
data division.
working-storage section.
01  comp5-number   comp-5 pic s9(09).
01  s-subscript    binary pic s9(09).
01  repeat-factor  value 900000000 comp-5 pic s9(09).
01  test-byte      pic x(01).

01  misaligned-area.
    05  array-element occurs 4096 indexed x-index.
        10  misaligned-number     comp-5 pic s9(09).
        10  to-cause-misalignment pic x(01).
    05  byte-element occurs 4096 indexed x-index-1 pic x.

procedure division.

    initialize misaligned-area

    display "Start prog " function current-date
    set x-index to 1000
    display "Index start " function current-date
    perform repeat-factor times
        if x-index = 1000
            set x-index up by 1
        else
            set x-index down by 1
        end-if
        move array-element (x-index) to test-byte
    end-perform
    display "Index end " function current-date

    move 1000 to s-subscript
    display "COMP start " function current-date
    perform repeat-factor times
        if s-subscript = 1000
            add 1 to s-subscript
        else
            subtract 1 from s-subscript
        end-if
        move array-element (s-subscript) to test-byte
    end-perform
    display "COMP end " function current-date

    move 1000 to comp5-number
    display "COMP-5 start " function current-date
    perform repeat-factor times
        if comp5-number = 1000
            add 1 to comp5-number
        else
            subtract 1 from comp5-number
        end-if
        move array-element (comp5-number) to test-byte
    end-perform
    display "COMP-5 end " function current-date

    set x-index-1 to 1000
    display "Index start " function current-date
    perform repeat-factor times
        if x-index-1 = 1000
            set x-index-1 up by 1
        else
            set x-index-1 down by 1
        end-if
        move byte-element (x-index-1) to test-byte
    end-perform
    display "Index end " function current-date

    move 1000 to s-subscript
    display "COMP start " function current-date
    perform repeat-factor times
        if s-subscript = 1000
            add 1 to s-subscript
        else
            subtract 1 from s-subscript
        end-if
        move byte-element (s-subscript) to test-byte
    end-perform
    display "COMP end " function current-date

    move 1000 to comp5-number
    display "COMP-5 start " function current-date
    perform repeat-factor times
        if comp5-number = 1000
            add 1 to comp5-number
        else
            subtract 1 from comp5-number
        end-if
        move byte-element (comp5-number) to test-byte
    end-perform
    display "COMP-5 end " function current-date

    stop run.

Note that the repeat count is pushed up, otherwise the results are
statistically meaningless. Tests repeated 5 times with +/- 1/100 second
difference.

Results from Linux boxen (in single-user mode)
(As all benchmarks should be done on 'nix systems)
(32 bit is P4 Prescott at 3.2 GHz)
(64 bit is P4

MF SE 2.2 (Linux x86 32 bit)
cob -u -O -C notrunc -C sourceformat=free speed5.cob
cobrun speed5
Start prog 2007092612363397+0200
Index start 2007092612363397+0200
Index end 2007092612363681+0200
COMP start 2007092612363681+0200
COMP end 2007092612364047+0200
COMP-5 start 2007092612364047+0200
COMP-5 end 2007092612364361+0200
Index start 2007092612364361+0200
Index end 2007092612364672+0200
COMP start 2007092612364672+0200
COMP end 2007092612365034+0200
COMP-5 start 2007092612365034+0200
COMP-5 end 2007092612365357+0200

OC 0.33 current -
cobc -x -O2 -std=mf -free speed5.cob
../speed5
Start prog 2007092612311407+0200
Index start 2007092612311407+0200
Index end 2007092612311690+0200
COMP start 2007092612311690+0200
COMP end 2007092612312044+0200
COMP-5 start 2007092612312044+0200
COMP-5 end 2007092612312326+0200
Index start 2007092612312326+0200
Index end 2007092612312609+0200
COMP start 2007092612312609+0200
COMP end 2007092612312963+0200
COMP-5 start 2007092612312963+0200
COMP-5 end 2007092612313246+0200

OC 0.33 current on Linux x86_64 (64 bit)
cobc -x -O2 -std=mf -free speed5.cob
../speed5
Start prog 2007092612285455+0200
Index start 2007092612285455+0200
Index end 2007092612285602+0200
COMP start 2007092612285602+0200
COMP end 2007092612285855+0200
COMP-5 start 2007092612285855+0200
COMP-5 end 2007092612290004+0200
Index start 2007092612290004+0200
Index end 2007092612290135+0200
COMP start 2007092612290135+0200
COMP end 2007092612290366+0200
COMP-5 start 2007092612290366+0200
COMP-5 end 2007092612290497+0200
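
To turn the start/end pairs above into elapsed times, the stamps have to be read as YYYYMMDDHHMMSScc followed by the UTC offset, which is the layout FUNCTION CURRENT-DATE returns. A small C sketch of that arithmetic follows; the helper names are invented here, and it assumes both stamps fall on the same day with the same offset.

```c
#include <stdio.h>

/* Convert the time-of-day part of a CURRENT-DATE string
   (YYYYMMDDHHMMSScc+hhmm) to hundredths of a second since midnight.
   Returns -1 if the string cannot be parsed. */
long hundredths_since_midnight(const char *stamp)
{
    int hh, mm, ss, cc;
    /* skip the 8 date characters, then read four 2-digit fields */
    if (sscanf(stamp, "%*8c%2d%2d%2d%2d", &hh, &mm, &ss, &cc) != 4)
        return -1;
    return ((hh * 60L + mm) * 60L + ss) * 100L + cc;
}

/* Elapsed hundredths between two stamps taken on the same day
   (assumption: no midnight rollover, same UTC offset). */
long elapsed_hundredths(const char *start, const char *end)
{
    return hundredths_since_midnight(end) - hundredths_since_midnight(start);
}
```

For the first MF index loop, elapsed_hundredths("2007092612363397+0200", "2007092612363681+0200") gives 284, i.e. 2.84 seconds; the first index loop on the 64 bit box comes out at 147, i.e. 1.47 seconds.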

Now, as to all that has been said in this thread, I have the
following comments -
COMP (aka BINARY) is stored as big-endian by all
compilers these days.
Therefore there is a penalty on little-endian machines
(or better the OS/firmware for eg. bi-endian) to
byte-swap, operate and re-byteswap results.
This, of course, affects eg. x86(_64).
However, see below
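
A rough C sketch of that swap, operate, swap-back sequence, set against the native-order case. It assumes a little-endian host holding a big-endian counter; the function names are illustrative and are not what any particular compiler emits.

```c
#include <stdint.h>

/* Swap the byte order of a 32-bit value (portable shift-and-mask form). */
uint32_t bswap32(uint32_t v)
{
    return (v >> 24) | ((v >> 8) & 0x0000FF00u) |
           ((v << 8) & 0x00FF0000u) | (v << 24);
}

/* Add 1 to a counter held in big-endian storage on a little-endian host:
   swap to native order, operate, swap back. */
uint32_t add_one_big_endian(uint32_t stored)
{
    uint32_t native = bswap32(stored);
    native += 1;
    return bswap32(native);
}

/* Same operation on a counter kept in native order (the COMP-5 case). */
uint32_t add_one_native(uint32_t value)
{
    return value + 1;
}
```

The two extra swaps per operation are the penalty in question; how much they cost in practice depends on the compiler and CPU, so the timings above remain the real evidence.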

Alignment -
There are in fact not that many alignment tolerant machines out there.
Intel x86(_64) and Power PC are known. (The Itanium is not)
This means that any reference to a COMP/COMP-5 item must
be moved to an intermediate area unless it can be proved at compile
time that it is appropriately aligned. (eg. at 01 level)
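
In C terms, the "move to an intermediate area" looks roughly like the sketch below. The helper names are hypothetical; the point is only that the bytes are copied into a properly aligned temporary rather than read through a cast pointer.

```c
#include <stdint.h>
#include <string.h>

/* Read a 32-bit integer from an address that may not be 4-byte aligned.
   Copying through an aligned temporary is safe on strict-alignment CPUs;
   dereferencing a cast pointer directly would not be. */
int32_t load_i32_unaligned(const unsigned char *p)
{
    int32_t tmp;
    memcpy(&tmp, p, sizeof tmp);
    return tmp;
}

/* Write the value back the same way. */
void store_i32_unaligned(unsigned char *p, int32_t v)
{
    memcpy(p, &v, sizeof v);
}
```

In the test program each array-element is 5 bytes long, so from the second element onward most misaligned-number fields start at an address that is not a multiple of 4, which is exactly the case this copy covers.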

So we have to look at the intersection of the above two attributes.
Generally speaking, for performance, (other than INDEX)
one should use COMP-5 (aka BINARY-LONG SIGNED/UNSIGNED)
for subscripts/counters etc. and define them at the 01 level.

Not only that, a particular compiler implementation has its
own INDEX definition which is somewhat difficult to ascertain.
(And which is not necessarily a COMP-5 item)

Roger



From: Pete Dashwood on


"Arnold Trembley" <arnold.trembley(a)worldnet.att.net> wrote in message
news:zjmKi.137851$ax1.11998(a)bgtnsc05-news.ops.worldnet.att.net...
> Pete Dashwood wrote:
>> (snip) Here are the results of "Speed2" from a genuine Intel Celeron Core
>> 2 Duo Vaio AR250G notebook with 2 GB of main memory, running under
>> Windows XP with SP2 applied, using your code (with the following
>> amendments: all asterisks and comments removed, exit perform cycle
>> removed), compiled with no options other than the defaults (which
>> includes "Optimize"), with the Fujitsu NetCOBOL version 6 compiler,
>> compiled to .EXE:
>>
>> Null test 1
>> Index 3
>> Subscript 25
>> Subscript comp-5 3
>> Index 1 3
>> Subscript 1 22
>> Subscript 1 comp-5 3
>>
>> As you can see, indexing is between 7 and 8 times more efficient than
>> subscripting, unless you use optimized subscripts, in this environment.
>
> Here are the results of "Speed2" using a 2.60 GHz Pentium 4 with 512 MB of
> main memory, running under Windows XP with SP2 applied, using Robert's
> code with EXIT PERFORM CYCLE commented out, compiled with a 1990 education
> version of Realia COBOL (equivalent to Realia 3):
>
> Null test 5
> Index 2
> Subscript 8
> Subscript comp-5 8
> Index 1 2
> Subscript 1 7
> Subscript 1 comp-5 7

That is SOOO cool!

Obviously, generated code makes all the difference. Here's code from 2
compilers running in the same OS Environment, yet look at the figures for
subscripts; the P4 creams the Core 2, although the Core 2 is theoretically
faster. In fact, the P4 is faster on everything except the null test and
the comp-5 subscripts :-) And both systems are way faster than IBM
mainframes. (That still hasn't quite
sunk in yet; after working on mainframes for decades it is hard for me to
realize that a notebook costing < .01% of what a mainframe costs, could be
orders of magnitude faster...)

Again, to me at least, this just completely confirms that it is not possible
to make meaningful statements about performance unless you run actual tests.

Pete.
--
"I used to write COBOL...now I can do anything."