From: nmm1 on
In article <jwvhbn7bkri.fsf-monnier+comp.arch(a)gnu.org>,
Stefan Monnier <monnier(a)iro.umontreal.ca> wrote:
>>> 2.9: Register rotation, someone needs to be locked in a rubber room. ;)
>> Yes and no. They can be VERY effective, as on the Hitachi SR2201.
>
>How did they work?

Floating-point only, software-controlled, bypassing the cache.
It was called pseudovectorisation, which describes it very well.
Good, vectorisable, Fortran code ran like the clappers, as the
technique almost completely eliminated memory latency. To achieve
that, it dropped (most) support of IEEE denorms etc., and you had
a heavyweight switch between vector and scalar modes.
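
For readers who have not met the idea, here is a minimal sketch of what register rotation buys you in a software-pipelined loop. It is a generic toy model, not the SR2201's actual instruction set (which is not described in this thread); the names `RotatingRegs` and `pipelined_sum` and the `latency` parameter are made up for illustration. Each "load" is issued `latency` iterations before its value is consumed, and the rotation renames registers so the loop body never changes.

```python
class RotatingRegs:
    """Toy rotating register file (hypothetical, for illustration only).

    Logical register r maps to physical slot (base + r) % size.
    """

    def __init__(self, size):
        self.phys = [None] * size
        self.base = 0

    def read(self, r):
        return self.phys[(self.base + r) % len(self.phys)]

    def write(self, r, value):
        self.phys[(self.base + r) % len(self.phys)] = value

    def rotate(self):
        # After a rotation, the value that was in logical r is seen as logical r + 1.
        self.base = (self.base - 1) % len(self.phys)


def pipelined_sum(data, latency=3):
    """Sum `data`, issuing each load `latency` iterations before its use."""
    regs = RotatingRegs(latency + 1)
    n = len(data)
    total = 0.0
    # The extra `latency` iterations drain the loads still in flight.
    for i in range(n + latency):
        if i < n:
            regs.write(0, data[i])        # "load" element i into logical r0
        if i >= latency:
            total += regs.read(latency)   # use the element loaded `latency` iterations ago
        regs.rotate()
    return total
```

The point of the sketch is that the same two register names (r0 for the load, r`latency` for the use) serve every iteration, so no unrolling or register copying is needed to keep several loads in flight.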

A slightly different form was used on the SR8000, but I have no
personal experience of that. While the technique could have been
extended to other uses, it could not have been taken very far.
Similarly, older systems that used register rotation for procedure
calls had to impose quite a lot of restrictions that would fit badly
with some modern languages.

I think that it could be used on a general-purpose CPU, but ONLY
if you designed the architecture round it - not, as on the Itanic,
bolting it on together with every knob, frob, bell and whistle
that you could find in the lumber room.


Regards,
Nick Maclaren.
From: Robert Myers on
On Apr 18, 11:25 pm, Brett Davis <gg...(a)yahoo.com> wrote:

> It is my opinion that Itanic is a disaster at any speed. ;)

Andy seemed to express his opinion that there were too many opinions
to begin with.

If you could design-by-opinion, I'm sure that development would be
much less expensive.

Although Andy has not mentioned it, I suspect that the compiler-that-
never-really-arrived played a significant role in keeping the design
process a battle of opinions for a very long time.

Intel managed to keep Itanium at or near the lead in a number of SPEC
benchmarks, and I assume that it was compiler tuning that allowed them
to do that.

Intel's success at that enterprise, I'm sure, left end users puzzled
as to why the chip never looked as good in their applications as it did
in the benchmarks.

Internally, you could say, "See. The compiler *can* do it, if only in
a limited number of cases."

Burning watts at runtime to schedule and virtually mandating clever
and sometimes obscure hand-coding in order to achieve acceptable
performance are both inferior to having a compiler that can schedule
naive code successfully enough to compete with run-time scheduling.
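
To make "having a compiler schedule naive code" concrete, here is a rough sketch of greedy static list scheduling, the kind of decision an out-of-order core would otherwise make at run time. Everything in it is an assumption for illustration: the function name `list_schedule`, the issue `width`, and the operation names and latencies in `example_ops`. It is not any real compiler's algorithm, and it assumes the dependency graph is acyclic.

```python
def list_schedule(ops, width=2):
    """Greedy static scheduler (illustrative sketch).

    ops: dict mapping op name -> (latency, [names it depends on]),
         listed in program order; dependencies must be acyclic.
    Returns a dict mapping op name -> issue cycle.
    """
    issue = {}
    remaining = list(ops)
    cycle = 0
    while remaining:
        slots = width
        for name in list(remaining):
            if slots == 0:
                break
            _, deps = ops[name]
            # An op is ready once every input has been issued and its latency has elapsed.
            ready = all(d in issue and issue[d] + ops[d][0] <= cycle for d in deps)
            if ready:
                issue[name] = cycle
                remaining.remove(name)
                slots -= 1
        cycle += 1
    return issue


# Hypothetical naive code: (a * b) + c, with made-up latencies.
example_ops = {
    "load_a": (3, []),
    "load_b": (3, []),
    "mul": (2, ["load_a", "load_b"]),
    "load_c": (3, []),
    "add": (1, ["mul", "load_c"]),
}
```

With `width=2`, `list_schedule(example_ops)` issues `load_c` at cycle 1, ahead of `mul` at cycle 3, so the third load overlaps the first two instead of waiting behind the multiply. The hard part on real code is that latencies such as cache misses are not known at compile time, which this toy model ignores.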

Beating up on the feature set of Itanium seems pretty pointless. The
question that still begs to be answered is: how much can you push out
to a compiler (and not tricky hand coding) and how do you do it? The
features that were added to Itanium, whatever their merits or obvious
disadvantages, don't seem to have helped enough. The question remains
whether one could do better.

Robert.
From: Anton Ertl on
nmm1(a)cam.ac.uk writes:
>In article <jwvhbn7bkri.fsf-monnier+comp.arch(a)gnu.org>,
>Stefan Monnier <monnier(a)iro.umontreal.ca> wrote:
>>>> 2.9: Register rotation, someone needs to be locked in a rubber room. ;)
>>> Yes and no. They can be VERY effective, as on the Hitachi SR2201.
>>
>>How did they work?
>
>Floating-point only, software-controlled, bypassing the cache.
>It was called pseudovectorisation, which describes it very well.
>Good, vectorisable, Fortran code ran like the clappers

Sounds like IA-64, in particular Itanium 2.

>I think that it could be used on a general-purpose CPU, but ONLY
>if you designed the architecture round it - not, as on the Itanic,
>bolting it on together with every knob, frob, bell and whistle
>that you could find in the lumber room.

It was used on IA-64, an architecture intended to be general-purpose.
And it did run vectorizable loops fast. The problem is that the
performance for most other stuff is mediocre, mostly because the clock
rate does not keep up with the competition.

- anton
--
M. Anton Ertl Some things have to be seen to be believed
anton(a)mips.complang.tuwien.ac.at Most things have to be believed to be seen
http://www.complang.tuwien.ac.at/anton/home.html
From: Stefan Monnier on
>>>> 2.9: Register rotation, someone needs to be locked in a rubber room. ;)
>>> Yes and no. They can be VERY effective, as on the Hitachi SR2201.
>> How did they work?
> Floating-point only, software-controlled, bypassing the cache.
> It was called pseudovectorisation, which describes it very well.

By "work" I meant: what were the instructions provided to set up/control
the register rotation feature and what were their semantics?


Stefan
From: nmm1 on
In article <jwviq7n9wwm.fsf-monnier+comp.arch(a)gnu.org>,
Stefan Monnier <monnier(a)iro.umontreal.ca> wrote:
>>>>> 2.9: Register rotation, someone needs to be locked in a rubber room. ;)
>>>> Yes and no. They can be VERY effective, as on the Hitachi SR2201.
>>> How did they work?
>> Floating-point only, software-controlled, bypassing the cache.
>> It was called pseudovectorisation, which describes it very well.
>
>By "work" I meant: what were the instructions provided to set up/control
>the register rotation feature and what were their semantics?

It was 12+ years ago now, and I didn't program in assembler on that
system, anyway. I might have a specification somewhere, but I would
have to search.


Regards,
Nick Maclaren.