From: Eugene Miya on
>>>> I thought that pipelining went back at least to Richard Feynman's use
>>>> of pipelining his "computers", the human kind, in the work on the
>>>> Manhattan Project. :-)
>>> Functionally this is true, but one might also argue that this was an
>>> assembly line technique which goes back to Henry Ford and others.
>> Sure. Mostly, I was trying to be funny and point out perhaps the
>> first "computer" use of pipelining. I *did* put in a smiley after all.

In article <5bfi7bF2sp7rcU1(a)mid.individual.net>,
Jan Vorbrüggen <jvorbrueggen(a)not-mediasec.de> wrote:
>However, noteworthy in this respect is his description of exception/fault
>handling. Once in a while, a computer made an error. When this was detected,
>instead of stopping the whole computation and restarting it, a special set of
>data (using a different colour of punched cards) was prepared containing the
>data before the error occurred and surrounding it (in space). This was rushed
>through the computation until it caught up with the original but faulty data.
>Because they were doing physical simulation, the spreading of the faulty data
>was fairly easy to predict.

Ah, someone has been reading Surely or Badash's books, or listening to the CD.

This technique was and is still done at LANL (formerly LASL) on finite
element problems. Unfortunately I have long since tossed my E&S calendars
where someone at LASL was shown modifying an FE mesh graphically for
this purpose. Feynman's errors were mostly key-entry errors. It's still
a problem in mesh generation.

>The climax comes when Feynman returns from the hospital in Santa Fe, where his
>wife has just died from tuberculosis. Nobody pays any attention to him,
>because his computers are doing an error correction to an error correction...

Well, not quite dead. Those were SED guys. That was just before the test.

--
From: Alan Charlesworth on
In article <f2kue0$afg$1(a)joe.rice.edu>,
jle(a)forest.owlnet.rice.edu (Jason Lee Eckhardt) wrote:

> In article <dontlikespan-EF71FE.06082818052007(a)newsgroups.comcast.net>,
> Alan Charlesworth <dontlikespan(a)nowhere.com> wrote:
> >
> >> I'm curious if Alan Charlesworth (who posted in this thread
> >> as well) had any contact with the i860 designers wrt the
> >> FP unit? Or did the designers borrow the ideas independently?
> >> Does Les Kohn lurk here, or Sai Wai Fu? I've always wanted
> >> to know the design details-- the IEEE Spectrum article in 1989
> >> talks more about staffing, as opposed to the technical details.
> >>
> >Nope. I stayed at FPS until it went under and its assets were bought by
> >Cray in 1992.
> >Those assets later turned into the Starfire E10K, which
> >were sold to Sun when SGI bought Cray in 1996 -- but that is a different
> >story. As for the i860, FPS used it in a matrix coprocessor board, which
> >was an add-in to the FPS-164, a 64-bit follow-on to the AP-120B.
>
> Thanks Alan.
>
> Did any assets, especially the AP-120B math libraries or dev.
> tools (APAL, etc.), survive the asset sale to Cray? I'd like
> to resurrect an AP or 164 by writing an emulator and running some
> original, unmodified binaries-- as well as just general preservation
> of anything related to those machines (I've already done this
> for my other favorite manually-advanced pipeline machine, the i860).
>
> There are so many historical questions I'd love to learn about
> the AP design, about software pipelining[*], and the influence
> of FPS on other designs and techniques. I agree with Eugene, we
> need a massive braindump :)
>
> jason.
>
> [*] In the compiler literature, SWP is generally credited to Bob
> Rau, due to his 1981 MICRO-14 paper. But clearly the programming
> examples written by Alan in the "How to Program the AP-120B" manual
> in 1976 are software pipelined loops (though the phrase doesn't
> actually appear there), showing the idea to pre-date Rau's article.
> I suppose Rau could still be legitimately credited with having made
> the idea more systematic (the "modulo constraint", etc), and
> therefore easier to incorporate into a compiler.

We invented the term "software pipelining" circa 1975 to explain to
people how to code loops for the hardware pipelines of the AP-120B.

Take an inner loop like a radix-2 FFT. It has something like 4
multiplies, 6 adds, and 8 memory accesses. Given one adder, one
multiplier, and one memory functional unit, the minimum number of
cycles for this loop is the maximum resource count, in this case
eight cycles.
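
To make that resource bound concrete, here is a minimal Python sketch
(purely illustrative, nothing from the actual AP-120B tools; the
function name is made up) of the arithmetic, using the operation counts
above:

    from math import ceil

    def resource_min_cycles(op_counts, unit_counts):
        # For each functional-unit class the loop needs at least
        # ceil(operations / units) cycles; the busiest class sets the bound.
        return max(ceil(op_counts[u] / unit_counts[u]) for u in op_counts)

    # Radix-2 FFT inner loop: 4 multiplies, 6 adds, 8 memory accesses,
    # scheduled onto one multiplier, one adder, and one memory pipeline.
    ops   = {"multiply": 4, "add": 6, "memory": 8}
    units = {"multiply": 1, "add": 1, "memory": 1}

    print(resource_min_cycles(ops, units))   # -> 8 cycles, limited by memory

However the loop is scheduled, a new iteration cannot start more often
than once every eight cycles; software pipelining is about actually
reaching that bound by overlapping iterations.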

However, I didn't write a paper until the early 80s, for the 64-bit
version, the FPS-164, so publishing priority was elsewhere.
From: Nick Maclaren on

In article <dontlikespan-E896AD.06293623052007(a)newsgroups.comcast.net>,
Alan Charlesworth <dontlikespan(a)nowhere.com> writes:
|>
|> We invented the term "software pipelining" circa 1975 to explain to
|> people how to code loops for the hardware pipelines of the AP-120B.

Ah! So YOU are the person to blame :-)

More seriously, I remember the term appearing in the early 1980s in
other contexts.

The term pipelining was a decade older, of course, as was the actual
technique of "software pipelining". But that term started to spread
as the concept of teaching programming techniques did - as distinct
from being dumped in at the deep end and being expected to derive
them for yourself, or learning by apprenticeship under (i.e. copying
the practices of) an experienced programmer.


Regards,
Nick Maclaren.
From: Eugene Miya on
>>> But go after John. Actually, I saw N., who programmed on the Harvest,
>>> yesterday.

>In article <qhfy5urhwb.fsf(a)ruckus.brouhaha.com>,
>Eric Smith <eric(a)brouhaha.com> wrote:
>>The Harvest system (IBM 7950) is a superset of the Stretch system
>>(IBM 7030).
>>The Harvest coprocessor (IBM 7951 Processing Unit) was very unusual
>>and perhaps vaguely VLIWish.

OK, another week has passed and I saw N., who used said unit.
I raised this issue, and the response is best summarized as:
1) it might conceivably be VLIW in theory, but
2) in practice it wasn't.
And we went over the 2K-bit instruction word (flexibility).
Not that this helps any of us, as this is hindsight, and another colleague
just gave me yet another Enigma reference, which is a little more
interesting because its author has an axe to grind against the Brits and
the US Army (it is more actionable).

--
From: Eugene Miya on
In article <dontlikespan-E896AD.06293623052007(a)newsgroups.comcast.net>,
Alan Charlesworth <dontlikespan(a)nowhere.com> wrote:
>We invented the term "software pipelining" circa 1975 to explain to
>people how to code loops for the hardware pipelines of the AP-120B.
>
>Take an inner loop like a radix-2 FFT. It has something like 4
>multiplies, 6 adds, and 8 memory accesses. Given one adder, one
>multiplier, and one memory functional unit, the minimum number of
>cycles for this loop is the maximum resource count, in this case
>eight cycles.
>
>However, I didn't write a paper until the early 80s, for the 64-bit
>version, the FPS-164, so publishing priority was elsewhere.

You mean these papers:

%A Alan Charlesworth
%A Eric Hall
%A John Maticich
%Z FPS
%T Measuring the Performance of the FPS-164 Architecture
%J Proceedings of 1983 Array Conference
%C Monterey, California
%D April 1983
%$ 25.00
%X Over the last decade, peripheral array processors have been widely used
in the fields of signal and image processing.
They are specialized to perform floating-point vector multiply-adds that
dominate such work, and hence are much more cost-effective than
general-purpose computers.
%X The FPS-164 is a derivative of the AP-120B that extends the
applicability of array processors into large-scale scientific computing.
A number of enhancements were required by this new market
(a program cache, large physical memory, an efficient compiler, and
a file system).
%X To evaluate the effectiveness of these enhancements,
a hardware run-time performance monitor was designed and implemented to
measure the dynamic resource utilization within the processor.
The gathered data has helped to fine-tune the FORTRAN compiler and
memory system, and to evaluate how well various applications fit
the architecture of the processor.


%A Alan E. Charlesworth
%T An Approach to Scientific Array Processing:
The Architectural Design of the AP-120B/FPS-164 Family
%J IEEE Computer
%V 14
%N 9
%D September 1981
%P 18-27
%K RBBRS346, bmiya,
VLIW, superscalar compiler techniques,
%X * Architecture, not hardware, gives the FPS array processor family
its computational speed. Here is an inside look at the trade-offs and
ideas that went into its design.


%A Alan E. Charlesworth
%A John L. Gustafson
%T Introducing Replicated VLSI to Supercomputing:
the FPS-164/MAX scientific computer
%J IEEE Computer
%V 19
%N 3
%D March 1986
%P 10-23
%K array processing, SIMD, pipelining,


--