From: Eric P. on
Eugene Miya wrote:
>
> In article <481a50a0$1(a)news.meer.net>, Greg Lindahl <lindahl(a)pbm.com> wrote:
> >In article <481a30fc$1(a)darkstar>, Eugene Miya <eugene(a)cse.ucsc.edu> wrote:
> >>My impression is that stuff as simple as a move or op isn't enough to be
> >>VLIW. Forking and branching might be more challenging. X-MPs in 1982
> >>could do simultaneous 2-loads, a store, and arithmetic with mere pipelining.
> >>No long instructions needed.
> >
> >... I thought VLIW implied something other than just a lot of
> >instructions in flight.
>
> Yeah, I think that, too. So we have dealt with the quantitative.
> That brings up the qualitative. Might, in some case, we be running an
> ILP architecture in a lot of vector like patterns? Maybe. pause.
> I don't know, I think aside from arm chair quarterbacking, I think we
> have to get the Trace up and running again. I would want to see
> empirical stuff. Not conjecture.
>
> >The X-MP vector architecture doesn't look like
> >any of the classic VLIW architectures.
>
> Agreed.
>
> But the comparisons were for an architecture I had no familiarity with
> and the 360/370. And I'm not so certain the IBM's horizontally coded
> machines would qualify as VLIW/ILP.

This is actually a difficult distinction to draw but
I'll take a crack at it.

Horizontal microcode (of that era) and VLIW both have instructions
that use "a large number of bits" and use a fixed layout to dedicate
instruction fields for each function. Both expect the programmer
(or compiler) to track function unit latency and manually schedule
results. However that is where the similarity ends.

While horizontal microcode typically does allow concurrent actions,
in that there are multiple fields that can be turned on at the same
time, these are not the kind of actions that would be of interest
to a programmer. Horizontal microcode is dedicated to sequencing
the internal processor control lines. It wouldn't have the ability
to control multiple identical function units at the same time using
separate instruction fields. For example, they would not have two
ALU's and two control fields. Concurrency, if it is done at all,
is achieved by pipelining multiple microinstructions at the same time.
The microword would typically not contain fields to specify registers,
as these are given by the CISC instruction, and certainly not contain
the multiple register specifier fields required for multiple
concurrent operations.

A horizontal microword might have separate fields to control, say,
an ALU, a flash multiplier, and a shifter. However you would most
likely find that, because the machine was never designed to operate
these concurrently, there was a resource like a common bus that all
function units attach to.

VLIW does have the ability to perform program operations concurrently
because it has multiple function units, of each type, along with the
supporting infrastructure to move multiple operands and results.
So you might find 3 ALUs and 2 multipliers with the 15 associated
busses to move values from the registers file to each unit and back.
It then makes the function units available through 5 separate fields,
each with its own control and source and destination registers,
of the VLIW instruction.

Eric

From: Pertti Kellomäki on
Peter Grandi wrote:
> In other words, my contention is that a language with properly
> implemented parallel assignment has (if the parallel assignment
> are used) the advantages of functional languages without the
> sometimes inconvenient restrictions.

A related point that I have become more and more convinced about
is that tail recursion would be quite a useful structuring mechanism
even outside of functional programming. A tail call combines a
parallel assignment with a jump, which makes any ILP pretty
obvious, and it also allows reasoning about programs in terms
of largish atomic state changes.

Two languages that I know where tail calls are explicitly used
for this are Scheme and Erlang. In Scheme, iteration constructs
are often expressed using functions as follows:

(define (iterate a b c)
... some code here...
(iterate (compute-new-a) (compute-new-b) (compute-new-c)))

The semantics of the language dictate that the tail call to
iterate must not consume any extra space, which in practice
means that it is turned into a jump. The expressions (compute-new-X)
clearly stand out as candidates for ILP. For the reader, the
implicit parallel assignment is an atomic change to the state
variables a, b, and c.

In Erlang, a functional language for distributed programming (think
phone exchanges and the like), a similar technique is used for
expressing processes. A process keeps itself alive by making
tail recursive calls, and the state of the process is kept in
the parameters passed in the calls. This also provides a convenient
mechanism for hot-swapping in new code if the tail calls are
made indirectly via a module name space, but that's another
story.
--
Pertti
From: Nick Maclaren on

In article <fvmbgr$v47$1(a)news.cc.tut.fi>,
=?ISO-8859-1?Q?Pertti_Kellom=E4ki?= <pertti.kellomaki(a)tut.fi> writes:
|> Peter Grandi wrote:
|> > In other words, my contention is that a language with properly
|> > implemented parallel assignment has (if the parallel assignment
|> > are used) the advantages of functional languages without the
|> > sometimes inconvenient restrictions.
|>
|> A related point that I have become more and more convinced about
|> is that tail recursion would be quite a useful structuring mechanism
|> even outside of functional programming. A tail call combines a
|> parallel assignment with a jump, which makes any ILP pretty
|> obvious, and it also allows reasoning about programs in terms
|> of largish atomic state changes.

Grrk. My experience is that tail recursion is useful, as you say,
but also a pain in the neck. When it is optimised out, it makes it
vastly harder for debuggers and tuning tools to display any useful
information. Even if it is not optimised out, tracebacks are much
less useful.

So you win some and lose some :-(

|> In Erlang, a functional language for distributed programming (think
|> phone exchanges and the like), a similar technique is used for
|> expressing processes. A process keeps itself alive by making
|> tail recursive calls, and the state of the process is kept in
|> the parameters passed in the calls. This also provides a convenient
|> mechanism for hot-swapping in new code if the tail calls are
|> made indirectly via a module name space, but that's another
|> story.

A really ancient technique (the 360/370 operating system used it) that
was brought into high-level languages some 20+ years ago. See BCPL
coroutines, for example. Yes, I agree that it is probably the best
way to support multiple non-parallel processes. MUCH cleaner than
POSIX-style threading.


Regards,
Nick Maclaren.
From: Pertti Kellomäki on
Nick Maclaren wrote:
> Grrk. My experience is that tail recursion is useful, as you say,
> but also a pain in the neck. When it is optimised out, it makes it
> vastly harder for debuggers and tuning tools to display any useful
> information. Even if it is not optimised out, tracebacks are much
> less useful.
>
> So you win some and lose some :-(

I took a bit of a beating over this in comp.lang.lisp a while
ago, so I know about the drawbacks ;-)

However, when tail recursion is used explicitly for creating
a computation that runs in constant space, then the alternative
is to use some kind of loop or goto, which also do not leave
any backtrace.

The best of both worlds would probably be special syntax for the
tail calls that should be optimized away.
--
Pertti
From: Nick Maclaren on

In article <fvml5k$4fl$1(a)news.cc.tut.fi>,
=?ISO-8859-1?Q?Pertti_Kellom=E4ki?= <pertti.kellomaki(a)tut.fi> writes:
|>
|> > Grrk. My experience is that tail recursion is useful, as you say,
|> > but also a pain in the neck. When it is optimised out, it makes it
|> > vastly harder for debuggers and tuning tools to display any useful
|> > information. Even if it is not optimised out, tracebacks are much
|> > less useful.
|> >
|> > So you win some and lose some :-(
|>
|> I took a bit of a beating over this in comp.lang.lisp a while
|> ago, so I know about the drawbacks ;-)

That's the place that has the experience :-)

|> However, when tail recursion is used explicitly for creating
|> a computation that runs in constant space, then the alternative
|> is to use some kind of loop or goto, which also do not leave
|> any backtrace.

Agreed.

|> The best of both worlds would probably be special syntax for the
|> tail calls that should be optimized away.

It's a reasonable approach, certainly.


Regards,
Nick Maclaren.