From: Peter Grandi on
[ ... ]

>> In other words, my contention is that a language with properly
>> implemented parallel assignment has (if the parallel assignment
>> are used) the advantages of functional languages without the
>> sometimes inconvenient restrictions.

> A related point that I have become more and more convinced about
> is that tail recursion would be quite a useful structuring
> mechanism even outside of functional programming.

GCC optimizes tail recursion... At least simple cases. I think it
has become more extensive with time.

Anyhow tail recursion is a very useful descriptive device even when
it is not optimized into stack reuse, it just makes clearer program
flow and programmer intent.

> A tail call combines a parallel assignment with a jump, which
> makes any ILP pretty obvious, and it also allows reasoning about
> programs in terms of largish atomic state changes.

That's one way of putting it. But it still uses something that
looks more like binding rather than proper parallel assignment,
which has weaker and thus more optimizable semantics than binding
as defined in most languages (aliasing...).

> Two languages that I know where tail calls are explicitly used
> for this are Scheme and Erlang. In Scheme, iteration constructs
> are often expressed using functions as follows:

> (define (iterate a b c)
> ... some code here...
> (iterate (compute-new-a) (compute-new-b) (compute-new-c)))

That kind of pattern more often looks slightly differently, with
a main function calling a tail recursive function with some extra
parameters. A bit of Emacs Lisp for an excessively simple example:

(defun factorial (n)
(factorial-step 1 n)
)

(defun factorial-step (i j)
(if (< j 2) i
(factorial-step (* i j) (- j 1))
)
)

IIRC this is mentioned in "Anatomy of Lisp" by Allen (one of those
extraordinary classics).

Then there was Greussay's splendid article and thesis showing how
by rearranging slightly things tail recursive implementations of
*mutually recursive* procedures are free.
From: Nick Maclaren on

In article <yf31w4gog6a.fsf(a)tree.gp.example.com>,
pg_nh(a)0710.exp.sabi.co.UK (Peter Grandi) writes:
|>
|> Anyhow tail recursion is a very useful descriptive device even when
|> it is not optimized into stack reuse, it just makes clearer program
|> flow and programmer intent.

I strongly disagree.

In a functional language, or even a functional programming paradigm,
that is true. But, in a language and programming paradigm where
much or most of the 'importance' is in the side-effects, I find that
tail recursion obfuscates both the program flow and the programmer's
intent.

That isn't just my personal dislike, either, but is based on helping
people who have had trouble in a good many imperative languages:
Algol 68, BCPL, C, Fortran and others.


There is also a secondary menace of the technique, caused by code
like:

Function fred
If ...
...
Return joe(...)
Else
... lots of code ...
Return <data value>
Endif
Endfunction

I and others have wasted HOURS due to that "gotcha" - again, it is
the combination of that with non-functional features that causes the
problem.


Regards,
Nick Maclaren.

From: Peter Grandi on
[ ... ]

>> Anyhow tail recursion is a very useful descriptive device
>> even when it is not optimized into stack reuse, it just makes
>> clearer program flow and programmer intent.

> I strongly disagree. In a functional language, or even a
> functional programming paradigm, that is true. But, in a
> language and programming paradigm where much or most of the
> 'importance' is in the side-effects, I find that tail recursion
> obfuscates both the program flow and the programmer's intent.

Uhm, there is a point about side effects, but that affects all
forms of recursion.

However I think that as to *descriptive* I have a stronger point,
which is part of a general argument.

The general argument is that there is a natural correspondence
between data structures and control structures, which goes like
this, in a very simplified form:

data structure control structure

constant expression
variable assignment
record block ('with', parallel assignment
array repetition (bounded, 'for')
streams repetition (logical, 'while')
frame, hash repetition (keyed, 'foreach')
lists recursion (tail)
binary trees mixed recursion
trees recursion
acyclic graphs iterators
general graphs generators

NOTES:
* some of the distinctions above are a bit fine.
* 'with' means a Pascal-like 'with' block that 'opens'
a record's scope.
* a frame is a record with potentially changing number and
names of field.

The correspondence means that to explore/enumerate a certain class
of data structure it is most proper/descriptive to use the
corresponding control structure.

My practice is not just that using the proper-level control
structure for each data structure category is quite important
(and from your argument above you seem to agree), but that there
are data structures that are defined recursively like lists but
in essence without requiring a stack, and for these tail
recursion is most appropriate.

Consider the three middle structures above:

array: defined as a bounded repetition of N "contiguous"
elements, control structure is 'for'.

stream: defined as an unbounded repetition of "contiguous"
elements (\0 terminated string, files), control
structure is 'while'.

list: defined as a non-branching tree of independent,
non-contiguous nodes.

For the latter tail recursion seems to me to be ideal: because a
linear tree of nodes is not the same as a stream of values, and
I'd rather not use the same control structure to enumerate both.
Consider the length of a string or a single headed list in GNU C:

unsigned strlen1(const char *const s)
{
register unsigned n = 0;
register const char *p = s;
while (*p != 0)
n++;
return n;
}

unsigned strlen2(const char *const s)
{
inline unsigned (strlen2step(const unsigned n,const char *const s)
{ return (s[0] == '\0') ? n : strlen2step(n+1,&s[1]);
return strlen2step(0,s);
}

unsigned listlen1(const struct list *const l)
{
register unsigned n = 0;
register const list *p = l;
while (p != 0)
{ p = p->next; n++; }
return n;
}

unsigned listlen2(const struct list *const l)
{
inline unsigned listlen2step(const unsigned n,const struct list *const l)
{ return (l == 0) ? n : listlen2step(n+1,l->next); }
return listlen2do(0,l0);
}

Of these I'd prefer as 'strlen1' and 'listlen2', as they match the
underlying structure better than 'strlen2' and 'listlen2'.

Let me put it another way: a sequence (bounded as an array,
unbounded as a stream/string) is fundamentally different from a
list because while both are unidimensional, the sequence cannot
but be so, and it has more of an cardinal nature (most sequences
can be easily accesses positionally), while lists are more
ambiguous, being degenerate trees, and they have more of an
ordinal nature like trees and graphs.

Stylistically I'd rather distinguish the exploring/enumerating
a sequence from a list.

Still another way: Dijkstra powerfully argued that using
recursion when looping is sufficient is a bad idea, basing the
argument on the observation that looping creates a relationship
between two predicates (like a block or an assignment or an
expression), while recursion between predicate transformers, and
there is a significant semantic jump between the two.

Well, in my view in that significant semantic jump there is an
intermediate step for lists and tail recursion, because of that
cardinal to ordinal jump.

Finally, I will explain a special case that the observant will
have spotted above, which is binary trees.

These are a special case because they can be interpreted as a
linear list with branches; visualize it not has having left and
right branches, but next and down ones.

This means that a very common way of exploring/enumerating them
is by linear progression in the next direction, and recursion
in the down direction.

In that case it is fairly natural to 'while' nextwards and
recurse downwards. But it is I think more appropriat to
tail-recurse righward and recurse downward, because after all it
is a tree. Consider the count of elements in a binary tree in
GNU C:

unsigned btlen1(const struct bt *const t)
{
register unsigned n = 0;
register const struct bt *p = t;
while (p != 0)
{ if (p->down != 0) n += btlen1(p->down); n++; p = p->next; }
return n;
}

unsigned btlen2(const struct bt *const t)
{
inline unsigned btlen2step(const unsigned n,const struct bt *const t)
{ return (t == 0) ? n
: btlen2step(n+((t->down == 0) ? 0 : btlen2(t->down))+1,t->next); }
return btlen2step(0,t);
}

where I prefer 'btlen2' to 'btlen1'.

NOTES:
* I would not write these functions quite like that, this
particular shape is to illustrate my point, but still good.
* GCC 4.1 does compile 'btlen2step' as a loop, and even does
loop unrolling. This did amuse me.

Some general points:

* Yes, my preferred C programming style is quite expression
oriented, Algol 68-ish, functional. Also because with GCC it
optimizes particularly well specially with asm inlines and
builtins, e.g. SSE.

* Yes, I try to be quite tidy and precise in the descriptive
style of my coding. It costs no more than messy hard to
follow code. Also because I prefer programs that require
very little debugging and to have very few lifetime bugs.
From: Peter Grandi on
[ ... ]

> The general argument is that there is a natural correspondence
> between data structures and control structures, which goes like
> this, in a very simplified form:

> data structure control structure

> constant expression
> variable assignment
> record block ('with', parallel assignment
> array repetition (bounded, 'for')
> streams repetition (logical, 'while')
> frame, hash repetition (keyed, 'foreach')
> lists recursion (tail)
> binary trees mixed recursion
> trees recursion
> acyclic graphs iterators
> general graphs generators

The architectural implications, especially as to optimization of
bulk operation, should be fairly obvious.

Also, my classification above is very much indebted to a few
great bits of literature and research, regrettably like so many
others largely lost in obscurity:

* Chapter 2 "On data structures" of "Stuctured programming".
* Chapter 3 "Hierarchical program structures" of "Structured
programming".
* POP2 and MDL, and to some extent EL1.

[ ... ]

As to "Structured programming", whose chapters 2 and 3 are so
uniquely important yet obscure even when it was a hot book, some
years ago I listened to D. Knuth at a lecture saying that he had
asked AP if it still was selling, and he was told that indeed it
was still selling, *three* copies per year. O tempora, o mores :-).

But then I think this classic cartoon tragically realistic:

http://www.iBiblio.org/Dave/Dr-Fun/df200002/df20000210.jpg

From: Greg Lindahl on
In article <yf31w4gog6a.fsf(a)tree.gp.example.com>,
Peter Grandi <pg_nh(a)0710.exp.sabi.co.UK> wrote:

>GCC optimizes tail recursion... At least simple cases. I think it
>has become more extensive with time.

I think one of the SPECcpu benchmarks has a significant opportunity
with tail recursion, certainly it's something that most
performance-oriented compilers do.

-- greg