From: Laurent Deniau on
I am writing some C code to support multimethod dispatch and multiple
inheritance (aggregation -> Tree only, no DAG) within the context of
class-based object model in ISO C99. Technically, no problem arose and I
have everything in hand. All the problem I have are design problems
(choices). Since I am not very familliar with "behavior/verb oriented
programming" in large scale projects, I was wondering if some
experienced programmers with this kind of design (CLOS, Dylan, Cecil,
Slate, Python 2.3+, Perl6, ...) could answer to some questions while
keeping in mind that I would like to keep things as simple as possible
from the user point of view:

1- Do you thing that multiple inheritance is useful if multimethod and
dynamic inheritance (dynamic change of instance class) are supported?
The price to pay for MI vs SI would be a slow down of about 20-40% of
message dispatch speed and an increase of about 200-400% of the dispatch
code (call-site) plus some more syntax for super-classes calls
(call-site) since the simple call-next-method is not enough. Actually, I
perform message single-dispatch at the speed of 2.2-2.5 for SI and
2.7-3.2 for MI where 1.0 is the reference speed of a function being
called through a pointer. Dispatch code is *not* inlined to avoid code
bloat, therefore the asymptote is 2.0. MI would also imply a slight
increase of the caches sizes since the objects offset adjustment must
also be stored.

My feeling is that MI (Tree only) is not essential when multimethod and
dynamic inheritance (predicate/state dispatch) are available. The only
useful application I foresee is the easy combination of two (nearly
orthogonal) class implementations which can also be implemented manually
using composition.

2- When using a call-next-method idiom and considering the class
hierarchy: A <: B <: C, the possible left-to-right method specialization
are lexical order:
f(C,C) f(C,B) f(C,A) f(B,B) f(B,A), f(A,A)
or lexical+distance order:
f(C,C) f(C,B) f(B,C) f(C,A) f(B,B) f(A,C) f(B,A) f(A,B) f(A,A)
Which one is the best? Note that even with MI, C3 (or Lin2)
linearization do not apply here (Tree only, no DAG).

My feeling is that the latter case is theoretically better, but much
less intuitive than the former. For example, the transition f(B,B) ->
f(A,C) may not be expected by the programmer since he is not necessarily
aware of C. Note that it also happens in the C3 linearization (Python
2.3+, Perl6).

Thanks for your time.

regards,

Laurent.
From: Dmitry A. Kazakov on
On Mon, 13 Mar 2006 12:38:38 +0100, Laurent Deniau wrote:

> 1- Do you thing that multiple inheritance is useful if multimethod and
> dynamic inheritance (dynamic change of instance class) are supported?
> The price to pay for MI vs SI would be a slow down of about 20-40% of
> message dispatch speed and an increase of about 200-400% of the dispatch
> code (call-site) plus some more syntax for super-classes calls
> (call-site) since the simple call-next-method is not enough.

What model you have? If dispatching tables are associated with functions
then it is the dimension of the table to determine the penalty. MI, alone,
should play no role here, as a guess.

> My feeling is that MI (Tree only) is not essential when multimethod and
> dynamic inheritance (predicate/state dispatch) are available. The only
> useful application I foresee is the easy combination of two (nearly
> orthogonal) class implementations which can also be implemented manually
> using composition.

MI is essential for interfaces. Consider some container library defining
Sortable, some GUI library defining Printable and some network library
defining Streamable etc. Now you want to use all of them for some given
A...

> 2- When using a call-next-method idiom and considering the class
> hierarchy: A <: B <: C, the possible left-to-right method specialization
> are lexical order:
> f(C,C) f(C,B) f(C,A) f(B,B) f(B,A), f(A,A)
> or lexical+distance order:
> f(C,C) f(C,B) f(B,C) f(C,A) f(B,B) f(A,C) f(B,A) f(A,B) f(A,A)
> Which one is the best? Note that even with MI, C3 (or Lin2)
> linearization do not apply here (Tree only, no DAG).
> My feeling is that the latter case is theoretically better, but much
> less intuitive than the former. For example, the transition f(B,B) ->
> f(A,C) may not be expected by the programmer since he is not necessarily
> aware of C. Note that it also happens in the C3 linearization (Python
> 2.3+, Perl6).

I tend to behavioral (substitutability) point of view. When A <: B <: C,
then I presume that there exist safe specialization conversions upwards the
type hierarchy:

B_from_A : A -> B
C_from_B : B -> C

This allows to create f(C,B) as a composition in the second parameter:

f(C,C) o (1 x C_from_B)

Here 1 is identity function. f(B,B) is a composition in both parameters:

f(C,C) o (C_from_B x C_from_B)

For safety reasons, in terms of preferences, it should be so that methods
like f(B,B) and f(A,C) be incomparable. Though they should never conflict,
in a properly designed types system. When f(B,B), f(B,C), f(C,B) are
inherited by A, they are as compositions:

f(A,A) = f(B,B) o (B_from_A x B_from_A)
f(B,A) = f(B,B) o (1 x B_from_A)
f(A,B) = f(B,B) o (B_from_A x 1)
f(A,C) = f(B,C) o (B_from_A x 1)
f(C,A) = f(C,B) o (1 x B_from_A)

These function act as implicit overrides. So A x A goes to B x B. But if
f(A,C) gets explicitly overridden, the compiler should require the
programmer to do it for all other four variants. If f(B,A) get overridden
then the compiler should require to override f(A,B), f(A,A) too.

With out parameters (mutators) it goes slightly more complex, because safe
donwwards conversions need to be considered as well, but the principle
should remain: f(B,B) and f(A,C) never match AxA.

--
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de
From: Laurent Deniau on
Dmitry A. Kazakov wrote:
> On Mon, 13 Mar 2006 12:38:38 +0100, Laurent Deniau wrote:
>
>
>>1- Do you thing that multiple inheritance is useful if multimethod and
>>dynamic inheritance (dynamic change of instance class) are supported?
>>The price to pay for MI vs SI would be a slow down of about 20-40% of
>>message dispatch speed and an increase of about 200-400% of the dispatch
>>code (call-site) plus some more syntax for super-classes calls
>>(call-site) since the simple call-next-method is not enough.
>
>
> What model you have?

Right, I forgot to mention that the dispatch model is based on generics
(Dylan, CLOS) at which methods are attached to and uses dynamic method
dispatch through cache (global hash table per thread) plus table lookup
(cache miss) and does not use late binding through precomputed
compressed multiple-entries static tables.

> If dispatching tables are associated with functions
> then it is the dimension of the table to determine the penalty.
> MI, alone, should play no role here, as a guess.

In fact MI plays a role on both sides: on the dispacher side because
object's offset adjustment must be stored in the cache but also on the
call-site side because the address of the object must be passed to the
dispatcher to apply this offset adjusment, and this (roughly) double the
code generated to dispatch the message.

>>My feeling is that MI (Tree only) is not essential when multimethod and
>>dynamic inheritance (predicate/state dispatch) are available. The only
>>useful application I foresee is the easy combination of two (nearly
>>orthogonal) class implementations which can also be implemented manually
>>using composition.
>
>
> MI is essential for interfaces. Consider some container library defining
> Sortable, some GUI library defining Printable and some network library
> defining Streamable etc. Now you want to use all of them for some given
> A...

Interfaces are related to MI with static typing + late->dynamic binding
and are meaningless when you have dynamic method dispatch. Instead of
dynamically looking for Sortable and then invoke Sortable.sort on the
container, you just send the sort message to the container or check for
its support with respondsTo(sort).

>>2- When using a call-next-method idiom and considering the class
>>hierarchy: A <: B <: C, the possible left-to-right method specialization
>>are lexical order:
>> f(C,C) f(C,B) f(C,A) f(B,B) f(B,A), f(A,A)
>>or lexical+distance order:
>> f(C,C) f(C,B) f(B,C) f(C,A) f(B,B) f(A,C) f(B,A) f(A,B) f(A,A)
>>Which one is the best? Note that even with MI, C3 (or Lin2)
>>linearization do not apply here (Tree only, no DAG).
>>My feeling is that the latter case is theoretically better, but much
>>less intuitive than the former. For example, the transition f(B,B) ->
>>f(A,C) may not be expected by the programmer since he is not necessarily

Sorry, I should have written f(B,B)->f(C,A) since the lists above are
from the less to the more specialized functions.

>>aware of C. Note that it also happens in the C3 linearization (Python
>>2.3+, Perl6).
>
>
> I tend to behavioral (substitutability) point of view. When A <: B <: C,
> then I presume that there exist safe specialization conversions upwards the
> type hierarchy:
>
> B_from_A : A -> B
> C_from_B : B -> C
>
> This allows to create f(C,B) as a composition in the second parameter:
>
> f(C,C) o (1 x C_from_B)
>
> Here 1 is identity function. f(B,B) is a composition in both parameters:
>
> f(C,C) o (C_from_B x C_from_B)
>
> For safety reasons, in terms of preferences, it should be so that methods
> like f(B,B) and f(A,C) be incomparable. Though they should never conflict,
> in a properly designed types system. When f(B,B), f(B,C), f(C,B) are
> inherited by A, they are as compositions:
>
> f(A,A) = f(B,B) o (B_from_A x B_from_A)
> f(B,A) = f(B,B) o (1 x B_from_A)
> f(A,B) = f(B,B) o (B_from_A x 1)
> f(A,C) = f(B,C) o (B_from_A x 1)
> f(C,A) = f(C,B) o (1 x B_from_A)

Ok.

> These function act as implicit overrides. So A x A goes to B x B. But if
> f(A,C) gets explicitly overridden, the compiler should require the
> programmer to do it for all other four variants. If f(B,A) get overridden
> then the compiler should require to override f(A,B), f(A,A) too.
>
> With out parameters (mutators) it goes slightly more complex, because safe
> donwwards conversions need to be considered as well, but the principle
> should remain: f(B,B) and f(A,C) never match AxA.

I have the feeling that you are focusing on static type checking. I
expect that when you say "For safety reasons" you are referring to the
constraints applied to parameters (covariance / contravariance of
types). But these constraints do not apply to multimethod dynamic
dispatch because the overriding mechanism differs from static type
systems. For example, as you say, in a static type system like in C++ or
Java, it is important that f(B,B), f(B,C), f(C,B) use different slots in
the virtual table(s) and do not override each other (limiting somehow
the subtyping polymorphism), while in dynamic dispatch they use the same
slot: f(B,B) overrides f(B,C) which overrides the less specialized
method f(C,B), if we consider my 2nd list of linearization. This is
still safe since the dispatcher always propagates the most derived type
i.e. A.

Thanks for your answer.

regards,

Laurent.
From: Dmitry A. Kazakov on
On Mon, 13 Mar 2006 18:30:03 +0100, Laurent Deniau wrote:

> Dmitry A. Kazakov wrote:
>> On Mon, 13 Mar 2006 12:38:38 +0100, Laurent Deniau wrote:
>>
>> What model you have?
>
> Right, I forgot to mention that the dispatch model is based on generics
> (Dylan, CLOS) at which methods are attached to and uses dynamic method
> dispatch through cache (global hash table per thread) plus table lookup
> (cache miss) and does not use late binding through precomputed
> compressed multiple-entries static tables.

I see.

>> If dispatching tables are associated with functions
>> then it is the dimension of the table to determine the penalty.
>> MI, alone, should play no role here, as a guess.
>
> In fact MI plays a role on both sides: on the dispacher side because
> object's offset adjustment must be stored in the cache but also on the
> call-site side because the address of the object must be passed to the
> dispatcher to apply this offset adjusment, and this (roughly) double the
> code generated to dispatch the message.

I wouldn't count it on account of dispatch, because even if the target is
statically known, you will need to shift the address. In fact, it is type
conversion penalty.

> Interfaces are related to MI with static typing + late->dynamic binding
> and are meaningless when you have dynamic method dispatch.

I agree. MI and interfaces (as a case of MI) have sense only if you want to
statically constrain classes (set of types.) If there is only one class =
everything is derived from one root and everything is treated as such, then
MI would make little sense.

>> These function act as implicit overrides. So A x A goes to B x B. But if
>> f(A,C) gets explicitly overridden, the compiler should require the
>> programmer to do it for all other four variants. If f(B,A) get overridden
>> then the compiler should require to override f(A,B), f(A,A) too.
>>
>> With out parameters (mutators) it goes slightly more complex, because safe
>> donwwards conversions need to be considered as well, but the principle
>> should remain: f(B,B) and f(A,C) never match AxA.
>
> I have the feeling that you are focusing on static type checking.

Right. To me the dynamic case is rather uninteresting, at least it is not
so challenging... (:-))

> I
> expect that when you say "For safety reasons" you are referring to the
> constraints applied to parameters (covariance / contravariance of
> types). But these constraints do not apply to multimethod dynamic
> dispatch because the overriding mechanism differs from static type
> systems. For example, as you say, in a static type system like in C++ or
> Java, it is important that f(B,B), f(B,C), f(C,B) use different slots in
> the virtual table(s) and do not override each other (limiting somehow
> the subtyping polymorphism), while in dynamic dispatch they use the same
> slot: f(B,B) overrides f(B,C) which overrides the less specialized
> method f(C,B), if we consider my 2nd list of linearization. This is
> still safe since the dispatcher always propagates the most derived type
> i.e. A.

Hmm, I don't think that there is some special "dynamic" way. Whether f(B,B)
may override f(B,C) depends solely on the relations between B and C. It
depends on whether B is a specialization or generalization if C and on the
parameter mode (in, out or in out.) The problem is that substitutability is
not checkable either statically or dynamically, so the language should
rather rely on the programmer here. But this also means that it should warn
the programmer about potentially suspicious cases. Especially for
multimethods, I had truly horror stories. One was related to malfunctioned
"<" and "=" in a deep types hierarchy. It broke down containers in a very
nasty way. Such errors are extremely difficult to track down. So, yes, I do
think it must be static, though I cannot tell how... (:-()

--
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de
From: Laurent Deniau on
Dmitry A. Kazakov wrote:
> On Mon, 13 Mar 2006 18:30:03 +0100, Laurent Deniau wrote:
>>Dmitry A. Kazakov wrote:
>>>If dispatching tables are associated with functions
>>>then it is the dimension of the table to determine the penalty.
>>>MI, alone, should play no role here, as a guess.
>>
>>In fact MI plays a role on both sides: on the dispacher side because
>>object's offset adjustment must be stored in the cache but also on the
>>call-site side because the address of the object must be passed to the
>>dispatcher to apply this offset adjusment, and this (roughly) double the
>>code generated to dispatch the message.
>
> I wouldn't count it on account of dispatch, because even if the target is
> statically known, you will need to shift the address. In fact, it is type
> conversion penalty.

This is on the side of the dispatcher where I agree, object must be
shifted according to their (most derived) types and the expected types
of the selected method. For the caller side, I am refering to the fact
that the address of the objects have to be passed.

for SI, the dispatch looks like:

return ((generic_type)find_method(generic, obj1, obj2)) (obj1, obj2);

but for MI, the dispatch looks like:

generic_type mth = (generic_type)find_method(generic, &obj1, &obj2);
return mth(obj1, obj2);

where find_method returns a pointer to the selected method to invoke.

The asm code of the latter is about twice as long as asm code of the
former (at least on x86) because both obj and &obj are processed and a
sequence point exists between the find and the call to avoid a call with
unshifted objects. While for SI, the compiler is free to optimize the
all in one shot. Now consider expression like:

res = max3(add(o1,o2),sub(o1,o2),sub(o2,o1));

where add, sub, max3 are messages (generics) and you can get a picture
of what happens on the caller site. Asm code ratio SI / MI is about 3
because o1, o2 are loaded once in the SI case and three times in the MI
case plus extra code to manage required sequence points.

>>I
>>expect that when you say "For safety reasons" you are referring to the
>>constraints applied to parameters (covariance / contravariance of
>>types). But these constraints do not apply to multimethod dynamic
>>dispatch because the overriding mechanism differs from static type
>>systems. For example, as you say, in a static type system like in C++ or
>>Java, it is important that f(B,B), f(B,C), f(C,B) use different slots in
>>the virtual table(s) and do not override each other (limiting somehow
>>the subtyping polymorphism), while in dynamic dispatch they use the same
>>slot: f(B,B) overrides f(B,C) which overrides the less specialized
>>method f(C,B), if we consider my 2nd list of linearization. This is
>>still safe since the dispatcher always propagates the most derived type
>>i.e. A.
>
>
> Hmm, I don't think that there is some special "dynamic" way. Whether f(B,B)
> may override f(B,C) depends solely on the relations between B and C. It
> depends on whether B is a specialization or generalization if C and on the
> parameter mode (in, out or in out.) The problem is that substitutability is
> not checkable either statically or dynamically, so the language should
> rather rely on the programmer here.

No, the fact that B inherits from C is checkable at runtime (but not at
compile time of course). In fact multimethod dynamic dispatch is safer
than say Java interfaces since the method selection is done by the
dispatcher which ensures de facto the correctness of the objects types.
While in Java interface, the programmer has to check the type of the
extra objects passed to the method since only the first one is truly
checked by the method dispatch, the static types of the other objects
being lost (e.g. compareTo, equalTo, lessThan, copyTo).

C++ on his side restricts subtyping polymorphism because static typing
cannot ensure the type correctness for such overriding. This is a
tradeoff between flexibility and safety, but it has parametric
polymorphism to get round this restriction.

Eiffel does not have this restriction, but it needs an overall-program
analysis to ensure the type correctness with 'like' types, or it
introduces some hidden runtime checks like in the Java's generic
interfaces implementation.

regards,

Laurent.