|
From: Laurent Deniau on 13 Mar 2006 06:38 I am writing some C code to support multimethod dispatch and multiple inheritance (aggregation -> Tree only, no DAG) within the context of class-based object model in ISO C99. Technically, no problem arose and I have everything in hand. All the problem I have are design problems (choices). Since I am not very familliar with "behavior/verb oriented programming" in large scale projects, I was wondering if some experienced programmers with this kind of design (CLOS, Dylan, Cecil, Slate, Python 2.3+, Perl6, ...) could answer to some questions while keeping in mind that I would like to keep things as simple as possible from the user point of view: 1- Do you thing that multiple inheritance is useful if multimethod and dynamic inheritance (dynamic change of instance class) are supported? The price to pay for MI vs SI would be a slow down of about 20-40% of message dispatch speed and an increase of about 200-400% of the dispatch code (call-site) plus some more syntax for super-classes calls (call-site) since the simple call-next-method is not enough. Actually, I perform message single-dispatch at the speed of 2.2-2.5 for SI and 2.7-3.2 for MI where 1.0 is the reference speed of a function being called through a pointer. Dispatch code is *not* inlined to avoid code bloat, therefore the asymptote is 2.0. MI would also imply a slight increase of the caches sizes since the objects offset adjustment must also be stored. My feeling is that MI (Tree only) is not essential when multimethod and dynamic inheritance (predicate/state dispatch) are available. The only useful application I foresee is the easy combination of two (nearly orthogonal) class implementations which can also be implemented manually using composition. 2- When using a call-next-method idiom and considering the class hierarchy: A <: B <: C, the possible left-to-right method specialization are lexical order: f(C,C) f(C,B) f(C,A) f(B,B) f(B,A), f(A,A) or lexical+distance order: f(C,C) f(C,B) f(B,C) f(C,A) f(B,B) f(A,C) f(B,A) f(A,B) f(A,A) Which one is the best? Note that even with MI, C3 (or Lin2) linearization do not apply here (Tree only, no DAG). My feeling is that the latter case is theoretically better, but much less intuitive than the former. For example, the transition f(B,B) -> f(A,C) may not be expected by the programmer since he is not necessarily aware of C. Note that it also happens in the C3 linearization (Python 2.3+, Perl6). Thanks for your time. regards, Laurent.
From: Dmitry A. Kazakov on 13 Mar 2006 10:29 On Mon, 13 Mar 2006 12:38:38 +0100, Laurent Deniau wrote: > 1- Do you thing that multiple inheritance is useful if multimethod and > dynamic inheritance (dynamic change of instance class) are supported? > The price to pay for MI vs SI would be a slow down of about 20-40% of > message dispatch speed and an increase of about 200-400% of the dispatch > code (call-site) plus some more syntax for super-classes calls > (call-site) since the simple call-next-method is not enough. What model you have? If dispatching tables are associated with functions then it is the dimension of the table to determine the penalty. MI, alone, should play no role here, as a guess. > My feeling is that MI (Tree only) is not essential when multimethod and > dynamic inheritance (predicate/state dispatch) are available. The only > useful application I foresee is the easy combination of two (nearly > orthogonal) class implementations which can also be implemented manually > using composition. MI is essential for interfaces. Consider some container library defining Sortable, some GUI library defining Printable and some network library defining Streamable etc. Now you want to use all of them for some given A... > 2- When using a call-next-method idiom and considering the class > hierarchy: A <: B <: C, the possible left-to-right method specialization > are lexical order: > f(C,C) f(C,B) f(C,A) f(B,B) f(B,A), f(A,A) > or lexical+distance order: > f(C,C) f(C,B) f(B,C) f(C,A) f(B,B) f(A,C) f(B,A) f(A,B) f(A,A) > Which one is the best? Note that even with MI, C3 (or Lin2) > linearization do not apply here (Tree only, no DAG). > My feeling is that the latter case is theoretically better, but much > less intuitive than the former. For example, the transition f(B,B) -> > f(A,C) may not be expected by the programmer since he is not necessarily > aware of C. Note that it also happens in the C3 linearization (Python > 2.3+, Perl6). I tend to behavioral (substitutability) point of view. When A <: B <: C, then I presume that there exist safe specialization conversions upwards the type hierarchy: B_from_A : A -> B C_from_B : B -> C This allows to create f(C,B) as a composition in the second parameter: f(C,C) o (1 x C_from_B) Here 1 is identity function. f(B,B) is a composition in both parameters: f(C,C) o (C_from_B x C_from_B) For safety reasons, in terms of preferences, it should be so that methods like f(B,B) and f(A,C) be incomparable. Though they should never conflict, in a properly designed types system. When f(B,B), f(B,C), f(C,B) are inherited by A, they are as compositions: f(A,A) = f(B,B) o (B_from_A x B_from_A) f(B,A) = f(B,B) o (1 x B_from_A) f(A,B) = f(B,B) o (B_from_A x 1) f(A,C) = f(B,C) o (B_from_A x 1) f(C,A) = f(C,B) o (1 x B_from_A) These function act as implicit overrides. So A x A goes to B x B. But if f(A,C) gets explicitly overridden, the compiler should require the programmer to do it for all other four variants. If f(B,A) get overridden then the compiler should require to override f(A,B), f(A,A) too. With out parameters (mutators) it goes slightly more complex, because safe donwwards conversions need to be considered as well, but the principle should remain: f(B,B) and f(A,C) never match AxA. -- Regards, Dmitry A. Kazakov http://www.dmitry-kazakov.de
From: Laurent Deniau on 13 Mar 2006 12:30 Dmitry A. Kazakov wrote: > On Mon, 13 Mar 2006 12:38:38 +0100, Laurent Deniau wrote: > > >>1- Do you thing that multiple inheritance is useful if multimethod and >>dynamic inheritance (dynamic change of instance class) are supported? >>The price to pay for MI vs SI would be a slow down of about 20-40% of >>message dispatch speed and an increase of about 200-400% of the dispatch >>code (call-site) plus some more syntax for super-classes calls >>(call-site) since the simple call-next-method is not enough. > > > What model you have? Right, I forgot to mention that the dispatch model is based on generics (Dylan, CLOS) at which methods are attached to and uses dynamic method dispatch through cache (global hash table per thread) plus table lookup (cache miss) and does not use late binding through precomputed compressed multiple-entries static tables. > If dispatching tables are associated with functions > then it is the dimension of the table to determine the penalty. > MI, alone, should play no role here, as a guess. In fact MI plays a role on both sides: on the dispacher side because object's offset adjustment must be stored in the cache but also on the call-site side because the address of the object must be passed to the dispatcher to apply this offset adjusment, and this (roughly) double the code generated to dispatch the message. >>My feeling is that MI (Tree only) is not essential when multimethod and >>dynamic inheritance (predicate/state dispatch) are available. The only >>useful application I foresee is the easy combination of two (nearly >>orthogonal) class implementations which can also be implemented manually >>using composition. > > > MI is essential for interfaces. Consider some container library defining > Sortable, some GUI library defining Printable and some network library > defining Streamable etc. Now you want to use all of them for some given > A... Interfaces are related to MI with static typing + late->dynamic binding and are meaningless when you have dynamic method dispatch. Instead of dynamically looking for Sortable and then invoke Sortable.sort on the container, you just send the sort message to the container or check for its support with respondsTo(sort). >>2- When using a call-next-method idiom and considering the class >>hierarchy: A <: B <: C, the possible left-to-right method specialization >>are lexical order: >> f(C,C) f(C,B) f(C,A) f(B,B) f(B,A), f(A,A) >>or lexical+distance order: >> f(C,C) f(C,B) f(B,C) f(C,A) f(B,B) f(A,C) f(B,A) f(A,B) f(A,A) >>Which one is the best? Note that even with MI, C3 (or Lin2) >>linearization do not apply here (Tree only, no DAG). >>My feeling is that the latter case is theoretically better, but much >>less intuitive than the former. For example, the transition f(B,B) -> >>f(A,C) may not be expected by the programmer since he is not necessarily Sorry, I should have written f(B,B)->f(C,A) since the lists above are from the less to the more specialized functions. >>aware of C. Note that it also happens in the C3 linearization (Python >>2.3+, Perl6). > > > I tend to behavioral (substitutability) point of view. When A <: B <: C, > then I presume that there exist safe specialization conversions upwards the > type hierarchy: > > B_from_A : A -> B > C_from_B : B -> C > > This allows to create f(C,B) as a composition in the second parameter: > > f(C,C) o (1 x C_from_B) > > Here 1 is identity function. f(B,B) is a composition in both parameters: > > f(C,C) o (C_from_B x C_from_B) > > For safety reasons, in terms of preferences, it should be so that methods > like f(B,B) and f(A,C) be incomparable. Though they should never conflict, > in a properly designed types system. When f(B,B), f(B,C), f(C,B) are > inherited by A, they are as compositions: > > f(A,A) = f(B,B) o (B_from_A x B_from_A) > f(B,A) = f(B,B) o (1 x B_from_A) > f(A,B) = f(B,B) o (B_from_A x 1) > f(A,C) = f(B,C) o (B_from_A x 1) > f(C,A) = f(C,B) o (1 x B_from_A) Ok. > These function act as implicit overrides. So A x A goes to B x B. But if > f(A,C) gets explicitly overridden, the compiler should require the > programmer to do it for all other four variants. If f(B,A) get overridden > then the compiler should require to override f(A,B), f(A,A) too. > > With out parameters (mutators) it goes slightly more complex, because safe > donwwards conversions need to be considered as well, but the principle > should remain: f(B,B) and f(A,C) never match AxA. I have the feeling that you are focusing on static type checking. I expect that when you say "For safety reasons" you are referring to the constraints applied to parameters (covariance / contravariance of types). But these constraints do not apply to multimethod dynamic dispatch because the overriding mechanism differs from static type systems. For example, as you say, in a static type system like in C++ or Java, it is important that f(B,B), f(B,C), f(C,B) use different slots in the virtual table(s) and do not override each other (limiting somehow the subtyping polymorphism), while in dynamic dispatch they use the same slot: f(B,B) overrides f(B,C) which overrides the less specialized method f(C,B), if we consider my 2nd list of linearization. This is still safe since the dispatcher always propagates the most derived type i.e. A. Thanks for your answer. regards, Laurent.
From: Dmitry A. Kazakov on 14 Mar 2006 11:27 On Mon, 13 Mar 2006 18:30:03 +0100, Laurent Deniau wrote: > Dmitry A. Kazakov wrote: >> On Mon, 13 Mar 2006 12:38:38 +0100, Laurent Deniau wrote: >> >> What model you have? > > Right, I forgot to mention that the dispatch model is based on generics > (Dylan, CLOS) at which methods are attached to and uses dynamic method > dispatch through cache (global hash table per thread) plus table lookup > (cache miss) and does not use late binding through precomputed > compressed multiple-entries static tables. I see. >> If dispatching tables are associated with functions >> then it is the dimension of the table to determine the penalty. >> MI, alone, should play no role here, as a guess. > > In fact MI plays a role on both sides: on the dispacher side because > object's offset adjustment must be stored in the cache but also on the > call-site side because the address of the object must be passed to the > dispatcher to apply this offset adjusment, and this (roughly) double the > code generated to dispatch the message. I wouldn't count it on account of dispatch, because even if the target is statically known, you will need to shift the address. In fact, it is type conversion penalty. > Interfaces are related to MI with static typing + late->dynamic binding > and are meaningless when you have dynamic method dispatch. I agree. MI and interfaces (as a case of MI) have sense only if you want to statically constrain classes (set of types.) If there is only one class = everything is derived from one root and everything is treated as such, then MI would make little sense. >> These function act as implicit overrides. So A x A goes to B x B. But if >> f(A,C) gets explicitly overridden, the compiler should require the >> programmer to do it for all other four variants. If f(B,A) get overridden >> then the compiler should require to override f(A,B), f(A,A) too. >> >> With out parameters (mutators) it goes slightly more complex, because safe >> donwwards conversions need to be considered as well, but the principle >> should remain: f(B,B) and f(A,C) never match AxA. > > I have the feeling that you are focusing on static type checking. Right. To me the dynamic case is rather uninteresting, at least it is not so challenging... (:-)) > I > expect that when you say "For safety reasons" you are referring to the > constraints applied to parameters (covariance / contravariance of > types). But these constraints do not apply to multimethod dynamic > dispatch because the overriding mechanism differs from static type > systems. For example, as you say, in a static type system like in C++ or > Java, it is important that f(B,B), f(B,C), f(C,B) use different slots in > the virtual table(s) and do not override each other (limiting somehow > the subtyping polymorphism), while in dynamic dispatch they use the same > slot: f(B,B) overrides f(B,C) which overrides the less specialized > method f(C,B), if we consider my 2nd list of linearization. This is > still safe since the dispatcher always propagates the most derived type > i.e. A. Hmm, I don't think that there is some special "dynamic" way. Whether f(B,B) may override f(B,C) depends solely on the relations between B and C. It depends on whether B is a specialization or generalization if C and on the parameter mode (in, out or in out.) The problem is that substitutability is not checkable either statically or dynamically, so the language should rather rely on the programmer here. But this also means that it should warn the programmer about potentially suspicious cases. Especially for multimethods, I had truly horror stories. One was related to malfunctioned "<" and "=" in a deep types hierarchy. It broke down containers in a very nasty way. Such errors are extremely difficult to track down. So, yes, I do think it must be static, though I cannot tell how... (:-() -- Regards, Dmitry A. Kazakov http://www.dmitry-kazakov.de
From: Laurent Deniau on 14 Mar 2006 12:37
Dmitry A. Kazakov wrote: > On Mon, 13 Mar 2006 18:30:03 +0100, Laurent Deniau wrote: >>Dmitry A. Kazakov wrote: >>>If dispatching tables are associated with functions >>>then it is the dimension of the table to determine the penalty. >>>MI, alone, should play no role here, as a guess. >> >>In fact MI plays a role on both sides: on the dispacher side because >>object's offset adjustment must be stored in the cache but also on the >>call-site side because the address of the object must be passed to the >>dispatcher to apply this offset adjusment, and this (roughly) double the >>code generated to dispatch the message. > > I wouldn't count it on account of dispatch, because even if the target is > statically known, you will need to shift the address. In fact, it is type > conversion penalty. This is on the side of the dispatcher where I agree, object must be shifted according to their (most derived) types and the expected types of the selected method. For the caller side, I am refering to the fact that the address of the objects have to be passed. for SI, the dispatch looks like: return ((generic_type)find_method(generic, obj1, obj2)) (obj1, obj2); but for MI, the dispatch looks like: generic_type mth = (generic_type)find_method(generic, &obj1, &obj2); return mth(obj1, obj2); where find_method returns a pointer to the selected method to invoke. The asm code of the latter is about twice as long as asm code of the former (at least on x86) because both obj and &obj are processed and a sequence point exists between the find and the call to avoid a call with unshifted objects. While for SI, the compiler is free to optimize the all in one shot. Now consider expression like: res = max3(add(o1,o2),sub(o1,o2),sub(o2,o1)); where add, sub, max3 are messages (generics) and you can get a picture of what happens on the caller site. Asm code ratio SI / MI is about 3 because o1, o2 are loaded once in the SI case and three times in the MI case plus extra code to manage required sequence points. >>I >>expect that when you say "For safety reasons" you are referring to the >>constraints applied to parameters (covariance / contravariance of >>types). But these constraints do not apply to multimethod dynamic >>dispatch because the overriding mechanism differs from static type >>systems. For example, as you say, in a static type system like in C++ or >>Java, it is important that f(B,B), f(B,C), f(C,B) use different slots in >>the virtual table(s) and do not override each other (limiting somehow >>the subtyping polymorphism), while in dynamic dispatch they use the same >>slot: f(B,B) overrides f(B,C) which overrides the less specialized >>method f(C,B), if we consider my 2nd list of linearization. This is >>still safe since the dispatcher always propagates the most derived type >>i.e. A. > > > Hmm, I don't think that there is some special "dynamic" way. Whether f(B,B) > may override f(B,C) depends solely on the relations between B and C. It > depends on whether B is a specialization or generalization if C and on the > parameter mode (in, out or in out.) The problem is that substitutability is > not checkable either statically or dynamically, so the language should > rather rely on the programmer here. No, the fact that B inherits from C is checkable at runtime (but not at compile time of course). In fact multimethod dynamic dispatch is safer than say Java interfaces since the method selection is done by the dispatcher which ensures de facto the correctness of the objects types. While in Java interface, the programmer has to check the type of the extra objects passed to the method since only the first one is truly checked by the method dispatch, the static types of the other objects being lost (e.g. compareTo, equalTo, lessThan, copyTo). C++ on his side restricts subtyping polymorphism because static typing cannot ensure the type correctness for such overriding. This is a tradeoff between flexibility and safety, but it has parametric polymorphism to get round this restriction. Eiffel does not have this restriction, but it needs an overall-program analysis to ensure the type correctness with 'like' types, or it introduces some hidden runtime checks like in the Java's generic interfaces implementation. regards, Laurent. |