From: John Mashey on

John Mashey wrote:

> c) I spent some time with David Kay (of XEROX PARC/Smalltalk fame) on
> this, i.e., were there features that would be substantially helpful?

Eliot Miranda correctly points out that I had to mean Alan Kay, not
David; sorry Alan. (David Kay's activities are in a rather different
domain).

From: Eliot Miranda on


John Mashey wrote:
> Eliot Miranda wrote:
>
>>John Mashey wrote:
>
>
>>>Longer answer:
>>>a) They said either give them a complete, tailored solution [which they
>>>didn't expect], or just make the CPU run fast, but don't bother with
>>>minor enhancements. Some said they knew about the SPARC feature, but
>>>didn't use it.
>>
>>This would include Peter Deutsch and the design of HPS, his 2nd
>>dynamic translation (JIT) VM.
>
>
> Lots of good details deleted...
>
> 1) The suggestions would probably fit HP PA better than MIPS, as it has
> extensive "annul-next-instruction" features.

And I was being dense. Skipping is unnecessary. More natural would be
to set condition flags for overflow if the tags were wrong. In most
dynamic languages there will be a test for overflow following the
addition in cases where it's not known whether the tags are correct, so
one can fold the tag test and the overflow test together.

So the instructions would be tagged add/sub with overflow and tagged
compare with overflow. Overflow would be set if the tags didn't match
the tag spec or if the arithmetic overflowed.

Code sequences for polymorphic add/sub or compare would then look like

fetch operand one
fetch operand two
tagged add/sub
branch no overflow Ldone
code for non-tagged & overflow cases (method lookup)
...
Ldone:
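
In C, the intended semantics might be sketched roughly like this (an
illustration only; it assumes a SmallInteger tag of 1 in the low bit,
which is one common convention, not necessarily the one HPS uses):

    #include <stdint.h>
    #include <stdbool.h>

    /* "Tagged add with overflow": returns true ("overflow") if either
       operand has the wrong tag OR the arithmetic overflows, so a
       single branch covers both failure cases. */
    static bool tagged_add(intptr_t a, intptr_t b, intptr_t *sum)
    {
        if ((a & b & 1) == 0)      /* tag check: both low bits set? */
            return true;
        /* Drop one tag bit so the result stays tagged:
           (2x+1) + ((2y+1) - 1) == 2(x+y) + 1 */
        return __builtin_add_overflow(a, b - 1, sum);  /* GCC/Clang builtin */
    }

A caller then mirrors the sequence above: if tagged_add() reports
failure, fall into the method-lookup path.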

--
_______________,,,^..^,,,____________________________
Eliot Miranda Smalltalk - Scene not herd

From: John Mashey on

Eliot Miranda wrote:
> John Mashey wrote:
> Code sequences for polymorphic add/sub or compare would then look like
>
> fetch operand one
> fetch operand two
> tagged add/sub
> branch no overflow Ldone
> code for non-tagged & overflow cases (method lookup)
> ...
> Ldone:

Just out of curiosity, what's known about the dynamic frequencies of
the two cases?

From: Terje Mathisen on

John Mashey wrote:

> Eliot Miranda wrote:
>
>>John Mashey wrote:
>>Code sequences for polymorphic add/sub or compare would then look like
>>
>> fetch operand one
>> fetch operand two
>> tagged add/sub
>> branch no overflow Ldone
>> code for non-tagged & overflow cases (method lookup)
>> ...
>>Ldone:
>
>
> Just out of curiosity, what's known about the dynamic frequencies of
> the two cases?

It doesn't really matter:

The amount of code needed in the overflow case is so vastly larger than
for regular ("small") ints that it will always be a win to do it this way.

I.e. even if you only had 10% small ints, the savings possible by being
able to handle these with 2-3 inline instructions are enough to make it
a good idea.
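
(Illustrative numbers, not measurements: assume a 3-instruction inline
fast path and a slow path of, say, 300 instructions. With 10% small
ints the average is 0.1*3 + 0.9*303 = 273 instructions per operation,
already slightly better than the ~300 of sending everything through the
generic path; with 90% small ints it drops to about 33. The inline test
costs the slow cases almost nothing and saves the fast cases almost
everything.)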

In reality, due to the need for things like array and loop indices,
small adds will probably be a _lot_ more common than the bigint kind.

The x86 INTO-style single-byte trap would be the best possible way to
handle it, as long as you also had the fast user-level trap handler you
wished you had been able to fit into the MIPS architecture:

You could compile code to be nearly as compact as for C ints, keeping
good locality etc., while still handling the bigint operations just a
little bit slower than if you called bigint library functions directly.
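
Sketching the contrast in C rather than machine code (the slow-path
function name is made up for the example):

    #include <stdint.h>

    extern intptr_t slow_path_bigint_add(intptr_t a, intptr_t b); /* hypothetical */

    /* Branch style: every add carries an explicit inline check that
       is wasted work in the ~100% of executions that don't overflow. */
    intptr_t add_branch_style(intptr_t a, intptr_t b)
    {
        intptr_t sum;
        if (__builtin_add_overflow(a, b, &sum))  /* GCC/Clang builtin */
            return slow_path_bigint_add(a, b);
        return sum;
    }

    /* Trap (INTO) style: the inline stream is just the add itself.
       The plain "+" stands in for a trapping add instruction; on
       overflow a fast user-level handler would divert to the same
       slow path, so the fast path is as dense as plain C int code. */
    intptr_t add_trap_style(intptr_t a, intptr_t b)
    {
        return a + b;
    }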

Terje
--
- <Terje.Mathisen(a)hda.hydro.com>
"almost all programming can be viewed as an exercise in caching"

From: John Mashey on

Terje Mathisen wrote:
> John Mashey wrote:
>
> > Eliot Miranda wrote:
> >
> >>John Mashey wrote:
> >>Code sequences for polymorphic add/sub or compare would then look like
> >>
> >> fetch operand one
> >> fetch operand two
> >> tagged add/sub
> >> branch no overflow Ldone
> >> code for non-tagged & overflow cases (method lookup)
> >> ...
> >>Ldone:
> >
> >
> > Just out of curiosity, what's known about the dynamic frequencies of
> > the two cases?
>
> It doesn't really matter:

Well, actually it does (in general, perhaps not in this case), which is
why I asked.
Statistical frequencies *always* matter, at least in the sense that
while it may not matter much if something happens 45 or 50% of the
time, it may matter whether it happens 10% or 1% or .1% or .0000001%.

For example, this whole thread started on overflow. Many vendors chose
to have condition codes and branch-on-overflow, which in fact meant
that in the ~100% of executions that did not overflow, you either paid
for a bunch of useless branch tests or left them out entirely. A few
of us decided that trap-on-overflow was perfectly adequate, even if it
meant that the handling was more complex when it did happen.

Anyway, philosophically, I suspect that in this case you are right,
BUT:
a) Human intuition is notoriously bad about such things.
b) Given a choice of guessing about numbers or {measuring | asking
somebody who knows}, I'll take the latter. Guessing is only for when
you just can't get the data at reasonable cost or in reasonable time.
c) This class of feature, and the choice of the specific design, turn
out to depend a great deal on:
- The existing ISA, if any, that one is adding onto. Specifically,
branch architectures differ quite a bit, and the approach for this
would likely differ.
- The feature design would also likely depend somewhat on the sorts of
pipeline implementations you expect, say roughly: single-issue,
multiple-issue in-order, and multiple-issue out-of-order, as well as
the general approach to branch prediction.

So anyway, what do we actually *know* about the relative frequencies?
