From: Robert Myers on
I am struck by the success of a relatively clueless strategy in computer
microarchitecture:

If it worked more than once, it will probably work again. The only
possible improvement is to learn to anticipate exceptions.

I'd call that bottom up prediction.
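
To make the cluelessness concrete, here is a toy two-bit saturating
counter in C++ (the names are mine, and it is no particular machine's
hardware); it is about as literal a rendering of
if-it-worked-more-than-once-it-will-probably-work-again as one can write:

    #include <cstdint>
    #include <cstdio>

    // Hypothetical two-bit saturating counter: keep doing what worked,
    // and only change your mind after being wrong twice in a row.
    struct TwoBitPredictor {
        uint8_t state = 2;            // 0,1 predict not-taken; 2,3 predict taken

        bool predict() const { return state >= 2; }

        void update(bool taken) {     // learn from the actual outcome
            if (taken  && state < 3) ++state;
            if (!taken && state > 0) --state;
        }
    };

    int main()
    {
        TwoBitPredictor p;
        for (bool actual : {true, true, false, true, true, true}) {
            std::printf("predict %d, actual %d\n", p.predict(), actual);
            p.update(actual);
        }
    }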

Most top-down prediction schemes (I think I understand the computer and
the program and I'll tell the computer what to expect) have been
relative failures (most notoriously: Itanium).

The top-down approach has been a similar disappointment in artificial
intelligence: I understand logic and I'll teach the computer how to do
what I think I know how to do.

If your instinct at this point is to dash off a post that I'm simply
over-generalizing, save yourself the time and stop reading here.

Most biological control systems and especially central nervous systems
have severe and apparently insurmountable latency problems. In such a
situation, the only possible strategy is to learn to anticipate. One
justification for watching sports closely (especially basketball, where
anticipation occurs in less than the blink of an eye) is to observe just
how good biological systems are at anticipation.

First, let me bow obsequiously to the computer architects and those who
support them who have just kept trying things until they found things
that actually worked. Thomas Edison would be proud.

My if-it-worked-more-than-once-it-will-probably-work-again machinery
thinks it sees a pattern:

1. Anticipation is central to (almost) all cognitive and computational
processes.

2. Anticipation cannot be anticipated. It must be learned.

3. The agenda of computer architecture, which is far from dead, is to
keep working the if-it-worked-more-than-once-it-will-probably-work-again
angle harder and harder. That is, after all, practically all the human
central nervous system knows how to do (I think).

If all of this seems patently obvious, feel free to say so. It has only
slowly dawned on me.

Robert.
From: Torben Ægidius Mogensen on
nmm1(a)cam.ac.uk writes:

> In article <5R4Yn.6004$cO.5854(a)newsfe09.iad>,
> Robert Myers <rbmyersusa(a)gmail.com> wrote:
>>Terje Mathisen wrote:
>>
>>> Experience has shown us that dynamic (i.e. learning) predictors perform
>>> far better than static compiler hints. :-)
>>
>>Was obvious to all right along, no? ;-)
>
> Well, yes and no. I don't actually think that it is right, as a
> universal rule.
>
> The point is that it has been known since time immemorial that
> highly disciplined languages enable much better static prediction.
> It always was obvious that static prediction was a clear loser in
> the sort of C/C++/etc. spaghetti that dominates so many areas at
> present. What wasn't clear (to me, at least) is whether dynamic
> prediction would be enough better to be worth the extra complexity.
> Well, the answer is that it is ....
>
> Also, all of the current approaches to static prediction that I
> know of seem to have been designed by hardware people living in
> an ivory tower that hasn't had any contact with the programmers
> since about 1960. In particular, using the program locality as
> the primary key is obviously inane, once you start using an
> 'object-oriented' approach.

OO is notoriously difficult to statically analyse since it has a very
dynamic execution model. If you add reflection and dynamic class
loading, it gets even worse.

The most common approaches to optimising OO programs are, hence, either to
force things to be more static by requiring final declarations and such,
or to compile first under the assumption that things are static and
recompile if that turns out not to be true. With JIT compilers,
recompilation can happen several times during runtime.
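
A trivial C++ illustration of the first route (the class and names are
my own invention, purely for the sake of the example):

    #include <cstdio>

    struct Shape {
        virtual double area() const = 0;
        virtual ~Shape() = default;
    };

    struct Circle final : Shape {       // 'final': no further overriding possible
        double r;
        explicit Circle(double r) : r(r) {}
        double area() const override { return 3.14159 * r * r; }
    };

    double twice_area(const Circle& c)
    {
        // The exact target is known statically, so the compiler can
        // devirtualise: no indirect branch is left for the hardware to predict.
        return 2 * c.area();
    }

    int main() { std::printf("%f\n", twice_area(Circle{2.0})); }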

While this can work (and does), it is hugely complicated and all just to
support a programming paradigm that has questionable benefits (and many
pitfalls).

> I think that we could do a lot better if we approached static
> prediction more intelligently, even with current programs, and
> potentially better even than current dynamic prediction with
> improved programming paradigms.

Quite so. Dynamic prediction has the disadvantage that it (for
practical reasons) can only look at a small fraction of the execution
history (and only the past). Static prediction can model the entire
execution, past, present and future, albeit with some approximations (due
to incomputability), so it can, theoretically, make better predictions.
Even better, you can have languages that guarantee certain properties,
so you don't even have to predict these.

But, overall, I would guess a combination of static and dynamic
prediction will be the best solution. This way, you can combine exact
knowledge of recent history with inexact knowledge of the complete
execution to get better results than either can achieve on its own.
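
A rough sketch of what I mean, entirely my own construction and not any
shipped design: use the static hint while the branch is cold, then trust
the dynamic counter.

    #include <cstdint>

    // Use the compiler's static hint until the branch has been seen,
    // then trust a two-bit counter tracking recent history.
    struct HybridPredictor {
        uint8_t counter = 2;     // weakly-taken dynamic state
        bool static_hint;        // from whole-program (static) analysis
        bool seen = false;

        explicit HybridPredictor(bool hint) : static_hint(hint) {}

        bool predict() const { return seen ? counter >= 2 : static_hint; }

        void update(bool taken)
        {
            seen = true;
            if (taken  && counter < 3) ++counter;
            if (!taken && counter > 0) --counter;
        }
    };

    int main()
    {
        HybridPredictor p(/*hint=*/true);   // compiler said "probably taken"
        bool cold = p.predict();            // true, from the hint (no history yet)
        p.update(false); p.update(false);
        bool warm = p.predict();            // false, now from the counter
        return cold && !warm ? 0 : 1;
    }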

Torben

From: nmm1 on
In article <7zr5jie4mm.fsf(a)ask.diku.dk>,
Torben Ægidius Mogensen <torbenm(a)diku.dk> wrote:
>
>OO is notoriously difficult to statically analyse since it has a very
>dynamic execution model. If you add reflection and dynamic class
>loading, it gets even worse.

Precisely.

>The most common approaches to optimising OO programs are, hence, either to
>force things to be more static by requiring final declarations and such,
>or to compile first under the assumption that things are static and
>recompile if that turns out not to be true. With JIT compilers,
>recompilation can happen several times during runtime.

Horrible hacks, both of them :-(

>While this can work (and does), it is hugely complicated and all just to
>support a programming paradigm that has questionable benefits (and many
>pitfalls).

Agreed. However, the disadvantages arise primarily when it is used for
a problem that is not naturally object-oriented - for one that is,
its benefits are fairly clear. Surprise, surprise. Exactly the same
is true of most other paradigms (mutatis mutandis) ....

>Even better, you can have languages that guarantee certain properties,
>so you don't even have to predict these.

Which, as you may know, is one of my hobby-horses! That is a key,
not just to performance, but to the RAS of large programs. Modern
Fortran has quite a lot of that.

>But, overall, I would guess a combination of static and dynamic
>prediction will be the best solution. This way, you can combine exact
>knowledge of recent history with inexact knowledge of the complete
>execution to get better results than either can achieve on its own.

Agreed. As I have posted (repeatedly), there is a much simpler
approach to branch prediction for object-oriented codes, where
the compiler indicates the object register and which branches are
likely to be associated with which others. As far as I know, that
has never been investigated, and I would be interested to hear if
it has.


Regards,
Nick Maclaren.
From: jacko on
<snip>

> Agreed. As I have posted (repeatedly), there is a much simpler
> approach to branch prediction for object-oriented codes, where
> the compiler indicates the object register and which branches are
> likely to be associated with which others. As far as I know, that
> has never been investigated, and I would be interested to hear if
> it has.
>
> Regards,
> Nick Maclaren.

As well as the object register, it is also useful to mark a threading
register and remove all the jsr/call instructions (smaller code means a
lower cache load). The problem of an indirect jsr/call with the address
held in a register is partially solved by stacking the return IP on the
return stack and testing the contents of each new IP fetch against zero:
on non-zero, stack and take the new IP; on zero, jmp IP+1, where some
machine code is executed, after which the IP is unstacked and control
jumps back to the threader.

This is a tight loop with a small fixed number of static branches, and
the jmp IP+1 can be effected by doing the stacking always and doing
jmp (SP++) = RET/RTS: a call to the threading routine at the end of the
code!

A separate evaluation stack would be useful in another register.
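
For what it is worth, a loose C++ model of that inner loop (all names
invented; a function pointer stands in for the inline machine code, and
the always-stack/RET trick is left out):

    #include <cstdio>
    #include <vector>

    using Prim = void (*)(std::vector<long>&);   // stands in for inline machine code

    struct Word {
        Prim code;                        // non-null: a primitive (the "zero" case)
        std::vector<const Word*> thread;  // otherwise: a list of further words
    };

    // Inner interpreter: no jsr/call, just an IP, a return stack and one loop.
    void threader(const Word* start, std::vector<long>& data)
    {
        struct Frame { const std::vector<const Word*>* t; size_t ip; };
        std::vector<Frame> rstack = { { &start->thread, 0 } };
        while (!rstack.empty()) {
            Frame& f = rstack.back();
            if (f.ip == f.t->size()) { rstack.pop_back(); continue; }  // unstack IP
            const Word* w = (*f.t)[f.ip++];
            if (w->code)
                w->code(data);                        // "machine code", then back here
            else
                rstack.push_back({ &w->thread, 0 });  // stack IP, take the new thread
        }
    }

    // Tiny demo: push 2, push 3, add.
    void lit2(std::vector<long>& s) { s.push_back(2); }
    void lit3(std::vector<long>& s) { s.push_back(3); }
    void plus(std::vector<long>& s) { long b = s.back(); s.pop_back(); s.back() += b; }

    int main()
    {
        Word two{lit2, {}}, three{lit3, {}}, add{plus, {}};
        Word prog{nullptr, {&two, &three, &add}};
        std::vector<long> data;                       // the separate evaluation stack
        threader(&prog, data);
        std::printf("%ld\n", data.back());            // prints 5
    }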

Cheers Jacko

http://acebforth.googlecode.com is coming along with the subroutined
primitive model.