From: nmm1 on
In article <8b4eibF2ppU1(a)mid.individual.net>,
Andrew Reilly <areilly---(a)bigpond.net.au> wrote:
>On Sat, 24 Jul 2010 16:52:22 -0700, MitchAlsup wrote:
>
>> I think what Robert is getting at is that lumping everything under a
>> coherent cache is running into a vonNeumann wall.
>
>Coherence is clearly complicated, but it doesn't seem necessarily to be
>sequential. Are there theoretical limits to how parallelisable coherence
>can be? Is the main issue speed-of-light limits to round-trip
>communication between distributed cache controllers?

Yes, and no, respectively. However, the theoretical limits that I
know of in this area are much weaker than the practical ones.


Regards,
Nick Maclaren.
From: nmm1 on
In article <70ath7-gh8.ln1(a)ntp.tmsw.no>,
Terje Mathisen <"terje.mathisen at tmsw.no"> wrote:
>Robert Myers wrote:
>> Maybe you want more programmable control over coherence domains. If
>> you're not going to scrap cache and cache snooping, maybe you can
>> wrestle some control away from the hardware and give it to the
>> software.
>
>That sounds like software-controlled distributed shared memory, a
>concept that generates a lot more research papers and PhDs than actual
>useful products, at least so far. :-(

I believe that tackling it as a "computer science" problem is a large
part of the reason that it has never got anywhere. The thesis posted
later is a fairly typical example of the better research - let's skip
over the worse research, holding our noses and averting our gaze!
The killer isn't that it wouldn't work. The killer is how to map a
sufficient class of problems to it to make it worthwhile - the three
examples used are all well-known to be easily optimised by a wide
range of architectures (parallel and other). And, like Robert, I
don't see it doing so AS IT STANDS, though it might well be a
starting point for a viable design.

I believe that something COULD be done, but I don't believe that
anything WILL be done for the foreseeable future. Benchmarketing and
existing spaghetti code rule too much decision making. Also, as I
have posted ad tedium, the architecture is of little use without
tackling the programming paradigms used.


Regards,
Nick Maclaren.
From: Rick Jones on
Brett Davis <ggtgp(a)yahoo.com> wrote:
> I have never in my life seen a compiler issue a PREFETCH instruction.
> I have several times mocked the usefulness of PREFETCH as implemented
> for CPUs in the embedded market. (Locking up one of the two read
> ports makes good performance impossible without resorting to assembly.)

> I would think that the fetch ahead engine on high end x86 and POWER
> would make PREFETCH just about as useless, except to prime the pump
> at the start of a new data set being streamed in.

> How is PREFETCH used by which compilers today?

Exactly how it is used I do not know, but this:

http://www.spec.org/cpu2006/results/res2010q1/cpu2006-20100301-09740.html

and the linked description of the -opt-prefetch flag:

http://www.spec.org/cpu2006/results/res2010q1/cpu2006-20100301-09740.flags.html#user_CXXbase_f-opt-prefetch

suggests that compilers do have that feature.

rick jones
--
portable adj, code that compiles under more than one compiler
these opinions are mine, all mine; HP might not want them anyway... :)
feel free to post, OR email to rick.jones2 in hp.com but NOT BOTH...
From: nmm1 on
In article <i2kft0$9p7$1(a)usenet01.boi.hp.com>,
Rick Jones <rick.jones2(a)hp.com> wrote:
>Brett Davis <ggtgp(a)yahoo.com> wrote:
>> I have never in my life seen a compiler issue a PREFETCH instruction.
>> I have several times mocked the usefulness of PREFETCH as implemented
>> for CPUs in the embedded market. (Locking up one of the two read
>> ports makes good performance impossible without resorting to assembly.)
>
>> I would think that the fetch ahead engine on high end x86 and POWER
>> would make PREFETCH just about as useless, except to prime the pump
>> at the start of a new data set being streamed in.
>
>> How is PREFETCH used by which compilers today?
>
>Exactly how it is used I do not know, but this:
>
>http://www.spec.org/cpu2006/results/res2010q1/cpu2006-20100301-09740.html
>
>and the linked description of the -opt-prefetch flag:
>
>http://www.spec.org/cpu2006/results/res2010q1/cpu2006-20100301-09740.flags.html#user_CXXbase_f-opt-prefetch
>
>suggests that compilers do have that feature.

Don't believe everything that you are told! I have set such flags
for several compilers on several architectures, and looked for the
inserted instructions. Sometimes they are inserted, but often not.
My tests included comparing the sizes of a large number of modules
of typical scientific code, and giving the compilers trivial examples
that were ideally suited to the technique.
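
For illustration only, a trivial test case of the kind described might
look like the sketch below. It is not the code used in the tests above.
It assumes GCC or Clang, whose __builtin_prefetch builtin is used for the
hand-written variant; the names sum_plain, sum_prefetch and PREFETCH_AHEAD
are hypothetical, and the prefetch distance is a guess that would need
tuning per machine.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical prefetch distance, in elements. The best value is
   machine-dependent; 64 is only a placeholder. */
#define PREFETCH_AHEAD 64

/* Plain streaming reduction: a unit-stride loop that a compiler's
   prefetch-insertion pass should find easy, if it does anything at all. */
double sum_plain(const double *a, size_t n)
{
    assert(a != NULL || n == 0);
    double s = 0.0;
    for (size_t i = 0; i < n; i++)
        s += a[i];
    return s;
}

/* The same loop with an explicit hint. __builtin_prefetch(addr, rw, locality)
   is a GCC/Clang builtin: rw = 0 means read, locality = 0 means the data
   need not stay in cache after use. It is only a hint and never changes
   the result. */
double sum_prefetch(const double *a, size_t n)
{
    assert(a != NULL || n == 0);
    double s = 0.0;
    for (size_t i = 0; i < n; i++) {
        /* Stay inside the array so no out-of-bounds address is formed. */
        if (i + PREFETCH_AHEAD < n)
            __builtin_prefetch(&a[i + PREFETCH_AHEAD], 0, 0);
        s += a[i];
    }
    return s;
}
```

One way to check what the compiler really did is to generate assembly
(for example with gcc -O2 -fprefetch-loop-arrays -S) and search the body
of sum_plain for prefetch mnemonics. Whether any appear will depend on
the compiler version and the target.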

A suspicious and cynical old sod, aren't I?


Regards,
Nick Maclaren.
From: George Neuner on
On Sun, 25 Jul 2010 10:42:12 +0200, Terje Mathisen <"terje.mathisen at
tmsw.no"> wrote:

>Robert Myers wrote:
>> Maybe you want more programmable control over coherence domains. If
>> you're not going to scrap cache and cache snooping, maybe you can
>> wrestle some control away from the hardware and give it to the
>> software.
>
>That sounds like software-controlled distributed shared memory, a
>concept that generates a lot more research papers and PhDs than actual
>useful products, at least so far. :-(

The hardware-controlled version, the KSR-1, went belly up.

George