From: Brett Davis on
> > Programming hotshots have done so much damage.
>
> Damage?
> That is clean code that is easy to read and understand.
>
> > And they brag about it.
>
> Only one in a hundred programmers know an optimization like that, for
> half of comp.arch to be that good says good things about comp.arch.
>
> > I watched some doing one-upsmanship while the earth was still being
> > created, and I decided I wanted nothing to do with it. I think I
> > showed good judgment (rare for me).
>
> I thought I was generous giving away top secrets that most everyone
> else hoards.
>
> If you remember the story about the programmer told to add a cheat
> to the blackjack program so the customer would always win...

The Story of Mel, a Real Programmer
http://www.cs.utah.edu/~elb/folklore/mel.html

> I am not him, my code is clear and readable.
>
> Brett ;)
From: nmm1 on
In article <4BCB4C2A.8080601(a)patten-glew.net>,
Andy "Krazy" Glew <ag-news(a)patten-glew.net> wrote:
>On 4/18/2010 1:36 AM, nmm1(a)cam.ac.uk wrote:
>> As I have
>> posted before, I favour a heterogeneous design on-chip:
>>
>> Essentially uninterruptible, user-mode only, out-of-order CPUs
>> for applications etc.
>> Interruptible, system-mode capable, in-order CPUs for the kernel
>> and its daemons.
>
>This is almost opposite what I would expect.
>
>Out-of-order tends to benefit OS code more than many user
>codes. In-order coherent threading benefits mainly fairly stupid codes
>that run in user space, like multimedia.
>
>I would guess that you are motivated by something like the following:
>
>System code tends to have unpredictable branches, which hurt many OOO
>machines.
>
>You may want system code to be able to respond to interrupts easily. I
>am guessing that you believe that OOO has worse interrupt latency. That
>is a misconception: OOO tends to have better interrupt latency, since
>they usually redirect to the interrupt handler at retirement. However,
>they lose more work.

No, not at all. You are thinking performance - I am thinking RAS.

Trying to get asynchronous and parallel code with a lot of subtle
interactions (which is the case with many kernels) to work at all
is hard; doing it with highly out-of-order CPUs is murder. Most
shared-memory parallel codes (kernel and other) have lots of race
conditions that don't show up because the synchronisation time is
short compared with the time between the critical events.

However, when one application hammers the CPU hard, that can cause
large delays for OTHER threads (including kernel ones). As I said
earlier, I have seen 5 seconds delay in memory consistency. The
result is that you get very low probability, load-dependent,
non-repeatable failures. Ugh.


Regards,
Nick Maclaren.
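
The load-dependent race described above can be sketched as a small deterministic simulation. This is only an illustration under stated assumptions: the function name `final_count`, the tick-based schedule, and the `stall` and `gap` parameters are hypothetical and do not model any real kernel or memory system. Two unsynchronised "threads" each increment a shared counter; the bug stays hidden while the read-to-write window is short compared with the gap between the two events, and shows up only when one thread is stalled.

```python
def final_count(stall, gap):
    """Replay two unsynchronised increments of a shared counter.

    stall: extra ticks thread A is delayed between its read and its write
           (a stand-in for delay caused by another application's load).
    gap:   tick at which thread B starts its own read-modify-write.
    Returns the final counter: 2 is correct, 1 means an update was lost.
    """
    shared = {"counter": 0}
    local = {}  # each thread's privately held copy of the counter

    def read(name):
        local[name] = shared["counter"]

    def write(name):
        shared["counter"] = local[name] + 1

    # (time, action, thread); the list order breaks ties because sorted() is stable.
    schedule = [
        (0, read, "A"),
        (1 + stall, write, "A"),
        (gap, read, "B"),
        (gap + 1, write, "B"),
    ]
    for _, step, name in sorted(schedule, key=lambda event: event[0]):
        step(name)
    return shared["counter"]
```

With `final_count(0, 1000)` the window is one tick wide and the events are far apart, so the missing lock never matters. With `final_count(5000, 1000)` thread A's write lands after B's whole update and overwrites it, which is the kind of failure that appears only under load.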
From: nmm1 on
In article <75d79415-bc83-4989-80c4-610acae0942c(a)12g2000yqi.googlegroups.com>,
MitchAlsup <MitchAlsup(a)aol.com> wrote:
>On Apr 18, 10:03 am, Robert Myers <rbmyers...(a)gmail.com> wrote:
>>
>> My assumption, backed by no evidence, is that HP/Intel kept adding
>> "features" to get the architecture to perform as they had hoped until
>> the architecture was sunk by its own features.
>>
>> You think the problem is fundamental. I think the problem is
>> fundamental only because of the way that code is written, in a
>> language that leaves the compiler to do too much guessing for the idea
>> to have even a hope of working at all.
>
>I think the problem was/is fundamentally a political issue with the
>leadership of the design teams, especially in the ability of the
>leadership to say "No, let us not dedicate or expend resources
>investigating that corner of the design space."

Yes and no. That was definitely the cause, but the missing ability
was to ask "Hang on. Is what we are assuming really true?"


Regards,
Nick Maclaren.
From: nmm1 on
In article <3782bf12-b3f5-4003-94a9-0299859358ed(a)y17g2000yqd.googlegroups.com>,
MitchAlsup <MitchAlsup(a)aol.com> wrote:
>On Apr 18, 1:15 pm, "Andy \"Krazy\" Glew" <ag-n...(a)patten-glew.net>
>wrote:
>
>> System code tends to have unpredictable branches, which hurt many OOO machines.
>
>I think it is easier to think that system codes have so much inherent
>serializations that the efforts applied in doing OoO are "for naught"
>and that these great big OoO machines degrade down to just about the
>same performance as the absolutely in-order cousins.
>
>It's a far bigger issue than simple branch mispredictability. Pointer
>chasing into poorly cached data structures is rampant; "dangerous"
>instructions that are inherently serialized; and poor TLB translation
>success rates. Overall, there just is not that much ILP left in many
>of the paths through system codes.

That was the experience in the days of the System/370. User code
got a factor of two better ILP than system code.


Regards,
Nick Maclaren.
From: nmm1 on
In article <8u3s97-9bt2.ln1(a)ntp.tmsw.no>,
Terje Mathisen <"terje.mathisen at tmsw.no"> wrote:
>nmm1(a)cam.ac.uk wrote:
>> Well, yes, but that's no different from any other choice. As I have
>> posted before, I favour a heterogeneous design on-chip:
>>
>> Essentially uninterruptible, user-mode only, out-of-order CPUs
>> for applications etc.
>> Interruptible, system-mode capable, in-order CPUs for the kernel
>> and its daemons.
>
>This forces the OS to effectively become a message-passing system, since
>every single os call would otherwise require a pair of migrations
>between the two types of cpus.
>
>I'm not saying this would be bad though, since actual data could still
>be passed as pointers...

Yup. In my view, interrupts are doubleplus ungood - message passing
is good.

And, of course, the memory could be shared at some fairly close
cache level.



Regards,
Nick Maclaren.
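
The message-passing arrangement discussed above, with data passed as pointers, might look roughly like the following. This is a minimal sketch, not a real OS interface: `kernel_cpu`, `syscall`, `demo`, and the `"fill"` operation are hypothetical names, and two Python threads stand in for the two kinds of CPU. The caller posts a request and waits for a reply instead of migrating, and only a reference to the buffer travels in the message.

```python
import queue
import threading


def kernel_cpu(requests):
    """Kernel-side loop: serve request messages until a None sentinel arrives."""
    while True:
        msg = requests.get()
        if msg is None:
            break
        op, buf, reply = msg
        if op == "fill":
            # The buffer is shared by reference; only the small message moved.
            for i in range(len(buf)):
                buf[i] = i & 0xFF
            reply.put(len(buf))
        else:
            reply.put(-1)  # unknown operation


def syscall(requests, op, buf):
    """User-side call: post a message and block on the reply."""
    reply = queue.Queue(maxsize=1)
    requests.put((op, buf, reply))
    return reply.get()


def demo():
    requests = queue.Queue()
    kernel = threading.Thread(target=kernel_cpu, args=(requests,))
    kernel.start()
    buf = bytearray(4)
    n = syscall(requests, "fill", buf)
    requests.put(None)  # tell the kernel loop to stop
    kernel.join()
    return n, bytes(buf)
```

Calling `demo()` fills the caller's own buffer in place, so the "kernel" never copies the data back; the queues carry only the operation name, the buffer reference, and the reply channel.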