From: Anton Ertl on
"Andy \"Krazy\" Glew" <ag-news(a)patten-glew.net> writes:
>mainframes and minicomputers. It's not clear if we ever really
>surpassed them - while I think that the most recent members of the Intel
>P6 family surpassed the most advanced IBM mainframe processors people
>rumor about Poughkeepsie, I'm not sure.

That probably depends on what you use for comparison.

If you use SPEC CPU benchmarks, I think you passed them quite a while
ago; why do I think so?

1) If they were ahead, they would post SPEC CPU numbers.

2) What I read about their CPUs did not sound like they would excel at
this benchmark. At the start of 2003 they had the 917MHz single-issue
z900, while Intel had started shipping superscalar processors in 1993,
had reached 1GHz in 2000, and was at 3GHz at that time. Then in 2003
IBM introduced the z990 at 1.2GHz (don't know about the issue width).
In recent years they have taken a similar implementation approach to
the Power 6 (high clock rates, in-order superscalar); they are
probably more conservative than Power 6 and therefore will not exceed
the Power 6 performance (which is about comparable per core to the
Xeons on the SPEC CPU rate benchmarks).

If, OTOH, you try to compare on the mainframes' terms (i.e., running
IBM mainframe software at high reliability), there is no contest.

>Mike Haertel says that the big value of Atom was allowing Intel to take
>a step back, and then get onto the Moore's Law curve again, for a few years.

Intel CPU dies double the transistor count pretty regularly, so they
seem to be on the Moore's law curve. The Atom has fewer transistors
per die than other Intel CPUs, so it will be a little off the Moore's
law curve.

If you are thinking about the non-Moore's-law of clock rate doubling
at every process shrink, do you think that we will see that again with
Atom?

- anton
--
M. Anton Ertl Some things have to be seen to be believed
anton(a)mips.complang.tuwien.ac.at Most things have to be believed to be seen
http://www.complang.tuwien.ac.at/anton/home.html
From: nmm1 on
In article <KuOdnZwrfuZmbV_XnZ2dnUVZ_g-dnZ2d(a)giganews.com>,
Andy \"Krazy\" Glew <ag-news(a)patten-glew.net> wrote:
>Mayan Moudgill wrote:
>
>> There is less low-hanging fruit around; most of the simpler and
>> obviously beneficial ideas are known, and most other ideas are more
>> complex and harder to explain/utilize.
>
>I believe that there are good new ideas, in both single processor
>microarchitecture, and in multiprocessor.
>
>But we are in a period of retrenchment - one of the downward zigs of the
>"sawtooth wave" that I described in my Stanford EE380 talk so many years
>ago.

And have been for a LONG time :-(

All right, at the low-level (microarchitecture) front, that is not
true - that is where the action has been, and I am not denying that
some of it has been impressive. Whether it has been a good idea is
another matter ....

But, at the higher levels (i.e., the program-visible functionality),
we have seen very, very little innovation since the 1980s. What
there has been has almost always been the reintroduction of established
mainframe technologies onto microprocessors.


Regards,
Nick Maclaren.
From: nmm1 on
In article <4AC2F581.7060508(a)patten-glew.net>,
Andy \"Krazy\" Glew <ag-news(a)patten-glew.net> wrote:
>Tim McCaffrey wrote:
>> In article
>> <da524b6d-bc4d-4ad7-9786-3672f7e9e52c(a)j19g2000yqk.googlegroups.com>,
>> MitchAlsup(a)aol.com says...
>>>> On Sep 10, 10:04 pm, Mayan Moudgill <ma...(a)bestweb.net> wrote:
>>>> Well, synchronization can be pretty easy to implement - depends on what
>>>> you are trying to accomplish with it (barriers, exclusion, queues,
>>>> etc.).
>>> If it is so easy to implement then why are (almost) all
>>> synchronization models at least BigO( n**2 ) in time? per unit of
>>> observation. That is, it takes a minimum of n**2 memory accesses for 1
>>> processor to recognize that it is the processor that can attempt to
>>> make forward progress amongst n contending processors/threads.
>
>Although my MS thesis was one of the first to make this observation of
>O(n^2) work, it also points out that there are O(1) algorithms, chief
>among them the queue-based locks. I liked Graunke/Thakkar, but MCS gets
>the acclaim.
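
(Aside, since queue locks keep coming up: a minimal MCS-style sketch,
purely my own illustration rather than anybody's thesis code, assuming
GCC/Clang __atomic builtins and one qnode per contending thread.  The
point is that each waiter spins on a flag in its *own* node, so a
contended handoff costs O(1) remote accesses instead of all n waiters
pounding one shared location.)

    #include <stdbool.h>
    #include <stddef.h>

    struct qnode {
        struct qnode *next;
        bool          locked;       /* true while this waiter must spin */
    };

    typedef struct { struct qnode *tail; } mcs_lock;   /* NULL == free */

    void mcs_acquire(mcs_lock *lk, struct qnode *me)
    {
        me->next   = NULL;
        me->locked = true;
        /* Atomically append ourselves to the tail of the waiter queue. */
        struct qnode *pred =
            __atomic_exchange_n(&lk->tail, me, __ATOMIC_ACQ_REL);
        if (pred != NULL) {
            /* Lock is held or contended: link in and spin locally. */
            __atomic_store_n(&pred->next, me, __ATOMIC_RELEASE);
            while (__atomic_load_n(&me->locked, __ATOMIC_ACQUIRE))
                ;                   /* spin on our own cache line only */
        }
    }

    void mcs_release(mcs_lock *lk, struct qnode *me)
    {
        struct qnode *succ = __atomic_load_n(&me->next, __ATOMIC_ACQUIRE);
        if (succ == NULL) {
            /* No visible successor: try to swing tail back to NULL. */
            struct qnode *expect = me;
            if (__atomic_compare_exchange_n(&lk->tail, &expect, NULL, false,
                                            __ATOMIC_ACQ_REL,
                                            __ATOMIC_ACQUIRE))
                return;             /* queue empty, lock is now free */
            /* A successor is in the middle of linking in; wait for it. */
            while ((succ = __atomic_load_n(&me->next,
                                           __ATOMIC_ACQUIRE)) == NULL)
                ;
        }
        /* Hand the lock directly to the next waiter in line. */
        __atomic_store_n(&succ->locked, false, __ATOMIC_RELEASE);
    }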

When was that? I have no idea when or where I first heard it, but
it was common knowledge by the early 1970s and (from some of the
contexts I heard it in) must have been by the mid-1960s at the very
latest. One of those contexts was the shared disk/database arena,
which was very active in the late 1960s and early 1970s.

Whether anyone actually wrote it down is another matter entirely!


Regards,
Nick Maclaren.
From: "Andy "Krazy" Glew" on
nmm1(a)cam.ac.uk wrote:
> In article <KuOdnZwrfuZmbV_XnZ2dnUVZ_g-dnZ2d(a)giganews.com>,
> Andy \"Krazy\" Glew <ag-news(a)patten-glew.net> wrote:
>> But we are in a period of retrenchment - one of the downward zigs of the
>> "sawtooth wave" that I described in my Stanford EE380 talk so many years
>> ago.
>
> And have been for a LONG time :-(
>
> All right, at the low-level (microarchitecture) front, that is not
> true - that is where the action has been, and I am not denying that
> some of it has been impressive. Whether it has been a good idea is
> another matter ....
>
> But, at the higher levels (i.e., the program-visible functionality),
> we have seen very, very little innovation since the 1980s. What
> there has been has almost always been the reintroduction of established
> mainframe technologies onto microprocessors.

I tend to agree with you, Nick. (See, it happens.)

I might not be so sweeping as you. I think that there has been some
innovation in, e.g., programming languages. I think that techniques from
Literate Programming / Aspect Oriented Programming, etc., have hopes of
making it easier for programmers to express and manage parallelism. I
think that the introduction of generics - e.g. C++ templates and the STL
- may be one of the most important things in the medium term.

But the pace of such introductions is slow and, I suspect, generational.
The roots of these technologies can often be traced back into earlier
eras.

Now, you may not have been talking about programming languages. You may
have been talking about, e.g., programmer-visible hardware techniques
like synchronization primitives. Again, glacial progress. In fact, I
may share some blame: I have often said "Don't add <some Joe Ph.D.'s
great idea wrt synchronization> to the instruction set, until we have
some experience with many-processor systems for the mass market, to know
that the bottleneck he solves is real." But back in 1991 we expected
dual cores to dominate the desktop by 1998, and we expected integrated
4-cores to be ubiquitous by 2000. Who knew that it was going to take so
long? Who knew that frequency and core bloat would delay multicore by 10
years? Mainly, who knew that the computer industry could only handle
evolution on one front at a time: OOO, VLIW, high frequency, and then
finally multicore?

Nevertheless, I still think that conservatism in introducing new ISA
features is not necessarily bad. How often does somebody propose a new
ISA feature, like Vernon/Goodman's QOLB instructions, only to have
somebody like Alain Kagi come along a few years later and show that you
can avoid the problem without a new ISA, via delays?
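
(To make the "delays" point concrete: the following is a sketch of the
general idea -- an ordinary test-and-test-and-set lock with exponential
backoff, which needs no new instructions at all.  This is my own
illustration, not Kagi's code; it assumes GCC/Clang __atomic builtins,
and the backoff constants are arbitrary.)

    #include <stdint.h>

    typedef struct { int held; } tts_lock;       /* 0 = free, 1 = held */

    static void delay(uint32_t spins)
    {
        for (volatile uint32_t i = 0; i < spins; i++)
            ;                                     /* burn time off the bus */
    }

    void tts_acquire(tts_lock *lk)
    {
        uint32_t backoff = 16;                    /* arbitrary starting delay */
        for (;;) {
            /* Spin read-only in our own cache until the lock looks free. */
            while (__atomic_load_n(&lk->held, __ATOMIC_RELAXED))
                ;
            /* One attempt to take it. */
            if (!__atomic_exchange_n(&lk->held, 1, __ATOMIC_ACQUIRE))
                return;
            /* Lost the race: back off so the losers do not generate
               coherence traffic that slows the owner and the next winner. */
            delay(backoff);
            if (backoff < (1u << 16))
                backoff *= 2;
        }
    }

    void tts_release(tts_lock *lk)
    {
        __atomic_store_n(&lk->held, 0, __ATOMIC_RELEASE);
    }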

Here's a Glew Rule for new instructions: try to think of 3 different
levels of implementation:
(1) the aggressive implementation - reasonable hardware/microcode,
best performance
(2) the moderate implementation - what you can hope to get on first
implementation
(3) the cheap implementation - how expensive is it if you get no
hardware support, and you have to do it in microcode and/or trap to
software?

Too many ISA proposals (a) turn out not to justify the hardware of (2)
or (3) in all situations, and (b) make the cheap implementation worse
than doing nothing at all. And if the market segments that need the
cheap implementation are your growth areas...

This is the ISA designer's equivalent of "First, do no harm."
From: "Andy "Krazy" Glew" on
Anton Ertl wrote:
> "Andy \"Krazy\" Glew" <ag-news(a)patten-glew.net> writes:
>> mainframes and minicomputers. It's not clear if we ever really
>> surpassed them - while I think that the most recent members of the Intel
>> P6 family surpassed the most advanced IBM mainframe processors people
>> rumor about Poughkeepsie, I'm not sure.
>
> That probably depends on what you use for comparison.
>
> If you use SPEC CPU benchmarks, I think you passed them quite a while
> ago;

Actually, I am thinking in terms of non-quantified "microarchitecture
sophistication". E.g. is the OOO microarchitecture of P6 in 1995, or
Nhm in 2009, really a significant advance over Tomasulo, let alone FS?
Is our memory ordering implementation really more sophisticated than an
IBM implementation of Freye's rule?


>> Mike Haertel says that the big value of Atom was allowing Intel to take
>> a step back, and then get onto the Moore's Law curve again, for a few years.
>
> Intel CPU dies double the transistor count pretty regularly, so they
> seem to be on the Moore's law curve. The Atom has fewer transistors
> per die than other Intel CPUs, so it will be a little off the Moore's
> law curve.
>
> If you are thinking about the non-Moore's-law of clock rate doubling
> at every process shrink, do you think that we will see that again with
> Atom?

I am properly chastised. Several "laws" here:

a) increase in numbers of transistors

b) increase in transistor speed

c) improved power/energy per transistor/switching event

d) increase in transistors per processor

e) increase in single thread performance

f) increase in multithread / multiprocess(or) performance (throughput)