From: Terje Mathisen on 21 Oct 2009 09:40
> I caught the first 5 or 10 minutes of some physics character on tv a few
> days ago. The sort of pop science that the beeb do so well, with the
> dramatic music, graphics and odd angle shots of the presenter trying to
> look wise and intelligent. His thesis was that we would soon have
> intelligence in everything, with billions of microprocessors. My first
> reaction was who will write all the code for this ?, as there's already
> a shortage of embedded programmers who really know what they are doing.
This one is just as obvious today as it was (at least to SF writers and
readers) 30 years ago:
Computers will write the code of future computers, we just need to pass
the singularity where AI actually starts to match human intelligence.
From that point on things will get very interesting, possibly in the
Chinese sense. :-)
I still believe it is possible we will reach that singularity within my
lifetime; we're not missing that many orders of magnitude now.
- <Terje.Mathisen at tmsw.no>
"almost all programming can be viewed as an exercise in caching"
From: "Andy "Krazy" Glew" on 21 Oct 2009 09:42
Andrew Reilly wrote:
> On Mon, 19 Oct 2009 20:40:39 -0700, Andy "Krazy" Glew wrote:
>> Andrew Reilly wrote:
>>> Isn't it the case, though, that for most of that "popular software"
>>> speed is a non-issue?
>> I've been manipulating large Excel spreadsheets.
> well, there's your problem ;-)
> I've never got the hang of spreadsheets, and never found a problem that
> didn't look like more of a job for awk or matlab, or even a real program
> of some sort. I guess that there must be some (or at least users who
> think differently than I): it's certainly popular.
>> Minutes-long recalcs.
>> I'm reasonably sure it's computation, and not disk.
>> Algorithms trump hardware, nearly every time.
Possibly interesting proto-thought:
I'm using Excel because the team I'm working with uses Excel. And
because there are some useful features, usually user-interfacey, that
are hard to get access to via other means. But mainly because of
putative "ease of use".
Observation: there are situations where the "Excel way" leads to
sub-optimal algorithms. O(N^2) instead of O(N), in two examples I have
To avoid such problems one must leave Excel and resort to VBA or some
other programming language. I'm willing to do that, but others may not be.
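
As an illustration of the O(N^2)-versus-O(N) point (not necessarily one of
the two examples referred to above), consider a running-total column. A
minimal sketch, assuming the spreadsheet idiom of filling each row with a
sum over an ever-growing range, compared with reusing the previous row's
result. The function names are made up for this example.

```python
from itertools import accumulate


def running_total_quadratic(values):
    """Each row re-sums everything from the top, like =SUM($A$1:A_i) filled down.

    Row i touches i+1 cells, so the whole column costs O(N^2) additions.
    """
    return [sum(values[: i + 1]) for i in range(len(values))]


def running_total_linear(values):
    """Each row adds one new cell to the previous total: O(N) overall."""
    return list(accumulate(values))


if __name__ == "__main__":
    values = [3, 1, 4, 1, 5, 9, 2, 6]
    # Both approaches give the same column; only the amount of work differs.
    assert running_total_quadratic(values) == running_total_linear(values)
    print(running_total_linear(values))  # [3, 4, 8, 9, 14, 23, 25, 31]
```

For a few hundred rows the difference is invisible; at 100,000 rows the
quadratic form does on the order of 5 billion additions against 100,000.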
I wonder if other "ease of use" facilities similarly lead to situations
of suboptimal algorithms.
Because of the sub-optimal algorithms that Excel encourages, people are
encouraged NOT to increase problem size, since O(N^2) bites them more
So we get a subculture of Excel users, forced to deal with smaller
problem sizes because of scale-out issues. Versus a subculture of
programmers, who can deal with larger problems, but who don't have
the "ease of use" of Excel for smaller problems.
Who has the competitive advantage?
Probably the hybrid subculture that has a small number of hackers
provide outside-of-Excel scalability to the Excel users.
From: "Andy "Krazy" Glew" on 21 Oct 2009 09:56
Robert Myers wrote:
> On Oct 21, 2:21 am, "Andy "Krazy" Glew" <ag-n...(a)patten-glew.net>
>> All of the modern OOO machines are dynamic dataflow machines in their
>> hearts. ...
>> I look forward to slowly, incrementally, increasing the scope of the
>> dataflow in OOO machines....
>> My vision is of static dataflow nodes being instantiated several times
>> as dynamic dataflow.
My vision requires more transistors. I.e. my vision can take advantage
of more transistors. But this means more power (a) only if they are
switching, and (b) only if they are leaking.
We know how to make non-leaky devices. So long as they are the same
size, just slower, and so long as we are in the exponential part of the
leakage curve - 10% slower => 10X less leakage - we can take advantage
of them for big machines.
At some point we run into the problem that less leaky devices are
bigger. But, again, it's the shape of the curve that matters.
Since I postulate the square law tradeoff - performance varies as the
square root of the number of devices - so long as the leakage is below
the square root line, we can add extra transistors to increase
performance without running into the leakage based power wall.
Plus, I must admit that I have some hope that there are "miraculous
technologies" that have much better leakage. I have encountered some in
the literature, but I must admit that I do not know how practical they
are. Better folks than me say they are.
As for the dynamic power, similar considerations: the total amount of
charge switched per cycle must stay below the square root curve. More
precisely, the total amount of charge*frequency*voltage must stay below
the square root curve.
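
The two budget arguments above can be sketched numerically. This is one
reading of "below the square root line", namely that power should grow no
faster than the square-root performance gain; that reading, the function
names, and the convention of measuring everything relative to a baseline
design (baseline = 1.0) are assumptions of this sketch, not something the
post spells out.

```python
import math


def performance(devices):
    """Square-law assumption from the post: performance ~ sqrt(device count)."""
    return math.sqrt(devices)


def leakage_power(devices, leak_per_device):
    """Total leakage: device count times per-device leakage (both relative)."""
    return devices * leak_per_device


def dynamic_power(charge_per_cycle, frequency, voltage):
    """Dynamic power as charge switched per cycle * frequency * voltage."""
    return charge_per_cycle * frequency * voltage


def under_sqrt_line(power, devices):
    """True if relative power has grown no faster than sqrt(devices)."""
    return power <= math.sqrt(devices)


if __name__ == "__main__":
    print(performance(4.0))  # 2.0
    # 4x the devices, each leaking 10x less than the baseline device.
    print(under_sqrt_line(leakage_power(4.0, 0.1), 4.0))  # True
    # 4x the devices at unchanged per-device leakage overshoots the budget.
    print(under_sqrt_line(leakage_power(4.0, 1.0), 4.0))  # False
    # 4x the switched charge at half the frequency, same voltage: exactly on the line.
    print(under_sqrt_line(dynamic_power(4.0, 0.5, 1.0), 4.0))  # True
```

The sketch ignores the performance lost by using slower devices; it only
shows the shape of the bookkeeping.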
This leads us into a place where we want to take advantage of a large
number of slow devices. There are two ways to do this: (a)
everything slow - lots of simple, slow, MIMD cores, and (b) a very small
amount of fast logic in a sea of slow logic. Possibly centralized; more
likely distributed like raisins in rice pudding. In the latter case we
are at MIMD again: the only difference is whether the core is all slow,
or has some fast transistors.
From: nmm1 on 21 Oct 2009 10:07
In article <O6idnYddX77NkkLXnZ2dnUVZ8sGdnZ2d(a)lyse.net>,
Terje Mathisen <Terje.Mathisen(a)tmsw.no> wrote:
>This one is just as obvious today as it was (at least to SF writers and
>readers) 30 years ago:
>Computers will write the code of future computers, we just need to pass
>the singularity where AI actually starts to match human intelligence.
>From that point on things will get very interesting, possibly in the
>chinese sense. :-)
>I still believe it is possible we will reach that singularity within my
>lifetime, we're not missing that many orders of magnitude now.
Oh, Terje, you are slipping!
Firstly, that's not a singularity, as there is no reason to believe
in infinite expansion in a strictly finite time. And Vinge should
know better, too :-(
Secondly, the problem isn't capacity, but capability. We have had
both programs that write programs and self-adapting programs for
several decades. But we don't know how to design them so that they
can write programs more capable (roughly, 'more complex') than
themselves, or even AS capable but different. All we know is how
to design them to write simpler, but larger, programs.
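
A small sketch of what "simpler, but larger" looks like in practice: a
program that writes a program. The generator below is a hypothetical
example (the names generate_unrolled_dot and dot_N are invented here); it
emits a straight-line dot product for a fixed length, which is longer than
the generator's own loop yet less general than it.

```python
def generate_unrolled_dot(n):
    """Return Python source for a dot product specialised to vectors of length n.

    The emitted function has no loop: it is larger than a generic version
    but can only handle the one length it was generated for.
    """
    terms = " + ".join(f"a[{i}] * b[{i}]" for i in range(n)) or "0"
    return f"def dot_{n}(a, b):\n    return {terms}\n"


if __name__ == "__main__":
    source = generate_unrolled_dot(4)
    print(source)

    # Compile the generated source and call the resulting function.
    namespace = {}
    exec(source, namespace)
    print(namespace["dot_4"]([1, 2, 3, 4], [5, 6, 7, 8]))  # 70
```

The generated code never exceeds what the generator's author already knew
how to write, which is the limitation described above.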
From: "Andy "Krazy" Glew" on 21 Oct 2009 10:13
eternal september wrote:
> Hello all,
> "Andy "Krazy" Glew" <ag-news(a)patten-glew.net> wrote in message
>> I look forward to slowly, incrementally, increasing the scope of the
>> dataflow in OOO machines.
>> * Probably the next step is to make the window bigger, by
>> multilevel techniques.
> What is your favorite multilevel technique. I don't think I ever heard
> your opinion on HSW (Hierarchical Scheduling Windows
I'll reread the HSW papers and get back to comp.arch.
In the meantime, you can get an idea of my favorite, dating back to
2004, when I was briefly "free", between leaving AMD and rejoining
Intel. Look for patents by me, that are not assigned to AMD or Intel.
Here, from my CV:
June 2004-August 30, 2004: Inventor, Oceanside, Oregon
The first time in years that my inventions have been owned by me, not my
employers, Intel or AMD
Formulated MultiStar, a multicluster/multithreaded microarchitecture
with multilevel everything: multilevel branch predictor, scheduler,
instruction window, store buffer, register file
Worked around and invented alternatives to proprietary (patented or
trade secret) technology
I.e. I violated no Intel or AMD I.P. This was all new stuff.
Invented solutions to several problems I had been trying to solve for
Even now (in 2009) it is still leading edge.
Patents applied for, with Centaurus Data LLC, include:
20080133893 HIERARCHICAL REGISTER FILE
20080133889 HIERARCHICAL INSTRUCTION SCHEDULER
20080133885 HIERARCHICAL MULTI-THREADING PROCESSOR
20080133883 HIERARCHICAL STORE BUFFER
20080133868 METHOD AND APPARATUS FOR SEGMENTED SEQUENTIAL STORAGE
Some of these patents (in application) have already been licensed by
Fortune 500 companies.
Did not make much progress on my book, apart from 70 pages of draft
I called this Multi-Star - a processor with Multi-level everything.
Of course I had to avoid Intel or AMD I.P. in Multi-Star, so I am not
always using the best known way to do things.
There are probably a few more patent applications and continuations. I
was obviously not able to work on this outside-Intel stuff when I was an
Intel employee. And Intel forbade me from working on OOO CPUs inside
Intel when I reminded them I had patents pending (which I had disclosed
to them when I was hired). I left Intel in part because I want to
continue work in these directions.
However, not so much free time at new job. At least now I am able to
talk about them on comp.arch. And I hope to summarize the key points
for comp.arch. Here's a hint: at least one of the ideas/inventions
pertaining to hierarchical instruction scheduling is, IMHO, one of the
best I have ever had.
Note: I'm not just an OOO bigot. I also have neat ideas about parallel
systems, MP, MIMD, Coherent Threading. But I am probably the most
aggressive OOO computer architect on Earth.
This is why I get frustrated when people say "OOO CPU design has run
into a wall, and can't improve: look at Intel and AMD's latest designs".
I know how to improve things - but I was not allowed to work on it at
Intel for the last 5 years. Now I can - in my copious free time.