Is it time to stop research in Computer Architecture ? [Computer Architecture]

From: Robert Myers on 21 Oct 2009 22:05

On Oct 21, 9:44 pm, Bill Todd <billt...(a)metrocast.net> wrote:
> Robert Myers wrote:
> > On Oct 21, 8:16 pm, Bill Todd <billt...(a)metrocast.net> wrote:
> >> Robert Myers wrote:
> >>> On Oct 21, 6:08 am, Bill Todd <billt...(a)metrocast.net> wrote:
> >>>> Once again that's irrelevant to the question under discussion here:
> >>>> whether Terje's statement that Merced "_would_ have been, by far, the
> >>>> fastest cpu on the planet" (i.e., in some general sense rather than for
> >>>> a small cherry-picked volume of manually-optimized code) stands up under
> >>>> any real scrutiny.
> >>> I think that Intel seriously expected that the entire universe of
> >>> software would be rewritten to suit its ISA.
> >>> As crazy as that sounds, it's the only way I can make sense of Intel's
> >>> idea that Itanium would replace x86 as a desktop chip.
> >> Did you forget that the original plan (implemented in Merced and I'm
> >> pretty sure McKinley as well) was to include x86 hardware on the chip to
> >> run existing code natively?
>
> > I never took that capability seriously. Was I supposed to?
>
> Why not? It ran x86 code natively in an integrated manner on a native
> Itanic OS. As with most things Merced the original cut wasn't
> impressive in terms of speed, but the relative sizes of the x86 and
> Itanic processors (especially given the amount of the chip area
> dedicated to cache) made it clear that full-fledged x86 cores could be
> included later if necessary as soon as the next process generations
> appeared.
>
The die area may have been available, but I don't think the watts
were. It's hard to remember with any accuracy what I knew when, but
it's pretty easy to tell at least some of what Intel knew. By the
second half of the nineties, Intel knew and briefed that power was
going to be a big problem.

If you're having a hard time competing with AMD on x86 and you're
having a hard time jamming Itanium into an acceptable power envelope,
then I don't see how an onboard x86 could have been anything more than
"well at least we can run our old code, even if slowly."

That is to say, if x86 performance had been important, AMD would have
won and Itanium never would have lifted off. If you could have idled
one processor and run just one at a time, maybe that would have
worked, but I don't think Intel's power management technology was
ready for that (even though they could probably make it work today).

Robert.

From: "Andy "Krazy" Glew" on 21 Oct 2009 22:25

Stephen Fuld wrote:
> Does the way your spreadsheet works force serial calculations? I.e. are
> almost all the cells that are to be recalculated dependent upon the
> previous one, thus forcing a serial chain of calculations. Or are there
> "multiple chains of dependent cells" that are only serial due to the
> way Excel itself is programmed? If the latter, one could enhance Open
> Office to use multiple threads for the recalcs which would take
> advantage of multiple cores for something useful.

No. Could be parallel.

But it is not really a question of parallelization. In this case "the
Excel way" does O(N^2) work, whether parallel or not, versus O(N) (or
maybe O(N log N), depending on assumptions), whether parallel or not.
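
To illustrate the kind of gap described above, here is a minimal sketch. It
assumes a running-total column as the workload; the post does not say which
formula is actually involved, so the function names and the workload are
hypothetical. The first version re-sums every cell above each row, as a
column of SUM-over-range formulas would; the second carries the previous
total forward.

```python
def running_total_recompute(values):
    """Re-sum all cells above each row (assumed 'Excel way'): ~N^2/2 additions."""
    totals, additions = [], 0
    for i in range(len(values)):
        s = 0
        for v in values[: i + 1]:
            s += v
            additions += 1
        totals.append(s)
    return totals, additions


def running_total_incremental(values):
    """Carry the previous total forward: exactly N additions."""
    totals, additions, s = [], 0, 0
    for v in values:
        s += v
        additions += 1
        totals.append(s)
    return totals, additions


if __name__ == "__main__":
    data = list(range(1, 1001))
    print(running_total_recompute(data)[1])
    print(running_total_incremental(data)[1])
```

For N = 1000 the first version performs 500500 additions and the second
1000. Spreading the first version over more cores divides the time but not
the total work.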

From: Stephen Sprunk on 21 Oct 2009 22:27

nmm1(a)cam.ac.uk wrote:
>>> There are also problems with movable type and Chinese characters (i.e.
>>> the number of them!)
>
> In article <hbikbc$vr3$1(a)news.eternal-september.org>,
> Stephen Sprunk <stephen(a)sprunk.org> wrote:
>> You wouldn't need as many different pieces of type as you'd think; most
>> of the more complicated characters are composed of various combinations
>> of simpler sub-characters.
>
> Yes, but few of them are simply separable, and making blocks for the
> sub-characters that fitted together in all the combinations needed.

Actually, a huge number of them are "simply" separable; most of the
complex characters are just two simpler characters side by side (either
50/50 or 33/67 widths), or one atop the other, in various combinations.
This is by design; one subcharacter is the "is like" and the other is
the "sounds like". For instance, the character for "official" (仕) is
formed by combining "is like person" (人) and "sounds like study" (士).
(In speech, "official" must be preceded or followed by another word to
distinguish it from other words that also "sound like study", but that's
not necessary in writing since one can _see_ the "is like" part.)

The result of this design is that one needs to recognize only 250-450
characters to be "literate" in Chinese writing, since those compose the
vast majority of the ones used and can be recognized as the parts of the
rarer, more complex characters; this is very different from alphabetic
languages where one must know tens of thousands of _words_ to be
literate. (In either case, when encountering an unknown character/word,
a literate person will decompose it into parts that _are_ known, and the
result is usually correct. Characters/words where that doesn't work
tend to fall out of use because almost nobody can understand them.)

If necessary, printers could have simply required that texts not use any
characters outside that basic set, or at least none that cannot be
decomposed easily. Everyone wins.

S

--
Stephen Sprunk "God does not play dice." --Albert Einstein
CCIE #3723 "God is an inveterate gambler, and He throws the
K5SSS dice at every possible opportunity." --Stephen Hawking

From: "Andy "Krazy" Glew" on 21 Oct 2009 22:58

Tom Knight wrote:
> "Andy \"Krazy\" Glew" <ag-news(a)patten-glew.net> writes:
>> As for the dynamic power, similar considerations: the total amount of
>> charge switched per cycle must stay below the square root curve. More
>> precisely, the total amount of charge*frequency*voltage must stay
>> below the square root curve.
>
> Resonant power
> recovery (at least of the clocks, and probably of much of the logic
> with reversible techniques) becomes easy. We will do this, but
> probably not in your phone.

Wow! Tom Knight, responding to my comp.arch post!

Tom, I almost said something about power recovery logic in my post. I
am a huge fan. But not necessarily as far out on the path towards
reversibility as your work has been. So much of your work seems (or at
least seemed, in the early days) to be oriented towards the absolute
lowest power.

Myself, I would be happy to save 30%, or 60%, via such techniques. And,
to my limited understanding of this area, which is not my specialty, the
sorts of things that I am thinking of *might* apply to my cell phone.

So, first some questions for you: what application domains do you have
in mind in your work on adiabatic, charge recovery, resonant power
recovery, logic? Space? Operating at cryogenic temperatures?
Supercomputers? Or, motes on Earth?

Also, what keeps your techniques from being used in my cell phone?

Second, I will risk embarrassment by talking about my own thoughts in
this area. Disclaimer: I am not a circuits guy. But I never let lack
of expertise prevent me thinking about a topic with common sense.

Back when I was at Illinois some 20 years ago I used to roll my eyes at
the papers on reversible and adiabatic computing. They seemed to me to
be an academic exercise, theoretical. But then I was TA'ing a VLSI
design class at Bell Labs in Naperville, and one of the students, a
physicist, asked me why we don't have chips with AC power supplies.
*That* is what caused the blinders to be removed from my eyes. (Or,
maybe, a new set of distorting lenses to be placed in front of my eyes.)

The basic idea of AC power supplies is that charge does not go all the
way around. It moves in, it moves back. For the most part.

Now, most of our VLSI transistor technologies are asymmetric. You want
P and N devices to be connected to different power rails. Doesn't have
to be so, but that's mainly what we do - and the circuit guys look down
on me for my wild ideas already, so I won't go down that path, or the
asynch path, here. In any case, the circuits are asymmetric:
transistors are connected differently to hi and lo.

So: why not have an AC power supply, and define dual circuits? Use
one circuit when power rail A is high and B is low, and the other
circuit when power rail A is low and power rail B is high.

You need devices that can switch off the undesired circuit when the
power rails are in the wrong configuration. And you need devices that
can select which of the duals to be output. These devices must work in
all power configurations.

If your "AC" power supply (actually, the signals on the at-least-2 power
supply rails) was a square wave, this would be enough. But, it won't
be. WLOG, let us imagine that the AC power supply is a sine wave. The
devices will have to work within a range of changing voltages, say when
"high" is Vmean+k*Vswing for k between 1/3 and 1, and "low" is
Vmean-k*Vswing for k between 1/3 and 1. Can this be done? (I
think so, but I am not an expert.)

You would also have to idle both of the dual circuits when the rails are
between k=1/3 and k=-1/3. But, fortunately, that is at the fastest
changing part of the cycle.

If you need continuous operation, you could use three power rails, i.e.
three phases.
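
The duty-cycle arithmetic behind the k=1/3 window can be sketched as
follows. This is only a rough check under stated assumptions: the rail
difference is modeled as an ideal unit sine wave, a circuit counts as
usable whenever the magnitude of that sine is at least k, and the
three-rail case is modeled as three sines 120 degrees apart. The function
names are hypothetical and not taken from the post.

```python
import math

K = 1.0 / 3.0  # usable-voltage threshold from the post


def active_fraction(k=K):
    """Fraction of a full cycle with sin(theta) >= k (one of the two duals)."""
    return 0.5 - math.asin(k) / math.pi


def idle_fraction(k=K):
    """Fraction of a full cycle with |sin(theta)| < k (both duals idle)."""
    return 2.0 * math.asin(k) / math.pi


def covered(theta, n_phases, k=K):
    """True if at least one phase has |sin| >= k at angle theta."""
    return any(
        abs(math.sin(theta + 2.0 * math.pi * p / n_phases)) >= k
        for p in range(n_phases)
    )


def always_covered(n_phases, k=K, steps=3600):
    """Sample one full cycle and check that some phase is always usable."""
    return all(
        covered(2.0 * math.pi * i / steps, n_phases, k) for i in range(steps)
    )


if __name__ == "__main__":
    print(f"each dual active: {active_fraction():.3f} of the cycle")
    print(f"both duals idle:  {idle_fraction():.3f} of the cycle")
    print("one phase always usable:   ", always_covered(1))
    print("three phases always usable:", always_covered(3))
```

Under these assumptions each dual is usable for about 39% of the cycle and
both are idle for about 22%, while with three phases at least one is always
above the threshold.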

Distributing the control signals to switch the "dual" (or triple, or
more...) circuits might be a pain.

But, we need not be thinking about the AC ticking away at gigahertz
frequency. I.e. it does not need to be the clock. Why could we not run
the AC power rails at a much lower frequency, and use the dual circuits
for hundreds or thousands or more clock cycles?

The biggest problem with this is that it requires dual circuits. But
even that's not totally necessary. In many places we have similar,
almost dual, circuits. If we can connect them, and switch them, we
might be able to share these nets.

OK, OK, OK. This is not my area. But I would love to understand WHY
something like this cannot work. Actually, I would love it to work even
more.

From: "Andy "Krazy" Glew" on 21 Oct 2009 23:35

jacko wrote:
> This is where the disco-fet will help a little. As the gate electrons
> are stored within a parallel channel, and do not have to be pulled in/
> out. The lower miller capacitance reduces the dynamic CMOS currents,
> and increases the switching speed, reducing the voltage crowbar effect
> in the switchover. The RC stripline charging is still an issue. Maybe
> the routing should be active inverter chains?

OK, Jacko, I give:

Can you point me (us, comp.arch) to papers on the DISCO FET?
