From: MitchAlsup on
On Mar 15, 1:28 pm, an...(a)mips.complang.tuwien.ac.at (Anton Ertl)
wrote:
> MitchAlsup <MitchAl...(a)aol.com> writes:

>  But reading your description again, maybe you mean:
>
> | | | | | | | I/O cables (disks)
> ############# DRAM boards
> ############# (we see only the one in front)
> ---------------------- backplane
> | | | | | | | | | | | | | CPU boards
> | | | | | | | | | | | | | (edge view, we see all of them in this view)
> \ \ \ \ \ Other I/O cabling
>
> Did I get that right?  That would be 1/2m x 1/2m x 1m.

About as good as ASCII art can do. There is no motherboard; wrap the
ensemble in a metal skeleton that holds the boards and fans, routes
power, and provides for good looks.

Mitch
From: MitchAlsup on
On Mar 15, 1:41 pm, Robert Myers <rbmyers...(a)gmail.com> wrote:
> On Mar 14, 4:35 pm, "Del Cecchi" <delcec...(a)gmail.com> wrote:
>
>
>
> > Golly, how much memory do you want.
>
> The 4TB quoted is about where the interesting science would begin.

And, right now, about as much as can be packaged in one 'frame'.

I realize that overall, this idea-set is targeted towards the database
server more than the supercomputer.

Mitch
From: Robert Myers on
On Mar 15, 9:00 pm, MitchAlsup <MitchAl...(a)aol.com> wrote:
> On Mar 15, 1:41 pm, Robert Myers <rbmyers...(a)gmail.com> wrote:
>
> > On Mar 14, 4:35 pm, "Del Cecchi" <delcec...(a)gmail.com> wrote:
>
> > > Golly, how much memory do you want.
>
> > The 4TB quoted is about where the interesting science would begin.
>
> And, right now, about as much as can be packaged in one 'frame'.
>
> I realize that overall, this idea-set is targeted towards the database
> server more than the supercomputer.
>
But if all I need to do is to stream data (no addressing, just a
serial stream), I should be able to connect several frames at very
high bandwidth using optical cable, if necessary? The Fourier
transform (with complicated addressing) would only need to be done
within a frame. It's not a huge deal. Maybe only a factor of two in
scale separation, but it would be nice to have.

Robert.

From: "Andy "Krazy" Glew" on
Terje Mathisen wrote:
> Larry wrote:
>> Finally, I want to point out that good global communications can be
>> done, it is just expensive. No magic is necessary. All you need to
>> do is build a sufficiently large crossbar switch. These can have
>> modular components, and it is "just" an engineering problem. Of
>> course its costs go as N**2 in the number of nodes.
>>
>> Alternatively, you can build a fully configured fat tree, which only
>> costs NlogN. In either case, the cost of the communications is going
>> to dominate the cost of the system at larger scale. You get logN
>> latency with the fat tree, rather than constant, however.
>
> Hmmm?
>
> I would assume that when you get big enough, even a very large crossbar
> could not scale much better than sqrt(N), since that is the increase in
> wire distance?

The naive planar layout for a fat tree - a 1D line of processors, with tracks for wires - is N lg N in nodes, but is
2(N^2) in "tracks^2".

The less naive planar layout - almost like H-trees - is the recurrence A(N) = 3*A(N/2) = 9*A(N/4) = ... = 3^log2(N)*A(1) =
N^log2(3)*A(1). Better than N^2, but less regular.
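
A quick numeric check of that recurrence, as a sketch only: assuming A(1) = 1 and N a power of two (the function names here are made up for illustration), the recursion A(N) = 3*A(N/2) can be compared with the closed form N^log2(3), and with the N lg N and N^2 figures mentioned above.

```python
import math

def area_recursive(n: int) -> int:
    """Area from the recurrence A(N) = 3*A(N/2), with A(1) = 1 assumed."""
    if n < 1 or n & (n - 1):
        raise ValueError("n must be a power of two")
    return 1 if n == 1 else 3 * area_recursive(n // 2)

def area_closed_form(n: int) -> float:
    """Closed form N^log2(3), i.e. roughly N^1.585."""
    return n ** math.log2(3)

if __name__ == "__main__":
    # Columns: N, recurrence, closed form, N lg N, N^2
    for n in (2, 16, 256, 4096):
        print(n, area_recursive(n), round(area_closed_form(n)),
              round(n * math.log2(n)), n * n)
```

For any N beyond a handful of nodes the recurrence lands between N lg N and N^2, which is the "better than N^2" point.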

The large areas spent in switching nodes as the fat tree gets larger are an almost irresistible temptation to add
processing elements in the middle of the tree.
From: Del Cecchi on

"Andy "Krazy" Glew" <ag-news(a)patten-glew.net> wrote in message
news:4B9F1E96.8000205(a)patten-glew.net...
> Terje Mathisen wrote:
>> Larry wrote:
>>> Finally, I want to point out that good global communications can
>>> be
>>> done, it is just expensive. No magic is necessary. All you need
>>> to
>>> do is build a sufficiently large crossbar switch. These can have
>>> modular components, and it is "just" an engineering problem. Of
>>> course its costs go as N**2 in the number of nodes.
>>>
>>> Alternatively, you can build a fully configured fat tree, which
>>> only
>>> costs NlogN. In either case, the cost of the communications is
>>> going
>>> to dominate the cost of the system at larger scale. You get logN
>>> latency with the fat tree, rather than constant, however.
>>
>> Hmmm?
>>
>> I would assume that when you get big enough, even a very large
>> crossbar could not scale much better than sqrt(N), since that is
>> the increase in wire distance?
>
> The naive planar layout for a fat tree - a 1D line of processors,
> with tracks for wires - is N lg N in nodes, but is 2(N^2) in
> "tracks^2".
>
> The less naive planar layout - almost like H-trees - is the
> recurrence A(N) = 3*A(N/2) = 9*A(N/4) = ... = 3^log2(N)*A(1) =
> N^log2(3)*A(1). Better than N^2, but less regular.
>
> The large areas spent in switching nodes as the fat tree gets larger
> are an almost irresistible temptation, to add processing elements in
> the middle of the tree.

And I might add that "no magic, just a sufficiently large crossbar
switch..." is sleight of hand at best. Consider a network of 10,000
nodes. Is one to build a single crossbar (non-blocking, I presume)
that has 10,000 input ports and 10,000 output ports, all running at
some large bandwidth? What is it supposed to do if two of them want
to talk to the same output? Buffer things up? Is there some sort of
router table, or is this a single-stage thing so routing is implicit in
the destination? Does it do clock frequency matching?
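
To make the contention question concrete, here is a toy sketch (Python, nothing like real switch hardware) of one arbitration cycle in a single-stage crossbar: each input names its destination output, each output grants only one request, and the losers have to be buffered or retried. The function name and the lowest-port-wins policy are assumptions for illustration only.

```python
def arbitrate(requests: dict[int, int]) -> tuple[dict[int, int], list[int]]:
    """One cycle of a single-stage crossbar.

    requests maps input port -> desired output port.
    Returns (grants, blocked): grants maps output -> winning input,
    blocked lists the inputs that lost and must buffer or retry.
    Policy (assumed): the lowest-numbered input wins each output.
    """
    grants: dict[int, int] = {}
    blocked: list[int] = []
    for inp in sorted(requests):
        out = requests[inp]
        if out in grants:
            blocked.append(inp)   # output already taken this cycle
        else:
            grants[out] = inp
    return grants, blocked

if __name__ == "__main__":
    # Inputs 0 and 1 both want output 5; input 2 wants output 7.
    grants, blocked = arbitrate({0: 5, 1: 5, 2: 7})
    print(grants, blocked)
```

Even in this toy the blocked list has to live somewhere, which is exactly the buffering question; at 10,000 ports the arbitration itself is no longer a one-liner either.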

As someone once said in a different context, "the network is the
computer."

Nothing personal, Andy; yours was just a convenient post to reply to.