From: Del Cecchi on

<nmm1(a)cam.ac.uk> wrote in message
news:hhpvuc$h3m$1(a)smaug.linux.pwf.cam.ac.uk...
> In article <7qacstFk5U1(a)mid.individual.net>,
> Del Cecchi <delcecchi(a)gmail.com> wrote:
>>
>>Since I've argued that super-sized computers seem to me to be of
>>questionable value, maybe that's all the vast bulk of science really
>>needs. If I really need a computer with respectable bi-section
>>bandwidth, I can skip waiting for a gigantic machine that runs at 5%
>>efficiency (or worse) and learn to live with whatever I can build
>>myself.
>
> Bisection bandwidth is the Linpack measurement of networking, but
> that doesn't affect your point.
>
> I favoured (and favour) the latency and bandwidth of all-to-all,
> both because it is more realistic and because it is invariant over
> the network topology and implementation.
>
> The only real point of the specialist HPC systems nowadays is when
> the problems are limited by communication and not processing, but
> it's more a situation of how you spend your money than differences
> in architecture. There isn't any major technical problem with using
> multiple InfiniBand ports on each node, and a suitable topology; the
> software can get interesting, though.
>
> However, people who do that usually want SMALLER nodes, because that
> increases the network capacity relative to computation power. It's
> expensive to scale the former pro-rata to the latter, and is often
> not feasible with off-the-shelf components. I.e. they use single-
> socket nodes.
>
>
> Regards,
> Nick Maclaren.


You keep this up and I will have to dust off thunderbird for comp.arch
posting since it seems to get the quoting/attribution correct even for
robert's posts from google, unlike oe. This will eliminate the
concern with people putting robert's words in my mouth as above.

del



From: nmm1 on
In article <7qc2o6Fi8iU1(a)mid.individual.net>,
Del Cecchi <delcecchi(a)gmail.com> wrote:
>
>You keep this up and I will have to dust off thunderbird for comp.arch
>posting since it seems to get the quoting/attribution correct even for
>robert's posts from google, unlike oe. This will eliminate the
>concern with people putting robert's words in my mouth as above.

Oops. Sorry. I read that horribly mangled reply several times to
disentangle who said what, but evidently got it wrong. My excuse is
that I have been down with a foul cold.


Regards,
Nick Maclaren.

From: Robert Myers on
On Jan 3, 10:03 am, Mayan Moudgill <ma...(a)bestweb.net> wrote:

> When I asked for a physics problem that was being tackled on BlueGene
> that couldn't be done on a CoW/NoW, I hadn't considered the possibility
> that BlueGene itself was the physics experiment! I'm assuming (based on
> the reply) the scientists are using the spot heating due to the CPU and
> the resulting heat transfer to mirror some real world physics problem.
>
> Or did I miss the point entirely?

You did sort of miss the point. I could have answered directly by
saying that you couldn't do the bomb labs' problems on a cluster
of workstations. You can do the *kind* of problem the labs are most
interested in (principally fluid mechanics and radiative transfer) on
a Beowulf cluster, but not the size of problem. At the time the
first Blue Genes were built, you'd have needed too much real estate
(because the computational density of Blue Gene is hard to beat) and
too much power.

At the time Blue Gene was built, what LLNL wanted to do (at least to
the limits of what I know about it) stretched the limits of what was
credible. Because of advances in hardware, I don't believe that is any
longer the case. If you wanted to duplicate the capability with a
cluster of commodity hardware, the hardest requirement to meet would
be the network latency.

I assume that most who buy installations like Blue Gene would have RAS
requirements that would be hard or impossible to meet with a Beowulf
cluster. In the end, it's probably RAS that rules.

Most of the problems, including RAS, go away or are tolerable if you
are willing to do problems that are not so big.

Robert.

From: Robert Myers on
On Jan 3, 10:52 am, n...(a)cam.ac.uk wrote:
> In article <dtKdnWNf3IldLN3WnZ2dnUVZ_gGdn...(a)bestweb.net>,
> Mayan Moudgill  <ma...(a)bestweb.net> wrote:
>
>
>
> >When I asked for a physics problem that was being tackled on BlueGene
> >that couldn't be done on a CoW/NoW, I hadn't considered the possibility
> >that BlueGene itself was the physics experiment! I'm assuming (based on
> >the reply) the scientists are using the spot heating due to the CPU and
> >the resulting heat transfer to mirror some real world physics problem.
>
> The CETEP/CASTEP/ONETEP series of ab initio quantum mechanics programs
> alternate between the solution of a large set of simultaneous equations
> and a very large 3-D FFT.  The latter, like sorting, is limited by
> communication.  It is (amongst other things) used for surface chemistry
> calculations, which are of immense industrial importance.
>
> There are dozens of other examples - that's just one I happen to know
> a little about.

Had Mayan asked for examples of problems that are ill-suited to Blue
Gene, I could have answered with any problem that relies on a global
FFT, but that isn't the question he asked.
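
To put rough numbers on why a global FFT leans so hard on the network, here is a back-of-the-envelope sketch. It assumes a slab decomposition (each rank owns a contiguous block of planes) with one all-to-all transpose, complex double-precision data, and the conventional 5 N log2 N operation count. None of that comes from the thread; the function name and parameters are made up for illustration.

```python
import math


def fft3d_transpose_traffic(n, ranks, bytes_per_elem=16):
    """Estimate traffic for one all-to-all transpose of an n^3 grid.

    Assumes a slab decomposition over `ranks` processes and complex
    double-precision elements (16 bytes); both are illustrative choices.
    """
    total_elems = n ** 3
    if total_elems % (ranks * ranks) != 0:
        raise ValueError("n^3 must be divisible by ranks^2 for equal blocks")

    # Every rank exchanges one equal-sized block with every other rank.
    block_bytes = total_elems // (ranks * ranks) * bytes_per_elem
    sent_per_rank = (ranks - 1) * block_bytes

    # Ranks in one half each send a block to every rank in the other half.
    half = ranks // 2
    bisection_bytes_one_way = half * (ranks - half) * block_bytes

    # Conventional operation count for a complex FFT of this many points.
    flops = 5 * total_elems * math.log2(total_elems)

    return {
        "block_bytes": block_bytes,
        "sent_per_rank": sent_per_rank,
        "bisection_bytes_one_way": bisection_bytes_one_way,
        "flops": flops,
    }


if __name__ == "__main__":
    est = fft3d_transpose_traffic(n=1024, ranks=64)
    for key, value in est.items():
        print(f"{key}: {value:.3e}")
```

Under these assumptions, with an even number of ranks, a quarter of the whole grid crosses the bisection in each direction on every transpose, however many ranks there are. The per-pair block shrinks as the square of the rank count, so the messages get small quickly, and that is where latency starts to matter as much as bandwidth.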

Robert.

From: Robert Myers on
On Jan 3, 12:34 pm, "Del Cecchi" <delcec...(a)gmail.com> wrote:

> You keep this up and I will have to dust off thunderbird for comp.arch
> posting since it seems to get the quoting/attribution correct even for
> robert's posts from google, unlike oe.   This will eliminate the
> concern with people putting robert's words in my mouth as above.

I believe the problem is that Google Groups posts in HTML rather than
in plain text. There is a plain-text option for Gmail, but not for
Google Groups.

Robert.