From: Nomen Nescio on
I worked in the geophysical processing industry. Pretty much
everyone there now uses Linux clusters of 128-1000 nodes
attached to a giant SAN. While there are a few algorithms
that need brute-force gigaflops, much of the work is stifled
by the data shuffling.
Now I can see a change coming. I know a guy who is going
into business for himself. He has built (with little help)
a box under his desk with 4 Magny-Cours processors, 128 GB
of RAM and 28 TB of RAID (plus a 256 GB SSD). This would be
sufficient to run small projects. He claimed it has faster
throughput than the industry-standard "Mainframes", as he
snidely calls them.
Do you think this could catch on?

From: fatalist on
On Aug 8, 10:49 pm, Nomen Nescio <nob...(a)dizum.com> wrote:
> I worked in the geophysical processing industry. Pretty much
> everyone there now uses Linux clusters of 128-1000 nodes
> attached to a giant SAN. While there are a few algorithms
> that need brute-force gigaflops, much of the work is stifled
> by the data shuffling.
> Now I can see a change coming. I know a guy who is going
> into business for himself. He has built (with little help)
> a box under his desk with 4 Magny-Cours processors, 128 GB
> of RAM and 28 TB of RAID (plus a 256 GB SSD). This would be
> sufficient to run small projects. He claimed it has faster
> throughput than the industry-standard "Mainframes", as he
> snidely calls them.
> Do you think this could catch on?

Linux clusters as they used to be (a bunch of connected machines
WITHOUT shared memory, running MPI to communicate with each other)
will die out.
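
Roughly what that model looks like in code, as a sketch only (the
sizes, tags and variable names below are invented for illustration,
not anyone's production setup): each rank owns a private slab of
the data, and anything a neighbour needs has to be shipped
explicitly over the wire.

/* Hypothetical ring exchange: every node holds a private slab and
   trades a boundary region with its neighbours via explicit
   messages. Host-side code of a hybrid MPI+CUDA program. */
#include <mpi.h>
#include <vector>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int rank, nprocs;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

    const int N    = 1 << 20;           /* samples per node (made up)     */
    const int HALO = 1024;              /* boundary samples to exchange   */
    std::vector<float> slab(N, 0.0f);   /* this node's share of the data  */
    std::vector<float> halo(HALO);      /* boundary region from neighbour */

    int left  = (rank - 1 + nprocs) % nprocs;
    int right = (rank + 1) % nprocs;

    /* This copy over the wire is the "data shuffling" the original
       poster complains about. */
    MPI_Sendrecv(slab.data(), HALO, MPI_FLOAT, right, 0,
                 halo.data(), HALO, MPI_FLOAT, left,  0,
                 MPI_COMM_WORLD, MPI_STATUS_IGNORE);

    /* ... local processing of slab + halo would go here ... */

    MPI_Finalize();
    return 0;
}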

The GPUs from NVIDIA (and ATI/AMD) are quickly becoming a standard in
HPC.

They can beat any multicore processor hands down on tasks that can be
efficiently parallelized.

http://www.hpcprojects.com/features/feature.php?feature_id=265
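
As a concrete (and deliberately trivial) illustration of the kind of
task that maps well to a GPU, here is a CUDA sketch in which
thousands of threads each scale-and-add one sample independently. It
is not taken from the article above; the sizes and numbers are
arbitrary.

#include <cuda_runtime.h>
#include <vector>
#include <cstdio>

/* Each thread handles exactly one sample: y[i] = a*x[i] + y[i]. */
__global__ void saxpy(int n, float a, const float *x, float *y)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        y[i] = a * x[i] + y[i];
}

int main()
{
    const int n = 1 << 24;                 /* ~16M samples, arbitrary */
    const size_t bytes = n * sizeof(float);
    std::vector<float> hx(n, 1.0f), hy(n, 2.0f);

    float *dx, *dy;
    cudaMalloc(&dx, bytes);
    cudaMalloc(&dy, bytes);
    cudaMemcpy(dx, hx.data(), bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(dy, hy.data(), bytes, cudaMemcpyHostToDevice);

    const int threads = 256;
    saxpy<<<(n + threads - 1) / threads, threads>>>(n, 3.0f, dx, dy);

    cudaMemcpy(hy.data(), dy, bytes, cudaMemcpyDeviceToHost);
    printf("y[0] = %f\n", hy[0]);          /* expect 5.0 */

    cudaFree(dx);
    cudaFree(dy);
    return 0;
}

The catch, of course, is the one the original poster raises: the
samples still have to be copied onto the card and back, so the data
shuffling does not go away.
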
From: Vladimir Vassilevsky on


Nomen Nescio wrote:
> I worked in the geophysical processing industry. Pretty much
> everyone there now uses Linux clusters of 128-1000 nodes
> attached to a giant SAN. While there are a few algorithms
> that need brute-force gigaflops, much of the work is stifled
> by the data shuffling.
> Now I can see a change coming. I know a guy who is going
> into business for himself. He has built (with little help)
> a box under his desk with 4 Magny-Cours processors, 128 GB
> of RAM and 28 TB of RAID (plus a 256 GB SSD). This would be
> sufficient to run small projects. He claimed it has faster
> throughput than the industry-standard "Mainframes", as he
> snidely calls them.
> Do you think this could catch on?

The size and throughput of mainframes are no big deal for the
geophysical industry.

Vladimir Vassilevsky
DSP and Mixed Signal Design Consultant
http://www.abvolt.com
From: Fred Marshall on
Nomen Nescio wrote:
> I worked in the geophysical processing industry. Pretty much
> everyone there now uses Linux clusters of 128-1000 nodes
> attached to a giant SAN. While there are a few algorithms
> that need brute-force gigaflops, much of the work is stifled
> by the data shuffling.
> Now I can see a change coming. I know a guy who is going
> into business for himself. He has built (with little help)
> a box under his desk with 4 Magny-Cours processors, 128 GB
> of RAM and 28 TB of RAID (plus a 256 GB SSD). This would be
> sufficient to run small projects. He claimed it has faster
> throughput than the industry-standard "Mainframes", as he
> snidely calls them.
> Do you think this could catch on?
>

As one who worked in highly parallel computing some years ago (more
DSP than HPC, although they overlapped), I'd say the issue was then,
and apparently still is now: "How do you program these things?"

It seems obvious that the hardware will continue (along some curve) to
improve and that the price of a GOP will fall. That's only sorta
interesting, and the trend toward doing more for less $$ will continue.
[stifling a small yawn ... is that SASY?]

A very recent article in one of the IEEE or ACM journals was about how
tough it is to get parallel systems to yield performance - and this
was tied to the multicore chips and systems we're seeing today. Sure,
some very focused jobs lend themselves to parallel processing, but
many more seem elusive. I believe that geophysical processing lends
itself reasonably well, but I'm no expert in exactly what/how they do
what they do.

The Portland Group has been around for a long time working on the
software technology for these things. I just read from their website:
"A CUDA programmer is required to partition the program into coarse
grain blocks that can be executed in parallel."
Some related folks (in the late '80s) talked about optimizing compilers
that would target heterogeneous parallel machines based on a "partition
spec". Sounds like the quote above, eh?
The idea was to write the program, specify the partitioning and then
manually iterate the partitioning to remove bottlenecks.
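
To make that quote concrete, here is a rough CUDA sketch (mine, not
PGI's) of what "coarse grain blocks" ends up meaning: the programmer
decides that each thread block owns a fixed-size tile of the input,
and that tile size is exactly the knob one iterates on by hand.

#include <cuda_runtime.h>

#define BLOCK 256   /* tile owned by one thread block: a hand-chosen partition */

/* Each block stages its tile in on-chip shared memory and reduces it
   to a single partial sum; one result per block is written back. */
__global__ void tile_sum(const float *in, float *block_sums, int n)
{
    __shared__ float tile[BLOCK];
    int gid = blockIdx.x * BLOCK + threadIdx.x;
    tile[threadIdx.x] = (gid < n) ? in[gid] : 0.0f;
    __syncthreads();

    for (int stride = BLOCK / 2; stride > 0; stride /= 2) {
        if (threadIdx.x < stride)
            tile[threadIdx.x] += tile[threadIdx.x + stride];
        __syncthreads();
    }
    if (threadIdx.x == 0)
        block_sums[blockIdx.x] = tile[0];
}

/* Launched as (d_in and d_partial being device buffers):
     tile_sum<<<(n + BLOCK - 1) / BLOCK, BLOCK>>>(d_in, d_partial, n);
   Change BLOCK and the partitioning (and usually the performance)
   changes -- the "manually iterate the partitioning" loop in a
   nutshell. */
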
I don't sense that things have improved all that much in the last 20
years. Well, compute power has certainly increased but our underlying
technical ability to make good/general use of parallel machines is still
at issue.

Fred


From: glen herrmannsfeldt on
In comp.dsp Fred Marshall <fmarshall_xremove_the_xs(a)xacm.org> wrote:
(snip)

> A very recent article in one of the IEEE or ACM journals was about how
> tough it is to get parallel systems to yield performance - and this
> was tied to the multicore chips and systems we're seeing today. Sure,
> some very focused jobs lend themselves to parallel processing, but
> many more seem elusive. I believe that geophysical processing lends
> itself reasonably well, but I'm no expert in exactly what/how they do
> what they do.

> The Portland Group has been around for a long time working on the
> software technology for these things. I just read from their website:
> "A CUDA programmer is required to partition the program into coarse
> grain blocks that can be executed in parallel."

There are languages designed around the idea of partitioning,
such that the programmer doesn't have to follow the details
of the partitions. They tend to look a little different from
the popular serial languages.
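
For illustration only (this is a library, not one of the languages
glen has in mind): with Thrust, on top of CUDA C++, the programmer
states the whole-array operation and the grid/block decomposition is
the library's problem rather than the programmer's.

#include <thrust/device_vector.h>
#include <thrust/transform.h>
#include <thrust/functional.h>

int main()
{
    const int n = 1 << 20;
    thrust::device_vector<float> x(n, 1.0f);
    thrust::device_vector<float> y(n, 2.0f);

    /* y = x + y elementwise; no blockDim, no gridDim, no explicit
       partitioning anywhere in the user's code. */
    thrust::transform(x.begin(), x.end(), y.begin(), y.begin(),
                      thrust::plus<float>());
    return 0;
}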

> Some related folks (in the late '80s) talked about optimizing compilers
> that would target heterogeneous parallel machines based on a "partition
> spec". Sounds like the quote above, eh?
> The idea was to write the program, specify the partitioning and then
> manually iterate the partitioning to remove bottlenecks.

That is always the problem.

> I don't sense that things have improved all that much in the last 20
> years. Well, compute power has certainly increased but our underlying
> technical ability to make good/general use of parallel machines is still
> at issue.

-- glen