From: nmm1 on
In article <b71929a1-57e6-4bbe-a1f1-380fa8579970(a)d8g2000yqf.googlegroups.com>,
sturlamolden <sturlamolden(a)yahoo.no> wrote:
>On 17 Jul, 03:08, gmail-unlp <ftine...(a)gmail.com> wrote:
>
>> 1) Making parallel programs while forgetting (parallel) performance
>> issues is a problem. And OpenMP helps, in some way, to forget
>> important performance details such as pipelining, memory hierarchy,
>> cache coherence, etc. However, if you remember you are parallelizing
>> to improve performance I think you will not forget performance
>> penalties and implicitly or explicitly optimize data traffic, for
>> example.
>
>We should not forget that OpenMP is often used on "multi-core
>processors". These are rather primitive parallel devices: they have,
>for example, a shared cache. Data traffic due to OpenMP can therefore
>be minimal, because a dirty cache line need not be communicated. So if
>the target is a common desktop computer with a quad-core Intel or AMD
>CPU, OpenMP can be perfectly fine - and that is the common desktop
>computer these days. So for small-scale parallelization on modern
>desktop computers, OpenMP can be very good. But on large servers with
>multiple processors, OpenMP can generate excessive data traffic and
>scale very badly.

While that is true, it is only part of the story, and misleading on
its own.  The days of a single-level cache are gone, and modern CPUs
are going multi-level even internally - let alone on a multi-socket
desktop!  Once caches are per-core, accessing the same dirty cache
line from different CPUs becomes a problem ("false sharing"), and
many codes fail to scale even on those systems for that very reason;
the sketch below illustrates the effect.
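To make the point concrete, here is a minimal sketch in C with
OpenMP (a hypothetical demonstration, not anyone's production code;
it assumes a 64-byte cache line, which is typical for x86, and that
the runtime puts the two sections on different cores; compile with
something like "gcc -O2 -fopenmp"):

    #include <stdio.h>
    #include <omp.h>

    #define ITERS 100000000L

    /* Two counters in the same cache line: every write by one
       core invalidates the line in the other core's cache. */
    struct { volatile long a, b; } same_line;

    /* Padding pushes the counters onto separate cache lines
       (assuming 64-byte lines). */
    struct { volatile long a; char pad[64]; volatile long b; }
        separate_lines;

    int main(void)
    {
        double t0, t1, t2;

        t0 = omp_get_wtime();
        #pragma omp parallel sections
        {
            #pragma omp section
            for (long i = 0; i < ITERS; i++) same_line.a++;
            #pragma omp section
            for (long i = 0; i < ITERS; i++) same_line.b++;
        }
        t1 = omp_get_wtime();

        #pragma omp parallel sections
        {
            #pragma omp section
            for (long i = 0; i < ITERS; i++) separate_lines.a++;
            #pragma omp section
            for (long i = 0; i < ITERS; i++) separate_lines.b++;
        }
        t2 = omp_get_wtime();

        printf("same cache line:      %.2f s\n", t1 - t0);
        printf("separate cache lines: %.2f s\n", t2 - t1);
        return 0;
    }

On a typical multi-core machine the first pair of loops runs
markedly slower than the second, even though no datum is ever shared
between the threads - the only difference is that the two counters
happen to sit in the same cache line.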

>P.S. It is a common misconception, particularly among computer science
>scholars, that "shared memory" means no data traffic, and that threads
>are better than processes for that matter. I.e. they can see that IPC
>has a cost, and thus conclude that threads must be more efficient and
>scale better. The lack of a native fork() on Windows has also taught
>many of them to think in terms of threads rather than processes. The
>use of MPI seems to be limited to scientists and engineers; the
>majority of computer scientists don't even know what it is.
>Concurrency to them means threads, and particularly C++ classes that
>wrap threads. Most of them expect I/O-bound programs that use threads
>to be faster on multi-core computers, and they wonder why parallel
>programming is so hard.

That is very true.
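
Since MPI came up: for readers who have not met it, MPI is a
message-passing interface in which every rank is a separate process
and all communication is explicit. A minimal sketch, assuming an
installed implementation such as MPICH or Open MPI (build with
mpicc, run with e.g. "mpirun -np 4 ./a.out"):

    #include <stdio.h>
    #include <mpi.h>

    int main(int argc, char **argv)
    {
        int rank, size, src;
        long val;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank); /* this process  */
        MPI_Comm_size(MPI_COMM_WORLD, &size); /* process count */

        if (rank == 0) {
            /* Rank 0 receives one value from every other rank;
               there is no shared memory, so every byte moved is
               visible in the source code. */
            for (src = 1; src < size; src++) {
                MPI_Recv(&val, 1, MPI_LONG, src, 0,
                         MPI_COMM_WORLD, MPI_STATUS_IGNORE);
                printf("rank 0 got %ld from rank %d\n", val, src);
            }
        } else {
            val = (long)rank * rank;  /* some per-process result */
            MPI_Send(&val, 1, MPI_LONG, 0, 0, MPI_COMM_WORLD);
        }

        MPI_Finalize();
        return 0;
    }

The point is not the boilerplate but the model: no implicit sharing,
so the programmer sees, and can reason about, every transfer.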


Regards,
Nick Maclaren.