From: Anatoly Chernyshev on
Thanks a lot to all who responded. What is clear now is that the problem
is rather complicated. I guess I'd be better off distributing
the calculation over several PCs, with one task on each.
From: Charmed Snark on
Anatoly Chernyshev expounded in news:095dc775-e214-4ccd-bf26-45ab27b3b277
@s36g2000prh.googlegroups.com:

> Thanks a lot to all who responded. What is clear now is that the problem
> is rather complicated. I guess I'd be better off distributing
> the calculation over several PCs, with one task on each.

I was wondering about these issues in connection with
pthreads. I have a BASIC interpreter project
(https://sourceforge.net/projects/bdbbasic) where I
was considering implementing matrix operations in
multiple threads.

The net wisdom suggested that the thread startup
cost was around 100 usec as a general rule. However,
that appears to be only the beginning of important
considerations. ;-)
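
If you want to sanity-check that figure on your own box,
something like this quick timing loop (just a throwaway
sketch, nothing from bdbbasic) gives the per-thread
create/join cost:

/* Throwaway sketch: time pthread_create() + pthread_join()
   of a do-nothing thread.  Build with: cc -O2 -pthread */
#include <pthread.h>
#include <stdio.h>
#include <time.h>

static void *noop(void *arg) { return arg; }

int main(void)
{
    enum { ITERS = 1000 };
    struct timespec t0, t1;

    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (int i = 0; i < ITERS; i++) {
        pthread_t tid;
        if (pthread_create(&tid, NULL, noop, NULL) != 0)
            return 1;
        pthread_join(tid, NULL);
    }
    clock_gettime(CLOCK_MONOTONIC, &t1);

    double us = (t1.tv_sec - t0.tv_sec) * 1e6
              + (t1.tv_nsec - t0.tv_nsec) / 1e3;
    printf("create+join: %.1f usec per thread\n", us / ITERS);
    return 0;
}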

From the related post here, it seems the matrices
might need to be huge before threading pays off. But even then,
the way the cores and the cache interact might mean no
improvement in performance at all.
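
For the matrix side of it, what I had in mind was roughly
the row-sliced approach below (only a sketch; the names and
sizes are made up, nothing from bdbbasic). With ~100 usec of
startup cost per thread, N has to be fairly large before the
threaded version wins:

/* Sketch: C = A + B, rows split across NTHREADS pthreads.
   Sizes and names are illustrative only. */
#include <pthread.h>

#define N        1024
#define NTHREADS 4

static double A[N][N], B[N][N], C[N][N];

struct slice { int row0, row1; };    /* half-open row range */

static void *add_slice(void *arg)
{
    struct slice *s = arg;
    for (int i = s->row0; i < s->row1; i++)
        for (int j = 0; j < N; j++)
            C[i][j] = A[i][j] + B[i][j];
    return NULL;
}

int main(void)
{
    pthread_t tid[NTHREADS];
    struct slice s[NTHREADS];
    int chunk = N / NTHREADS;

    for (int t = 0; t < NTHREADS; t++) {
        s[t].row0 = t * chunk;
        s[t].row1 = (t == NTHREADS - 1) ? N : (t + 1) * chunk;
        pthread_create(&tid[t], NULL, add_slice, &s[t]);
    }
    for (int t = 0; t < NTHREADS; t++)
        pthread_join(tid[t], NULL);
    return 0;
}

A thread pool that is started once and reused would amortize
the create/join cost over many statements, which probably
matters more for an interpreter than the per-call split
shown here.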

Interesting stuff indeed. And when you add multiple
platforms on top of that, a project designer really gets
his hands dirty...

Warren
From: tmoran on
>Thanks a lot to all who responded. What is clear now is that the problem
>is rather complicated.
In the early days of virtual memory, one of our users (at U of
Wisconsin) decided to make use of the large memory by declaring large
arrays, which his program accessed by running down columns.
Unfortunately, the compiler allocated them by rows, so his program's
thrashing brought the whole system to a crawl. I imagine there will
have to be a similar learning process for using multicores well.
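
The same mismatch is easy to reproduce today. In C, which
lays arrays out row by row, a toy version (not his actual
program) looks like this:

/* Toy illustration: C stores a[i][j] in row-major order, so
   the column-order loop strides N*sizeof(double) bytes per
   step and thrashes the cache (or, back then, the paging
   system), while the row-order loop walks memory
   sequentially. */
#include <stdio.h>

#define N 2048
static double a[N][N];

int main(void)
{
    double sum = 0.0;

    /* "Running down columns": one long stride per element. */
    for (int j = 0; j < N; j++)
        for (int i = 0; i < N; i++)
            sum += a[i][j];

    /* Row order: sequential, prefetch- and page-friendly. */
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++)
            sum += a[i][j];

    printf("%g\n", sum);
    return 0;
}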