From: Ludovic Brenta
I can see the value in having your program discover the number of
hardware threads and starting one task on each. This value, however,
is only for quick and dirty solutions. A proper, longer-term and more
Ada-like solution takes several factors into account:

* if a single processor core provides several threads, these threads
share execution units in the core, so assigning compute-intensive
tasks to such threads is counter-productive; in contrast, assigning
tasks that spend a lot of time waiting for external events to occur is
a good idea. So, for your problem, you probably want at most one task
per core, not one task per thread.

* if hardware threads or cores share a common data cache, using too
many tasks may result in cache contention and slow-downs.

* each task needs virtual memory, possibly a lot of it, and that
memory must be backed by physical memory or you'll cause thrashing. So
the number of tasks you can run is constrained not only by the number
of hardware threads and cores but also by the amount of physical
memory you can allocate to your program.

* finally, the system administrator must have the last word on how
many tasks your program should use; they might choose to spare one
processor core for the OS or interactive use while your program runs
in the background; or they can run several programs concurrently and
only allocate a fraction of the processors available to your program.

So, having your program discover the number of threads (more usefully,
cores) in the system should provide only a hint as to how many tasks
you can start, and you should also provide at least a command-line
option for use by the administrator.
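
A minimal sketch of that advice, assuming an Ada 2012 compiler that
provides System.Multiprocessors. The names Choose_Workers, Worker and
Pool are made up for illustration, and treating the first command-line
argument as the worker count is just an assumed convention. The CPU
count is used only as a default; the administrator's value wins.

```ada
with Ada.Command_Line;
with Ada.Text_IO;
with System.Multiprocessors;

procedure Choose_Workers is
   --  Hint only: this is the number of CPUs the runtime reports, which
   --  may count hardware threads rather than cores.
   Hint : constant Positive :=
     Positive (System.Multiprocessors.Number_Of_CPUs);

   Workers : Positive := Hint;
begin
   --  Assumed convention: the first argument, if present, is the
   --  worker count chosen by the administrator.
   if Ada.Command_Line.Argument_Count >= 1 then
      begin
         Workers := Positive'Value (Ada.Command_Line.Argument (1));
      exception
         when Constraint_Error =>
            Ada.Text_IO.Put_Line
              ("Invalid worker count; falling back to the hint.");
      end;
   end if;

   Ada.Text_IO.Put_Line
     ("Starting" & Positive'Image (Workers) & " worker task(s)");

   declare
      task type Worker;

      task body Worker is
      begin
         null;  --  the compute-bound work would go here
      end Worker;

      --  One task per requested worker; all start at the "begin" below.
      Pool : array (1 .. Workers) of Worker;
   begin
      null;  --  this block does not exit until every worker terminates
   end;
end Choose_Workers;
```

Run without arguments it uses the hint; run as "choose_workers 3" it
starts three tasks regardless of what the hardware reports. Memory
limits and per-core versus per-thread counting are not handled here.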

HTH

--
Ludovic Brenta.

From: Egil Høvik
On Jul 12, 2:47 am, "Peter C. Chapin" <pcc482...(a)gmail.com> wrote:
> As we all know there is currently a lot of "buzz" about parallel
> programming and about the "crisis in software development" that is
> associated with it. People using languages with weak support for
> concurrency are now wondering how they will use multi-core systems
> effectively and reliably. Of course Ada already has decent support for
> concurrency so part of the problem is solved for those of us using Ada.
>
> But not all of the problem...
>
> I see a couple of difficulties with writing effective parallel programs
> for "ordinary" applications (that is, applications that are not
> embarrassingly parallel). One difficulty is load balancing: how can one
> decompose a problem to keep all processors reasonably busy? The other
> difficulty is scalability: how can one design a single program that can
> use 2, 4, 16, 128, or more processors effectively without knowing ahead
> of time exactly how many processors there will be? I'm not an expert in
> Ada tasking but it seems like these questions are as big a problem for
> Ada as they are for any other language environment.
>
> I'm not looking for a solution to all tasking problems here. But there
> is one feature that seems like a necessary prerequisite to such a
> solution. The language (or its standard library) needs to provide a
> portable way for the program to determine how many hardware threads are
> available.
>
> I'm about to write a simple program that decomposes into parallel,
> compute-bound tasks quite nicely. How many such tasks should I create? I
> could ask the user to provide the number as a command line argument or
> in a configuration file. Yet it seems like the program should just be
> able to figure it out. Does Ada have a standard way of doing that? I
> didn't see anything in my (admittedly short) review.
>
> Thanks!
>
> Peter


Have a look at this proposal for Ada 2012:
http://www.ada-auth.org/cgi-bin/cvsweb.cgi/ai05s/ai05-0167-1.txt?rev=1.4

--
~egilhh

From: Pascal Obry

Jeffrey,

> GNAT has function System.Task_Info.Number_Of_Processors. But something
> standard would be better.

http://www.ada-auth.org/cgi-bin/cvsweb.cgi/ai05s/ai05-0167-1.txt?rev=1.3

--

--|------------------------------------------------------
--| Pascal Obry Team-Ada Member
--| 45, rue Gabriel Peri - 78114 Magny Les Hameaux FRANCE
--|------------------------------------------------------
--| http://www.obry.net - http://v2p.fr.eu.org
--| "The best way to travel is by means of imagination"
--|
--| gpg --keyserver keys.gnupg.net --recv-key F949BD3B

From: Warren
Dmitry A. Kazakov expounded in
news:xah7sg1n8ib8$.26txy5fbnmui.dlg(a)40tude.net:

> On Sun, 11 Jul 2010 20:47:41 -0400, Peter C. Chapin wrote:
>
>> I see a couple of difficulties with writing effective parallel
>> programs for "ordinary" applications (that is, applications that are
>> not embarrassingly parallel). One difficulty is load balancing: how
....
>
> Well, maybe, but I don't think it would bring much. Especially because
> normally cores support multi-tasking. It would be more important for
> the architectures with the cores that do not (GPU etc).

I was thinking about GPUs recently. Are there any Ada efforts
aimed in that direction?

I've read about C-based interfaces (CUDA) for working with GPUs,
which naturally raised the same question for Ada.

Warren

From: Gene
On Jul 12, 3:40 am, "Dmitry A. Kazakov" <mail...(a)dmitry-kazakov.de>
wrote:
> On Sun, 11 Jul 2010 20:47:41 -0400, Peter C. Chapin wrote:
> > I see a couple of difficulties with writing effective parallel programs
> > for "ordinary" applications (that is, applications that are not
> > embarrassingly parallel). One difficulty is load balancing: how can one
> > decompose a problem to keep all processors reasonably busy? The other
> > difficulty is scalability: how can one design a single program that can
> > use 2, 4, 16, 128, or more processors effectively without knowing ahead
> > of time exactly how many processors there will be? I'm not an expert in
> > Ada tasking but it seems like these questions are as big a problem for
> > Ada as they are for any other language environment.
>
> Back in the '90s, during the era of transputers, concurrent algorithms were
> decomposed knowing in advance the number of processors and the topology of
> the network. (Unlike multi-cores, the transputers didn't share memory;
> they communicated over physically connected serial links.) At that time the
> consensus was that the problem is not solvable in general. So you designed
> up front both the algorithm and the topology.
>
> > I'm not looking for a solution to all tasking problems here. But there
> > is one feature that seems like a necessary prerequisite to such a
> > solution. The language (or its standard library) needs to provide a
> > portable way for the program to determine how many hardware threads are
> > available.
>
> Well, maybe, but I don't think it would bring much. Especially because
> normally cores support multi-tasking. It would be more important for the
> architectures with the cores that do not (GPU etc).
>
> BTW, "hardware thread" = core? processor? ALU + an independent memory
> channel? etc. It is quite difficult to define, and the algorithm's
> performance may depend heavily on such subtleties. The ARG would say: look,
> it does not make sense for all platforms, forget it.
>
> > I'm about to write a simple program that decomposes into parallel,
> > compute-bound tasks quite nicely. How many such tasks should I create?
>
> Back then it was popular to make it adaptive, i.e. you monitor the
> performance and adjust the size of the worker thread pool as you go. I
> remember some articles on this topic, but it was long, long ago...
>
> --
> Regards,
> Dmitry A. Kazakov
> http://www.dmitry-kazakov.de

Dmitry has it exactly right. When a "hardware thread" can be anything
from a fraction of a core to a node on an Ethernet-connected cluster,
how many you have is not such an important question.

Guessing high and allowing the OS to sort things out has severe
limitations. What to guess? 4? 16? 256?

When there is only a small number of independent performance control
variables, e.g. one in the case of a common thread pool, the self-
monitoring and adjustment scheme is the only one with legs.
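
For the single-variable case, one simple form of self-monitoring is
hill climbing on the pool size: keep changing the size in the same
direction while measured throughput improves, and reverse when it
drops. The sketch below assumes Ada 2012. Everything in it is
illustrative: Adapt_Pool, Adjust and Max_Size are made-up names, and
Throughput is a stand-in curve (peaking at 4 workers), not a real
measurement.

```ada
with Ada.Text_IO;

procedure Adapt_Pool is
   type Direction is (Grow, Shrink);

   Max_Size : constant Positive := 16;  --  assumed upper bound

   --  Stand-in for a real measurement (work items per second, say).
   --  This made-up curve peaks at 4 workers and degrades beyond that.
   function Throughput (Workers : Positive) return Float is
     (if Workers <= 4 then Float (Workers)
      else 4.0 - 0.5 * Float (Workers - 4));

   --  One hill-climbing step: reverse direction if the last change
   --  made things worse, then move one worker in the current direction.
   procedure Adjust
     (Size     : in out Positive;
      Dir      : in out Direction;
      Previous : Float;
      Current  : Float) is
   begin
      if Current < Previous then
         Dir := (if Dir = Grow then Shrink else Grow);
      end if;

      case Dir is
         when Grow   => Size := Positive'Min (Size + 1, Max_Size);
         when Shrink => Size := Positive'Max (Size - 1, 1);
      end case;
   end Adjust;

   Size     : Positive  := 1;
   Dir      : Direction := Grow;
   Previous : Float     := 0.0;
   Current  : Float;
begin
   for Round in 1 .. 12 loop
      Current := Throughput (Size);
      Ada.Text_IO.Put_Line
        ("Round" & Integer'Image (Round)
         & ": size" & Positive'Image (Size)
         & ", throughput " & Float'Image (Current));
      Adjust (Size, Dir, Previous, Current);
      Previous := Current;
   end loop;
end Adapt_Pool;
```

With this stand-in curve the size climbs to the peak and then hovers
around it. A real program would replace Throughput with measurements
taken over a time window and would probably damp the oscillation.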

Other multi-threading models like work stealing present a more complex
tuning problem. Evolutionary (genetic) algorithms ought to be a
powerful way for software to adapt to its own environment. Is there
anyone doing work on this?