|
From: Stefan Bellon on 18 Apr 2007 14:13 Hi all! Although I am quite familiar with Ada, tasking is quite new to me, so please bear with me with my question. ;-) I have a set of buckets where I do have to do some processing on them. This processing can be done in parallel for each bucket. The results of the processing however are accumulated into another data structure. At present I have task types for the processing of the buckets, an access type to the task type and basically just do: declare type Task_Access is access all Task_Type; My_Task : Task_Access; begin for I in Buckets'Range loop My_Task := new Task_Type'(Buckets (I)); end loop; end; The result data structure is a protected object with an entry Add that adds the processing result to the container and which is called by the task body. All is well so far. However the number of buckets may be quite large and I have the feeling that the context switching which is needed for say 1000 or more tasks eats up the gain of the parallelism. At least in my test cases I do not gain anything at all on a Core Duo system. Therefore I have the idea of just starting N tasks in parallel (where N can be specified by the user e.g. according to the number of CPU cores of the machine) instead of tasks for all buckets at once. Starting N tasks, then waiting for them to get all finished and only then starting the next N tasks is not difficult. But how would I do it, so that there are always N tasks running (apart of course when everything has been processed) and that a new tasks is starting on the next bucket as soon as a task on a previous bucket has finished? Any ideas are very welcome! -- Stefan Bellon
From: Randy Brukardt on 18 Apr 2007 14:40 "Stefan Bellon" <sbellon(a)sbellon.de> wrote in message news:20070418201307.18a85fd9(a)cube.tz.axivion.com... .... > Therefore I have the idea of just starting N tasks in parallel (where N > can be specified by the user e.g. according to the number of CPU cores > of the machine) instead of tasks for all buckets at once. > > Starting N tasks, then waiting for them to get all finished and only > then starting the next N tasks is not difficult. But how would I do it, > so that there are always N tasks running (apart of course when > everything has been processed) and that a new tasks is starting on the > next bucket as soon as a task on a previous bucket has finished? > > Any ideas are very welcome! I'd suggest organizing the tasks as workers and the buckets as jobs. The idea is that each task is a loop that just gets a job (a bucket in your case), processes it, sends the result, and gets another job, etc. You just need a protected data structure to serve the jobs, and an array of worker tasks, and then you can easy vary N to any value you want to try. Randy.
From: Alex R. Mosteo on 18 Apr 2007 15:50 Stefan Bellon wrote: > Hi all! > > Although I am quite familiar with Ada, tasking is quite new to me, so > please bear with me with my question. ;-) > > I have a set of buckets where I do have to do some processing on them. > This processing can be done in parallel for each bucket. The results > of the processing however are accumulated into another data structure. > > At present I have task types for the processing of the buckets, an > access type to the task type and basically just do: > > declare > type Task_Access is access all Task_Type; > My_Task : Task_Access; > begin > for I in Buckets'Range loop > My_Task := new Task_Type'(Buckets (I)); Be aware that this causes heap consumption. That is, a finished task doesn't just vanishes, it has some heap used for task context that is not automatically released. To properly free a task access you must ensure that the task is terminated (My_Task'Terminated) before doing an Unchecked_Deallocation. If you try to free an unfinished task, the call won't fail but the task context memory won't be freed... I have only GNAT experience though. > end loop; > end; > > The result data structure is a protected object with an entry Add that > adds the processing result to the container and which is called by the > task body. > > All is well so far. > > However the number of buckets may be quite large and I have the feeling > that the context switching which is needed for say 1000 or more tasks > eats up the gain of the parallelism. At least in my test cases I do not > gain anything at all on a Core Duo system. > > Therefore I have the idea of just starting N tasks in parallel (where N > can be specified by the user e.g. according to the number of CPU cores > of the machine) instead of tasks for all buckets at once. > > Starting N tasks, then waiting for them to get all finished and only > then starting the next N tasks is not difficult. But how would I do it, > so that there are always N tasks running (apart of course when > everything has been processed) and that a new tasks is starting on the > next bucket as soon as a task on a previous bucket has finished? > > Any ideas are very welcome! >
From: Jeffrey R. Carter on 18 Apr 2007 16:12 Randy Brukardt wrote: > > I'd suggest organizing the tasks as workers and the buckets as jobs. The > idea is that each task is a loop that just gets a job (a bucket in your > case), processes it, sends the result, and gets another job, etc. You just > need a protected data structure to serve the jobs, and an array of worker > tasks, and then you can easy vary N to any value you want to try. In this case, I'd say all that's needed is a protected object to supply the index into the Buckets array of the next bucket needing processing. -- Jeff Carter "Blessed is just about anyone with a vested interest in the status quo." Monty Python's Life of Brian 73
From: Leif Holmgren on 18 Apr 2007 17:08 Randy Brukardt wrote: > "Stefan Bellon" <sbellon(a)sbellon.de> wrote in message >> >>Starting N tasks, then waiting for them to get all finished and only >>then starting the next N tasks is not difficult. .... > and an array of worker > tasks, and then you can easy vary N to any value you want to try. If my memory does not fail me the advice to use an array here is double good. First of all you don't need to bother with the nitty gritty details of dynamic allocation yourself. Secondly and perhaps most important Ada will handle the synchronization of task termination for you automatically. It will not allow the array to go out of scope before all the tasks are completed. Years ago I implemented such a system (using dynamically allocated tasks) and it worked very well. By doing as suggested you will gain maximum performance even if the buckets take different time to process. /Leif
|
Next
|
Last
Pages: 1 2 3 4 Prev: SOLVED: Debian Etch GCC GNAT 4.1 / Aonix ADT Eclipse Plugin Next: Ada on Mac |