I've got a Gradle build process that dynamically generates a list of ~200 homogeneous, CPU-intensive processing tasks (all of the same task type). The tasks in this list do not depend on one another, but they have common ancestors and common descendants in my build.
It doesn't look like Gradle has a built-in way to let me process these in parallel. (The only parallel abstraction I see has to do with decoupled projects.)
Are there any techniques I could apply on top, or am I stuck?
For example, could I dynamically create n projects (somehow "decoupled" from the main project to satisfy Gradle's parallel processing requirements), each with 1/n of the tasks I need?
Tasks belonging to the same project cannot currently be executed in parallel. You could script settings.gradle to create many subprojects and script build.gradle to assign tasks to projects, but that might not be a very convenient solution. Another option is to have a single task that internally parallelizes the work (e.g. using the GPars library).
I'm new to Dask. What I'm trying to find is a shared array between processes that is writable by any process.
Could someone show me how to do that?
A way to implement a shared writable array in Dask
Dask's internal abstraction is a DAG: a functional graph in which tasks are assumed to act the same should you rerun them ("functionally pure"), since it's always possible that a task runs in two places, or that a worker which holds a task's output dies.
Dask does not, therefore, normally support mutable data structures as task inputs/outputs. However, you can execute tasks that create mutation as a side effect, such as any of the functions that write to disk.
If you are prepared to set up your own shared memory and pass around handles to this, there is nothing stopping you from making functions that mutate that memory. The caveats around tasks running multiple times hold, and you would be on your own. There is no mechanism currently to do this kind of thing for you, but it is something I personally intend to investigate within the next few months.
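As an illustration of that do-it-yourself route, here is a minimal sketch (this is not a Dask API; the helper name, array size, and slicing are made up) that pairs Python's multiprocessing.shared_memory with a local dask.distributed cluster. It only works while all workers live on the same machine, and the rerun caveat above still applies:

```python
import numpy as np
from multiprocessing import shared_memory
from dask.distributed import Client

def fill_slice(shm_name, start, stop):
    """Attach to the shared block by name and write a disjoint slice."""
    shm = shared_memory.SharedMemory(name=shm_name)   # attach, don't create
    arr = np.ndarray((100,), dtype=np.float64, buffer=shm.buf)
    arr[start:stop] = np.arange(start, stop)          # mutation as a side effect
    shm.close()                                       # detach; the block lives on
    return stop - start

if __name__ == "__main__":
    shm = shared_memory.SharedMemory(create=True, size=100 * 8)
    arr = np.ndarray((100,), dtype=np.float64, buffer=shm.buf)
    arr[:] = 0.0

    client = Client(processes=True)                   # local multi-process workers
    futures = [client.submit(fill_slice, shm.name, i, i + 25)
               for i in range(0, 100, 25)]
    client.gather(futures)
    print(arr[:10])                                   # written by the workers

    client.close()
    shm.close()
    shm.unlink()                                      # finally free the block
```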
Assume I'm trying to run a parallel program with 3 different tasks on a quad-core processor.
My question is: when these tasks run simultaneously, will each be computed on its own core of the processor,
or in what way are they executed simultaneously?
If you are using C# and the Task Parallel Library, then yes, they would get queued up in the thread pool and executed in parallel, but there are a few other factors that are very important to consider.
Such as:
- Is there any shared data?
- Does one process need to wait on another?
Also, the order of execution is not guaranteed.
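That last point is easy to demonstrate. The answer above is about C#, but the same behavior shows up in any task library; here is a small sketch using Python's concurrent.futures (the work function and timings are made up):

```python
import concurrent.futures as cf
import random
import time

def work(i):
    time.sleep(random.random() / 10)     # simulate uneven per-task work
    return i

if __name__ == "__main__":
    with cf.ProcessPoolExecutor() as pool:   # one worker per core by default
        futures = [pool.submit(work, i) for i in range(4)]
        for fut in cf.as_completed(futures):
            print(fut.result())              # results arrive out of order
```

The OS schedules the pool's worker processes across the available cores, so with 3 tasks on a quad-core machine they can genuinely run at the same time, but nothing guarantees which finishes first.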
For the sake of argument, I'm trying to define an algorithm that creates
a pool of tasks (each task has an individual time slice in which to operate)
and manages them without the system clock.
The problem I've encountered is that with each approach I've taken,
the tasks with the higher time slice were starved.
What I've tried to do is create a task vector that contains pairs of task/(int) "time".
I've sorted the vector with the lowest "time" first, then iterated over it and executed each task with a time of zero (0). While iterating through the entire vector, I've decreased the "time" of each task.
Is there a better approach to this kind of problem? With my approach, starvation will definitely occur.
Managing tasks may not need a system clock at all.
You only need to figure out a way to determine the priority of each task, and then run each task according to its priority.
You may want to pause a task to execute another, in which case you will need to assign a new priority to the paused task. This feature (multitasking) requires an interruption based on an event (usually clock time, but you can use any other event, like temperature, a monkey pushing a button, or another process sending a signal).
You say your problem is that tasks with the higher time-slice are starving.
As you decrease the "time" of each task on every pass, and assuming "time" never goes negative, higher time-slice tasks will eventually get to 0, just as every other task does.
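To make that concrete, here is a minimal sketch of the decrement-and-run loop described in the question (the task names and time slices are made up). Because every counter strictly decreases on each pass, a task with time slice N runs on pass N, so nothing starves:

```python
tasks = [("long", 5), ("short", 1), ("medium", 3)]

# lowest "time" first, as mutable [name, time_left] pairs
pool = sorted(([name, t] for name, t in tasks), key=lambda e: e[1])

tick = 0
while pool:
    tick += 1
    for entry in pool:
        entry[1] -= 1                        # one "virtual" time unit per pass
    for name, _ in (e for e in pool if e[1] <= 0):
        print(f"pass {tick}: running {name}")
    pool = [e for e in pool if e[1] > 0]     # drop finished tasks
```

Running this prints "short" on pass 1, "medium" on pass 3, and "long" on pass 5, so the high time-slice task is delayed but never starved.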
I would like to know how the task scheduling of the OpenMP task queue is performed.
Here I read that, by default, OpenMP imposes a breadth-first scheduler, and that they did some tests comparing FIFO vs. LIFO, but they don't say anything about the default. Since I only have a single thread creating multiple tasks (I use the single directive), I don't think it makes sense to compare their breadth-first vs. work-first scheduling.
So, is the default FIFO or LIFO? And is it possible to change it?
Thanks
I would like to know how the task scheduling of the OpenMP task queue is performed
Abstract version
Task scheduling in OpenMP is implementation-defined, even though the standard imposes some restrictions on the algorithm. Should you need to manipulate the scheduler, the place to look is the documentation of the particular OpenMP implementation you are targeting.
The long tale
The basic concept upon which all the task scheduling machinery is defined is that of the task scheduling point (see section 2.11.3 of the standard):
Whenever a thread reaches a task scheduling point, the implementation
may cause it to perform a task switch, beginning or resuming execution
of a different task bound to the current team.
In the notes below they give a broader explanation of what should be the expected behavior (emphasis mine):
Task scheduling points dynamically divide task regions into parts.
Each part is executed uninterrupted from start to end. Different parts
of the same task region are executed in the order in which they are
encountered. In the absence of task synchronization constructs, the
order in which a thread executes parts of different schedulable tasks
is unspecified.
A correct program must behave correctly and consistently with all
conceivable scheduling sequences that are compatible with the rules
above
...
The standard also specifies where task scheduling points are implied:
- the point immediately following the generation of an explicit task
- after the point of completion of a task region
- in a taskyield region
- in a taskwait region
- at the end of a taskgroup region
- in an implicit and explicit barrier region
- the point immediately following the generation of a target region
- at the beginning and end of a target data region
- in a target update region
and what a thread may do when it meets one of them:
- begin execution of a tied task bound to the current team
- resume any suspended task region, bound to the current team, to which it is tied
- begin execution of an untied task bound to the current team
- resume any suspended untied task region bound to the current team.
It says explicitly, though:
If more than one of the above choices is available, it is unspecified
as to which will be chosen.
leaving space for different conforming behaviors. It only imposes four constraints:
- An included task is executed immediately after generation of the task.
- Scheduling of new tied tasks is constrained by the set of task regions that are currently tied to the thread, and that are not suspended in a barrier region. If this set is empty, any new tied task may be scheduled. Otherwise, a new tied task may be scheduled only if it is a descendent task of every task in the set.
- A dependent task shall not be scheduled until its task dependences are fulfilled.
- When an explicit task is generated by a construct containing an if clause for which the expression evaluated to false, and the previous constraints are already met, the task is executed immediately after generation of the task.
that every scheduling algorithm must fulfill to be considered conforming.
In an embedded project, we are facing difficulties deciding which scheduling policy to use. For certain test cases to pass, we need to use SCHED_OTHER, and for some other test cases we need to use SCHED_RR. But if we set SCHED_RR for some tasks and the rest as SCHED_OTHER, all the test cases pass. Is this legal, and are there any additional side effects of using two policies in the same project?
I assume you are talking about Linux? Then yes, it is perfectly acceptable to have some tasks running with SCHED_RR and others running with SCHED_OTHER.
Note that SCHED_RR tasks will always get to run ahead of SCHED_OTHER tasks, so it is not surprising that your tests run better when you set your tasks to SCHED_RR. The thing to watch out for is that your SCHED_RR tasks might use 100% of the CPU and starve the SCHED_OTHER tasks. Maybe this is what is happening when you say some input is getting dropped.
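If it helps, here is a small sketch of switching policies from Python on Linux (os.sched_setscheduler is a thin wrapper over the system call; setting a real-time policy requires root or CAP_SYS_NICE):

```python
import os

pid = os.getpid()

# Switch this process to real-time round-robin, priority 1-99.
# Any runnable SCHED_RR thread always preempts SCHED_OTHER threads.
os.sched_setscheduler(pid, os.SCHED_RR, os.sched_param(10))
assert os.sched_getscheduler(pid) == os.SCHED_RR

# ... do the latency-critical work here ...

# Drop back to the default time-sharing policy (priority must be 0).
os.sched_setscheduler(pid, os.SCHED_OTHER, os.sched_param(0))
```

Mixing the two policies this way is exactly what the kernel expects; just keep the real-time sections short so the SCHED_OTHER tasks still get CPU time.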
Michael