Peano
tarch::multicore::orchestration::Strategy Class Reference

Interface for any task orchestration.
#include <Strategy.h>
Data Structures

struct FuseInstruction

Public Types

enum class ExecutionPolicy { RunSerially, RunParallel }
    Provide a hint of the execution policy.

Public Member Functions

virtual ~Strategy() = default

virtual void startBSPSection(int nestedParallelismLevel) = 0
    Notifies the strategy that we enter a BSP section.

virtual void endBSPSection(int nestedParallelismLevel) = 0
    Notifies the strategy that we leave a BSP (fork-join) section.

virtual FuseInstruction fuse(int taskType) = 0
    How many tasks the system shall hold back from the tasking runtime in user-defined queues.

virtual ExecutionPolicy paralleliseForkJoinSection(int nestedParallelismLevel, int numberOfTasks, int taskType) = 0
    Determine how to handle/realise parallelisation within a fork/join region.

Static Public Attributes

static constexpr int EndOfBSPSection = -1
Detailed Description

Interface for any task orchestration.

There are multiple orchestration strategies implementing this interface and hence guiding the task execution pattern. You can create one of them and make it live by calling tarch::multicore::setOrchestration().
Definition at line 25 of file Strategy.h.
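A minimal sketch of how such a strategy could look, assuming only the interface shown above. MyOrchestration is a hypothetical name, not part of Peano; the two non-trivial members are sketched further down in the member documentation.

#include <Strategy.h>

using tarch::multicore::orchestration::Strategy;

// Hypothetical strategy; keeps no state, so the BSP notifications are empty.
class MyOrchestration: public Strategy {
  public:
    void startBSPSection(int nestedParallelismLevel) override {}
    void endBSPSection(int nestedParallelismLevel) override {}

    // Definitions sketched under fuse() and paralleliseForkJoinSection() below.
    FuseInstruction fuse(int taskType) override;
    ExecutionPolicy paralleliseForkJoinSection(
      int nestedParallelismLevel,
      int numberOfTasks,
      int taskType
    ) override;
};

// Make the strategy live. The exact ownership semantics of
// setOrchestration() are an assumption here; consult its documentation.
// tarch::multicore::setOrchestration( new MyOrchestration() );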
enum class tarch::multicore::orchestration::Strategy::ExecutionPolicy [strong]
Provide a hint of the execution policy.
Enumerator
    RunSerially
    RunParallel
Definition at line 38 of file Strategy.h.
virtual tarch::multicore::orchestration::Strategy::~Strategy() = default
virtual void tarch::multicore::orchestration::Strategy::endBSPSection(int nestedParallelismLevel) [pure virtual]
Notifies the strategy that we leave a BSP (fork-join) section.
Implemented in benchmarks::exahype2::ccz4::MulticoreOrchestration, tarch::multicore::orchestration::AllOnGPU, and tarch::multicore::orchestration::Hardcoded.
virtual FuseInstruction tarch::multicore::orchestration::Strategy::fuse(int taskType) [pure virtual]
How many tasks the system shall hold back from the tasking runtime in user-defined queues.
Tell the runtime system how many tasks to hold back: if there are more tasks than this result, the tasking system will map them onto native tasks. As long as we have fewer tasks than this number, the runtime system will store tasks in its internal queue and not pass them on. Holding tasks back gives us the opportunity to fuse tasks, and it reduces the pressure on the underlying task system. It is also an implicit prioritisation: tasks that we hold back are ready, but since we do not pass them on to the tasking runtime, they implicitly have ultra-low priority.
My data suggest that holding back tasks is a very delicate decision, as you permanently run the risk of starving threads even though work would be available. In line with the text above, I recommend to hold back tasks iff
The routine is not const, as I want to give strategies the opportunity to adapt their decisions after each call.
This routine is called once per spawned task (to decide whether we should immediately map it onto a native task), and then once at the end of each BSP section. When it is called for a particular task, we pass in the proper task type; that is, the strategy's decision may depend on the type of the task for which we ask. At the end of a BSP section, we pass in tarch::multicore::orchestration::Strategy::EndOfBSPSection instead of a particular task type.
tarch::multicore::spawnAndWait() is the routine which triggers the query at the end of a BSP section. If we have N tasks and N is bigger than the result of this routine, it maps tasks onto native tasks through internal::mapPendingTasksOntoNativeTasks().

tarch::multicore::spawnTask() is the routine which issues this query for each and every task.
Parameters
    taskType | Either the actual task type if we are asked about a particular task, or EndOfBSPSection if the query is issued at the end of a fork-join part.

Returns
    How many tasks to fuse and to which device to deploy them: a triple modelled via a FuseInstruction object.
Implemented in tarch::multicore::orchestration::AllOnGPU, and tarch::multicore::orchestration::Hardcoded.
Referenced by tarch::multicore::taskfusion::ProcessReadyTask::run().
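A hedged sketch of this calling contract, continuing the hypothetical MyOrchestration from the detailed description. The returned values are placeholders: this page does not document FuseInstruction's fields, and the value-initialisation below assumes the struct is default-constructible (consult Strategy.h).

using tarch::multicore::orchestration::Strategy;

// Called once per spawned task with a concrete task type, and once at the
// end of each BSP section with the EndOfBSPSection sentinel.
Strategy::FuseInstruction MyOrchestration::fuse(int taskType) {
  if (taskType == EndOfBSPSection) {
    // End of the fork-join part: a strategy would typically flush here,
    // i.e. hold nothing back.
    return FuseInstruction{};  // placeholder: fields not documented here
  }
  // Per-task query: the decision may depend on taskType.
  return FuseInstruction{};    // placeholder: task-type-specific choice
}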
virtual ExecutionPolicy tarch::multicore::orchestration::Strategy::paralleliseForkJoinSection(int nestedParallelismLevel, int numberOfTasks, int taskType) [pure virtual]
Determine how to handle/realise parallelisation within a fork/join region.
Peano models its execution with multiple parallel, nested fork/join sections. You could also think of these as mini-BSP sections. This routine guides how the orchestration maps those BSP sections onto tasks.

The decision can be guided by basically arbitrary contextual factors. The most important one for me is the nesting level. As we work mainly with OpenMP, where tasks are tied to one core, it makes limited sense to have nested parallel fors; notably, it makes things slower. So usually, I return ExecutionPolicy::RunSerially for anything with a nesting level greater than 1.
Parameters
    nestedParallelismLevel | Compare with tarch::multicore::spawnAndWait(), which ensures that this flag equals 1 on the top level. A value of 0 would mean that no fork/join region has been opened; for such a value, the code would not query this function.
    taskType | When we enter a fork-join section, that section logically spawns a set of tasks which are all of the same type, so the task type here is given implicitly by the code location. Each BSP section has a unique identifier.
Implemented in benchmarks::exahype2::ccz4::MulticoreOrchestration, tarch::multicore::orchestration::AllOnGPU, tarch::multicore::orchestration::Hardcoded, and tarch::multicore::orchestration::GeneticOptimisation.
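Following the recommendation above, the hypothetical MyOrchestration from the detailed description could realise this as sketched here:

using tarch::multicore::orchestration::Strategy;

// Run only the top-level fork/join section in parallel: with OpenMP's
// tied tasks, nested parallel fors tend to slow things down.
Strategy::ExecutionPolicy MyOrchestration::paralleliseForkJoinSection(
  int nestedParallelismLevel,
  int numberOfTasks,
  int taskType
) {
  return nestedParallelismLevel > 1
    ? ExecutionPolicy::RunSerially
    : ExecutionPolicy::RunParallel;
}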
virtual void tarch::multicore::orchestration::Strategy::startBSPSection(int nestedParallelismLevel) [pure virtual]
Notifies the strategy that we enter a BSP section.
Implemented in benchmarks::exahype2::ccz4::MulticoreOrchestration, tarch::multicore::orchestration::AllOnGPU, and tarch::multicore::orchestration::Hardcoded.
static constexpr int tarch::multicore::orchestration::Strategy::EndOfBSPSection = -1
Definition at line 70 of file Strategy.h.