StarPU Handbook - StarPU Basics
Data Structures
struct starpu_conf
Macros
#define STARPU_THREAD_ACTIVE
struct starpu_conf |
Structure passed to the starpu_init() function to configure StarPU. It has to be initialized with starpu_conf_init(). When the default values are used, StarPU automatically selects the number of processing units and uses the default scheduling policy. Environment variables override the equivalent parameters unless starpu_conf::precedence_over_environment_variables is set.
|
private |
Will be initialized by starpu_conf_init(). Should not be set by hand.
|
private |
Tell starpu_init() if MPI will be initialized later.
const char* starpu_conf::sched_policy_name |
Name of the scheduling policy. This can also be specified with the environment variable STARPU_SCHED. (default = NULL)
struct starpu_sched_policy* starpu_conf::sched_policy |
Definition of the scheduling policy. This field is ignored if starpu_conf::sched_policy_name is set. (default = NULL)
void(* starpu_conf::sched_policy_callback) (unsigned) |
Callback function that can later be used by the scheduler. The scheduler can retrieve this function by calling starpu_sched_ctx_get_sched_policy_callback().
int starpu_conf::precedence_over_environment_variables |
For all parameters specified in this structure that can also be set with environment variables, StarPU by default prefers the value of the environment variable over the value set in starpu_conf. Setting starpu_conf::precedence_over_environment_variables to 1 gives precedence to the value set in the structure instead.
int starpu_conf::ncpus |
Number of CPU cores that StarPU can use. This can also be specified with the environment variable STARPU_NCPU. (default = -1)
int starpu_conf::reserve_ncpus |
Number of CPU cores that StarPU should leave aside. They can then be used by application threads, by calling starpu_get_next_bindid() to get their ID, and starpu_bind_thread_on() to bind the current thread to them.
int starpu_conf::ncuda |
Number of CUDA devices that StarPU can use. This can also be specified with the environment variable STARPU_NCUDA. (default = -1)
int starpu_conf::nhip |
Number of HIP devices that StarPU can use. This can also be specified with the environment variable STARPU_NHIP. (default = -1)
int starpu_conf::nopencl |
Number of OpenCL devices that StarPU can use. This can also be specified with the environment variable STARPU_NOPENCL. (default = -1)
int starpu_conf::nmax_fpga |
Number of Maxeler FPGA devices that StarPU can use. This can also be specified with the environment variable STARPU_NMAX_FPGA. (default = -1)
int starpu_conf::nmpi_ms |
Number of MPI Master Slave devices that StarPU can use. This can also be specified with the environment variable STARPU_NMPI_MS. (default = -1)
int starpu_conf::ntcpip_ms |
Number of TCP/IP Master Slave devices that StarPU can use. This can also be specified with the environment variable STARPU_NTCPIP_MS. (default = -1)
unsigned starpu_conf::use_explicit_workers_bindid |
If this flag is set, the starpu_conf::workers_bindid array indicates where the different workers are bound, otherwise StarPU automatically selects where to bind the different workers. This can also be specified with the environment variable STARPU_WORKERS_CPUID. (default = 0)
unsigned starpu_conf::workers_bindid[STARPU_NMAXWORKERS] |
If the starpu_conf::use_explicit_workers_bindid flag is set, this array indicates where to bind the different workers. The i-th entry of the starpu_conf::workers_bindid indicates the logical identifier of the processor which should execute the i-th worker. Note that the logical ordering of the CPUs is either determined by the OS, or provided by the hwloc library in case it is available.
unsigned starpu_conf::use_explicit_workers_cuda_gpuid |
If this flag is set, the CUDA workers will be attached to the CUDA devices specified in the starpu_conf::workers_cuda_gpuid array. Otherwise, StarPU assigns the CUDA devices in a round-robin fashion. This can also be specified with the environment variable STARPU_WORKERS_CUDAID. (default = 0)
unsigned starpu_conf::workers_cuda_gpuid[STARPU_NMAXWORKERS] |
If the starpu_conf::use_explicit_workers_cuda_gpuid flag is set, this array contains the logical identifiers of the CUDA devices (as used by cudaGetDevice()).
unsigned starpu_conf::use_explicit_workers_hip_gpuid |
If this flag is set, the HIP workers will be attached to the HIP devices specified in the starpu_conf::workers_hip_gpuid array. Otherwise, StarPU assigns the HIP devices in a round-robin fashion. This can also be specified with the environment variable STARPU_WORKERS_HIPID. (default = 0)
unsigned starpu_conf::workers_hip_gpuid[STARPU_NMAXWORKERS] |
If the starpu_conf::use_explicit_workers_hip_gpuid flag is set, this array contains the logical identifiers of the HIP devices (as used by hipGetDevice()).
unsigned starpu_conf::use_explicit_workers_opencl_gpuid |
If this flag is set, the OpenCL workers will be attached to the OpenCL devices specified in the starpu_conf::workers_opencl_gpuid array. Otherwise, StarPU assigns the OpenCL devices in a round-robin fashion. This can also be specified with the environment variable STARPU_WORKERS_OPENCLID. (default = 0)
unsigned starpu_conf::workers_opencl_gpuid[STARPU_NMAXWORKERS] |
If the starpu_conf::use_explicit_workers_opencl_gpuid flag is set, this array contains the logical identifiers of the OpenCL devices to be used.
unsigned starpu_conf::use_explicit_workers_max_fpga_deviceid |
If this flag is set, the Maxeler FPGA workers will be attached to the Maxeler FPGA devices specified in the starpu_conf::workers_max_fpga_deviceid array. Otherwise, StarPU assigns the Maxeler FPGA devices in a round-robin fashion. This can also be specified with the environment variable STARPU_WORKERS_MAX_FPGAID. (default = 0)
unsigned starpu_conf::workers_max_fpga_deviceid[STARPU_NMAXWORKERS] |
If the starpu_conf::use_explicit_workers_max_fpga_deviceid flag is set, this array contains the logical identifiers of the Maxeler FPGA devices to be used.
struct starpu_max_load* starpu_conf::max_fpga_load |
Specifies the Maxeler file(s) to be loaded on Maxeler FPGAs. This is an array of starpu_max_load, the last of which shall have file set to NULL. In order to use all available devices, starpu_max_load::engine_id_pattern can be set to "*", but only for the last non-NULL entry.
If this is not set, it is assumed that the basic static SLiC interface is used.
unsigned starpu_conf::use_explicit_workers_mpi_ms_deviceid |
If this flag is set, the MPI Master Slave workers will be attached to the MPI Master Slave devices specified in the array starpu_conf::workers_mpi_ms_deviceid. Otherwise, StarPU assigns the MPI Master Slave devices in a round-robin fashion. (default = 0)
unsigned starpu_conf::workers_mpi_ms_deviceid[STARPU_NMAXWORKERS] |
If the flag starpu_conf::use_explicit_workers_mpi_ms_deviceid is set, the array contains the logical identifiers of the MPI Master Slave devices to be used.
int starpu_conf::bus_calibrate |
If this flag is set, StarPU will recalibrate the bus. If this value is equal to -1, the default value is used. This can also be specified with the environment variable STARPU_BUS_CALIBRATE. (default = 0)
int starpu_conf::calibrate |
If this flag is set, StarPU will calibrate the performance models when executing tasks. If this value is equal to -1, the default value is used. If the value is equal to 1, it will force continuing calibration. If the value is equal to 2, the existing performance models will be overwritten. This can also be specified with the environment variable STARPU_CALIBRATE. (default = 0)
int starpu_conf::data_locality_enforce |
This flag should be set to 1 to enforce data locality when choosing a worker to execute a task. This can also be specified with the environment variable STARPU_DATA_LOCALITY_ENFORCE. This can also be specified at compilation time by giving to the configure script the option --enable-data-locality-enforce. (default = 0)
int starpu_conf::single_combined_worker |
By default, StarPU executes parallel tasks concurrently. Some parallel libraries (e.g. most OpenMP implementations) however do not support concurrent calls to parallel code. In such a case, setting this flag makes StarPU only start one parallel task at a time (but other CPU and GPU tasks are not affected and can be run concurrently). The parallel task scheduler will however still try varying combined worker sizes to look for the most efficient ones. This can also be specified with the environment variable STARPU_SINGLE_COMBINED_WORKER. (default = 0)
int starpu_conf::disable_asynchronous_copy |
This flag should be set to 1 to disable asynchronous copies between CPUs and all accelerators. The AMD implementation of OpenCL is known to fail when copying data asynchronously. When using this implementation, it is therefore necessary to disable asynchronous data transfers. This can also be specified with the environment variable STARPU_DISABLE_ASYNCHRONOUS_COPY. This can also be specified at compilation time by giving to the configure script the option --disable-asynchronous-copy. (default = 0)
int starpu_conf::disable_asynchronous_cuda_copy |
This flag should be set to 1 to disable asynchronous copies between CPUs and CUDA accelerators. This can also be specified with the environment variable STARPU_DISABLE_ASYNCHRONOUS_CUDA_COPY. This can also be specified at compilation time by giving to the configure script the option --disable-asynchronous-cuda-copy. (default = 0)
int starpu_conf::disable_asynchronous_hip_copy |
This flag should be set to 1 to disable asynchronous copies between CPUs and HIP accelerators. This can also be specified with the environment variable STARPU_DISABLE_ASYNCHRONOUS_HIP_COPY. This can also be specified at compilation time by giving to the configure script the option --disable-asynchronous-hip-copy. (default = 0)
int starpu_conf::disable_asynchronous_opencl_copy |
This flag should be set to 1 to disable asynchronous copies between CPUs and OpenCL accelerators. The AMD implementation of OpenCL is known to fail when copying data asynchronously. When using this implementation, it is therefore necessary to disable asynchronous data transfers. This can also be specified with the environment variable STARPU_DISABLE_ASYNCHRONOUS_OPENCL_COPY. This can also be specified at compilation time by giving to the configure script the option --disable-asynchronous-opencl-copy. (default = 0)
int starpu_conf::disable_asynchronous_mpi_ms_copy |
This flag should be set to 1 to disable asynchronous copies between CPUs and MPI Master Slave devices. This can also be specified with the environment variable STARPU_DISABLE_ASYNCHRONOUS_MPI_MS_COPY. This can also be specified at compilation time by giving to the configure script the option --disable-asynchronous-mpi-master-slave-copy. (default = 0)
int starpu_conf::disable_asynchronous_tcpip_ms_copy |
This flag should be set to 1 to disable asynchronous copies between CPUs and TCP/IP Master Slave devices. This can also be specified with the environment variable STARPU_DISABLE_ASYNCHRONOUS_TCPIP_MS_COPY. This can also be specified at compilation time by giving to the configure script the option --disable-asynchronous-tcpip-master-slave-copy. (default = 0)
int starpu_conf::disable_asynchronous_max_fpga_copy |
This flag should be set to 1 to disable asynchronous copies between CPUs and Maxeler FPGA devices. This can also be specified with the environment variable STARPU_DISABLE_ASYNCHRONOUS_MAX_FPGA_COPY. This can also be specified at compilation time by giving to the configure script the option --disable-asynchronous-fpga-copy. (default = 0).
int starpu_conf::enable_map |
This flag should be set to 1 to enable memory mapping support between memory nodes. This can also be specified with the environment variable STARPU_ENABLE_MAP.
unsigned* starpu_conf::cuda_opengl_interoperability |
Enable CUDA/OpenGL interoperation on these CUDA devices. This can be set to an array of CUDA device identifiers for which cudaGLSetGLDevice() should be called instead of cudaSetDevice(). Its size is specified by the starpu_conf::n_cuda_opengl_interoperability field below. (default = NULL)
unsigned starpu_conf::n_cuda_opengl_interoperability |
Size of the array starpu_conf::cuda_opengl_interoperability.
struct starpu_driver* starpu_conf::not_launched_drivers |
Array of drivers that should not be launched by StarPU. The application will run them in one of its own threads. (default = NULL)
unsigned starpu_conf::n_not_launched_drivers |
The number of StarPU drivers that should not be launched by StarPU, i.e. the number of elements of the array starpu_conf::not_launched_drivers. (default = 0)
uint64_t starpu_conf::trace_buffer_size |
Specify the buffer size used for FxT tracing. Starting from FxT version 0.2.12, the buffer is automatically flushed when it fills up, but it may still be useful to specify a bigger value to avoid any flushing (which would disturb the trace).
int starpu_conf::global_sched_ctx_min_priority |
Set the minimum priority used by priority-aware schedulers. This can also be specified with the environment variable STARPU_MIN_PRIO.
int starpu_conf::global_sched_ctx_max_priority |
Set the maximum priority used by priority-aware schedulers. This can also be specified with the environment variable STARPU_MAX_PRIO.
int starpu_conf::catch_signals |
Specify if StarPU should catch SIGINT, SIGSEGV and SIGTRAP signals to make sure final actions (e.g. dumping FxT trace files) are done even though the application has crashed. By default (value = 1), signals are caught. It should be disabled on systems which already catch these signals for their own needs (e.g. JVM). This can also be specified with the environment variable STARPU_CATCH_SIGNALS.
unsigned starpu_conf::start_perf_counter_collection |
Specify whether StarPU should automatically start to collect performance counters after initialization.
unsigned starpu_conf::driver_spinning_backoff_min |
Minimum spinning backoff of drivers. (default = 1)
unsigned starpu_conf::driver_spinning_backoff_max |
Maximum spinning backoff of drivers. (default = 32)
int starpu_conf::cuda_only_fast_alloc_other_memnodes |
Specify if CUDA workers should do only fast allocations when running the datawizard progress of other memory nodes. This will pass the internal value _STARPU_DATAWIZARD_ONLY_FAST_ALLOC to the allocation method. Default value is 0, allowing CUDA workers to do slow allocations. This can also be specified with the environment variable STARPU_CUDA_ONLY_FAST_ALLOC_OTHER_MEMNODES.
#define STARPU_THREAD_ACTIVE |
Value to be passed to starpu_get_next_bindid() and starpu_bind_thread_on() when binding a thread which will consume significant CPU time, and should thus have its own dedicated CPU.
int starpu_conf_init(struct starpu_conf *conf)
Initialize the conf structure with the default values. In case some configuration parameters are already specified through environment variables, starpu_conf_init() initializes the fields of conf according to the environment variables. For instance if STARPU_CALIBRATE is set, its value is put in the field starpu_conf::calibrate of conf. Upon successful completion, this function returns 0. Otherwise, -EINVAL indicates that the argument was NULL.
int starpu_conf_noworker(struct starpu_conf *conf)
Set fields of conf so that no worker is enabled, i.e. set starpu_conf::ncpus = 0, starpu_conf::ncuda = 0, etc.
This makes it possible to portably enable only a given type of worker:
starpu_conf_noworker(&conf);
conf.ncpus = -1;
See ConfigurationAndInitialization for more details.
int starpu_init(struct starpu_conf *conf)
StarPU initialization method, must be called prior to any other StarPU call. It is possible to specify StarPU’s configuration (e.g. scheduling policy, number of cores, ...) by passing a non-NULL conf. Default configuration is used if conf is NULL. Upon successful completion, this function returns 0. Otherwise, -ENODEV indicates that no worker was available (and thus StarPU was not initialized). See Submitting A Task for more details.
int starpu_initialize(struct starpu_conf *user_conf, int *argc, char ***argv)
Similar to starpu_init(), but also takes the argc and argv as defined by the application, which is necessary when running in Simgrid mode or MPI Master Slave mode. Do not call starpu_init() and starpu_initialize() in the same program. See Submitting A Task for more details.
int starpu_is_initialized(void)
Return 1 if StarPU is already initialized. See ConfigurationAndInitialization for more details.
void starpu_wait_initialized(void)
Wait for starpu_init() call to finish. See ConfigurationAndInitialization for more details.
void starpu_shutdown(void)
StarPU termination method, must be called at the end of the application: statistics and other post-mortem debugging information are not guaranteed to be available until this method has been called. See Submitting A Task for more details.
void starpu_pause(void)
Suspend the processing of new tasks by workers. It can be used in a program where StarPU is used during only a part of the execution. Without this call, the workers continue to poll for new tasks in a tight loop, wasting CPU time. The symmetric call to starpu_resume() should be used to unfreeze the workers. See Kernel Threads Started by StarPU and PauseResume for more details.
void starpu_resume(void)
Symmetrical call to starpu_pause(), used to resume the workers polling for new tasks. This would typically be called only once all tasks have been submitted. See Kernel Threads Started by StarPU and PauseResume for more details.
int starpu_is_paused(void)
Return a nonzero value if task processing by workers is currently paused, 0 otherwise. See StarPUEatsCPUs for more details.
unsigned starpu_get_next_bindid(unsigned flags, unsigned *preferred, unsigned npreferred)
Return a PU binding ID which can be used to bind threads with starpu_bind_thread_on(). flags can be set to STARPU_THREAD_ACTIVE or 0. When npreferred is set to non-zero, preferred is an array of size npreferred in which a preference of PU binding IDs can be set. By default StarPU will return the first PU available for binding. See Kernel Threads Started by StarPU and cpuWorkers for more details.
int starpu_bind_thread_on(int cpuid, unsigned flags, const char *name)
Bind the calling thread on the given cpuid (which should have been obtained with starpu_get_next_bindid()).
Return -1 if a thread was already bound to this PU (but binding will still have been done, and a warning will have been printed), so the caller can tell the user how to avoid the issue.
name should be set to a unique string so that different calls with the same name for the same cpuid do not produce a warning.
See Kernel Threads Started by StarPU and cpuWorkers for more details.
void starpu_bind_thread_on_worker(unsigned workerid)
Bind the calling thread on the cores corresponding to the workerid. workerid can be a basic worker or a combined worker.
This can be used e.g. before initializing a library which records at initialization time the thread binding to be used when running kernels.
See Kernel Threads Started by StarPU and cpuWorkers for more details.
void starpu_bind_thread_on_main(void)
Bind the calling thread back to the core reserved for the main thread.
This can be used e.g. after initializing a library which records at initialization time the thread binding to be used when running kernels.
See Kernel Threads Started by StarPU and cpuWorkers for more details.
void starpu_bind_thread_on_cpu(int cpuid)
Bind the calling thread on the given cpuid.
This can be used e.g. after initializing a library which records at initialization time the thread binding to be used when running kernels.
See Kernel Threads Started by StarPU and cpuWorkers for more details.
int starpu_cpu_os_index(int cpuid)
Return the OS number of a given cpuid.
StarPU uses logical numbering (as defined by hwloc) all along, but in case interaction is needed with another binding tool that uses numbering as defined by the OS, one needs to convert from hwloc logical numbering to hwloc physical numbering.
void starpu_topology_print(FILE *f)
Print a description of the topology on f. See ConfigurationAndInitialization for more details.
int starpu_asynchronous_copy_disabled(void)
Return 1 if asynchronous data transfers between CPU and accelerators are disabled. See Basic for more details.
int starpu_asynchronous_cuda_copy_disabled(void)
Return 1 if asynchronous data transfers between CPU and CUDA accelerators are disabled. See cudaWorkers for more details.
int starpu_asynchronous_hip_copy_disabled(void)
Return 1 if asynchronous data transfers between CPU and HIP accelerators are disabled. See hipWorkers for more details.
int starpu_asynchronous_opencl_copy_disabled(void)
Return 1 if asynchronous data transfers between CPU and OpenCL accelerators are disabled. See openclWorkers for more details.
int starpu_asynchronous_max_fpga_copy_disabled(void)
Return 1 if asynchronous data transfers between CPU and Maxeler FPGA devices are disabled. See maxfpgaWorkers for more details.
int starpu_asynchronous_mpi_ms_copy_disabled(void)
Return 1 if asynchronous data transfers between CPU and MPI Slave devices are disabled. See mpimsWorkers for more details.
int starpu_asynchronous_tcpip_ms_copy_disabled(void)
Return 1 if asynchronous data transfers between CPU and TCP/IP Slave devices are disabled. See tcpipmsWorkers for more details.
int starpu_asynchronous_copy_disabled_for(enum starpu_node_kind kind)
Return 1 if asynchronous data transfers with a given kind of memory are disabled.
int starpu_map_enabled(void)
Return 1 if memory mapping support between memory nodes is enabled. See Basic for more details.
void starpu_display_stats(void)
Call starpu_profiling_bus_helper_display_summary() and starpu_profiling_worker_helper_display_summary(). See DataStatistics for more details.