1. Introduction
This guide describes how the Zbook15 platform file example is organized. The example may not be optimal, but it can still help you build your own platform files.
2. Zbook15 platform parameters
2.1. compiler_set
These parameters define available compiler suites and associated commands and modules.
Parameter | Description | Value example |
---|---|---|
compv | Enables the benchmark to select one or more compiler versions | 0,1 |
comp_version | Name associated with the compiler suite | [gnu,<other_compiler>][$comp_v] |
cc | C/C++ compiler command | ["gfortran","<?>"][$comp_v] |
cflags | Default C/C++ compilation flags | -O2 |
fflags | Default Fortran compilation flags | -O2 |
cflags_opt | Aggressive C optimization flags | ["-O3 -march=native","?"][$comp_v] |
fflags_opt | Aggressive Fortran optimization flags | ["-O3 -march=native","?"][$comp_v] |
module_compile | Modules needed to compile | |
module_blas | Modules that provide the BLAS library | |
blas_root | BLAS root directory | <Needs a local blas installation path> |
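
Several values above have the form of a list followed by an index, such as ["-O3 -march=native","?"][$comp_v]. The sketch below illustrates one plausible reading: the selected compiler index picks one entry from each list. The helper name `select` and the variable `comp_v` are hypothetical and only serve the illustration; the way the benchmark tool actually evaluates these expressions is an assumption here.

```python
# Hypothetical helper: mimics how an indexed value such as
# ["-O3 -march=native","?"][$comp_v] could be resolved.
def select(candidates, index):
    """Return the entry of `candidates` chosen by the version index."""
    if not 0 <= index < len(candidates):
        raise IndexError(f"index {index} has no matching entry in {candidates}")
    return candidates[index]


comp_v = 0  # assumed: index of the compiler suite chosen for this run

comp_version = select(["gnu", "<other_compiler>"], comp_v)
cflags_opt = select(["-O3 -march=native", "?"], comp_v)

print(comp_version, cflags_opt)
```

Keeping every per-compiler list in the same order means a single index is enough to switch the whole suite consistently.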
2.2. mpi_set
These parameters define available MPI libraries and associated commands and modules.
Parameter | Description | Value example |
---|---|---|
mpiv | Enables the benchmark to select one or more MPI versions | 0,1 |
mpi_version | Name associated with the MPI library | [OpenMPI-1.6.5,<other_mpi>][$comp_v] |
mpi_cc | C compiler MPI wrapper command | mpicc |
mpi_cxx | C++ compiler MPI wrapper command | mpic++ |
mpi_f90 | Fortran compiler MPI wrapper command | mpif90 |
module_mpi | Module providing MPI | |
binding_full_node | Optimal binding when using one task per core | -n $tasks |
binding_half_node | Optimal binding when using half as many tasks as available cores | --mca rmaps_base_schedule_policy slot --bind-to-core -cpus-per-rank 2 -num-sockets 1 --npersocket 2 -n $tasks |
binding_hybrid | Optimal binding when using half as many tasks as available cores and two threads per task | --mca rmaps_base_schedule_policy slot --bind-to-core -cpus-per-rank 2 -num-sockets 1 --npersocket 2 -n $tasks |
binding_stream | Optimal binding when using one task per socket and filling the socket's cores with threads | --mca rmaps_base_schedule_policy slot --bind-to-core -cpus-per-rank 4 -num-sockets 1 --npersocket 1 -n $tasks |
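
The binding values are launcher arguments in which $tasks is replaced by the actual task count. As a rough sketch, the following shows how such a string could be expanded into a full launch command. The function `launch_command` and the executable `./a.out` are hypothetical; the real tool may assemble the command differently.

```python
import shlex
from string import Template

# Binding strings copied from the table above.
binding_full_node = "-n $tasks"
binding_stream = (
    "--mca rmaps_base_schedule_policy slot --bind-to-core "
    "-cpus-per-rank 4 -num-sockets 1 --npersocket 1 -n $tasks"
)


def launch_command(starter, binding, tasks, executable):
    """Expand $tasks in the binding string and build an argument list."""
    args = Template(binding).substitute(tasks=tasks)
    return [starter, *shlex.split(args), executable]


# "./a.out" is a placeholder executable, not part of the platform file.
cmd = launch_command("mpirun", binding_full_node, 4, "./a.out")
print(" ".join(cmd))
```

Returning an argument list rather than a single string makes the result directly usable with `subprocess.run` without shell quoting concerns.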
2.3. cuda_set
These parameters define available Nvidia CUDA related libraries and modules.
Parameter | Description | Value example |
---|---|---|
cuda_tlk_v | Enable the benchmark to select one or multiple CUDA toolkit versions | 0,1 |
cudnn_v | Enable the benchmark to select one or multiple CUDA Deep Neural Network (CuDNN) library versions | 0,1 |
module_mpi | Module providing CUDA toolkit | |
module_mpi | Module providing CuDNN library | |
2.4. execute_set
These parameters define batch-system-dependent settings. Because Zbook15 does not use a job scheduler, an execute_set for a Slurm cluster is also shown as an example.
2.4.1. Zbook15
Parameter | Description | Value example |
---|---|---|
submit | Command to submit jobs | bash |
submit_singleton | Command to submit singleton jobs | (See Slurm Cluster for an example) |
submit_script | Name of the batch script template | job.submit |
starter | Launcher command | mpirun |
args_starter | Optional starter arguments that may be set by the benchmark | |
2.4.2. Slurm Cluster
Parameter | Description | Value example |
---|---|---|
submit | Command to submit jobs | sbatch |
submit_singleton | Command to submit singleton jobs | sbatch --dependency=singleton |
submit_script | Name of the batch script template | job.submit |
starter | Launcher command | srun |
args_starter | Optional starter arguments that may be set by the benchmark | |
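
To make the difference between the two execute_set variants concrete, here is a minimal sketch that builds the submission command for each platform from the table values. The dictionary layout and the function `submit_command` are hypothetical; since no singleton command is given for Zbook15, the sketch treats it as undefined.

```python
import shlex

# Values taken from the two tables above; the dict layout is illustrative.
EXECUTE_SETS = {
    "zbook15": {
        "submit": "bash",
        "submit_singleton": None,  # not specified for Zbook15
        "submit_script": "job.submit",
    },
    "slurm": {
        "submit": "sbatch",
        "submit_singleton": "sbatch --dependency=singleton",
        "submit_script": "job.submit",
    },
}


def submit_command(platform, singleton=False):
    """Return the argument list used to submit the batch script."""
    params = EXECUTE_SETS[platform]
    key = "submit_singleton" if singleton else "submit"
    command = params[key]
    if command is None:
        raise ValueError(f"{key} is not defined for platform {platform!r}")
    return shlex.split(command) + [params["submit_script"]]


print(submit_command("zbook15"))
print(submit_command("slurm", singleton=True))
```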
2.5. cluster_specs
This parameter set defines platform hardware specifications.
Parameter | Description | Value example |
---|---|---|
platform_name | Name of the platform | Zbook15 |
GB_per_node | Available RAM per node (GB) | 16 |
MB_LLC_size | Size of the last-level cache (MB) | 6 |
LLC_cache_line_size | Size of an LLC cache line (B) | 64 |
NUMA_regions | Number of non-uniform memory access (NUMA) regions | 1 |
core_per_NUMA_region | Number of cores per NUMA region | 4 |
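
These specifications are mainly useful for deriving other quantities, for example when sizing a benchmark. The short sketch below computes a few of them from the example values. It assumes that 1 MB means 2^20 bytes; the derived names (`cores_per_node`, `llc_lines`, `gb_per_core`) are illustrative and not part of the platform file.

```python
# Example values from the cluster_specs table above.
cluster_specs = {
    "platform_name": "Zbook15",
    "GB_per_node": 16,
    "MB_LLC_size": 6,
    "LLC_cache_line_size": 64,
    "NUMA_regions": 1,
    "core_per_NUMA_region": 4,
}

# Total number of cores on one node.
cores_per_node = (
    cluster_specs["NUMA_regions"] * cluster_specs["core_per_NUMA_region"]
)

# Number of cache lines in the last-level cache (assuming 1 MB = 2**20 B).
llc_bytes = cluster_specs["MB_LLC_size"] * 2**20
llc_lines = llc_bytes // cluster_specs["LLC_cache_line_size"]

# Memory available per core if RAM is shared evenly.
gb_per_core = cluster_specs["GB_per_node"] / cores_per_node

print(cores_per_node, llc_lines, gb_per_core)
```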
2.6. system_parameters
These parameters define default values used to fill the job template file. These values are usually overridden by the benchmark description file.
Parameter | Description | Value example |
---|---|---|
nodes | Default number of nodes, set by benchmarks and modified by the -w option | 1 |
taskspernode | Default number of tasks per node | 4 |
threadspertask | Default number of threads per task | 1 |
tasks | Number of tasks | $nodes * $taskspernode |
OMP_NUM_THREADS | Number of OpenMP threads | $threadspertask |
executable, args_exec, mail, .. | See the job.submit template to understand these parameters | |
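
The derived entries in this table follow directly from the three defaults. As a small worked example, the sketch below resolves them first with the default values and then with values a benchmark might set instead. The function `resolve` is hypothetical and only mirrors the formulas in the table.

```python
def resolve(nodes=1, taskspernode=4, threadspertask=1):
    """Compute the derived parameters from the three basic ones."""
    return {
        "nodes": nodes,
        "taskspernode": taskspernode,
        "threadspertask": threadspertask,
        "tasks": nodes * taskspernode,      # $nodes * $taskspernode
        "OMP_NUM_THREADS": threadspertask,  # $threadspertask
    }


defaults = resolve()
# Example override: a hybrid run with 2 tasks per node and 2 threads per task.
hybrid = resolve(nodes=1, taskspernode=2, threadspertask=2)

print(defaults["tasks"], hybrid["tasks"], hybrid["OMP_NUM_THREADS"])
```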
2.7. execute_sub / jobfiles
jobfiles defines the template file for the job submission script. execute_sub describes all the substitutions needed to turn the template into a complete submission script.
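
The substitution step can be pictured as a plain search-and-replace over the template. The sketch below is only an illustration of that idea: the placeholder syntax (`#NAME#`), the template content, and the function `fill_template` are all assumptions, and the real job.submit template may look different.

```python
# Hypothetical template; the actual job.submit content is not shown here.
template = """#!/bin/bash
export OMP_NUM_THREADS=#OMP_NUM_THREADS#
#STARTER# #ARGS_STARTER# #EXECUTABLE#
"""

# Hypothetical substitution table: placeholder -> resolved parameter value.
substitutions = {
    "#OMP_NUM_THREADS#": "1",
    "#STARTER#": "mpirun",
    "#ARGS_STARTER#": "-n 4",
    "#EXECUTABLE#": "./a.out",
}


def fill_template(text, table):
    """Replace every placeholder in `text` by its value from `table`."""
    for placeholder, value in table.items():
        text = text.replace(placeholder, value)
    return text


script = fill_template(template, substitutions)
print(script)
```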