Running offload applications on Xeon Phi


It is possible to generate diagnostic output for offload applications that utilize Intel Xeon Phi coprocessors.

The Linux environment variable OFFLOAD_REPORT controls the verbosity of the diagnostic output:

a) When this variable is not set or has the value 0, no diagnostic output is produced.

b) Setting OFFLOAD_REPORT=1 produces output including the offload locations and times.

c) Setting OFFLOAD_REPORT=2, in addition, produces information regarding data traffic.
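
As a minimal sketch of how the report level might be selected before launching a program, consider the following. The binary name ./my_offload_app is hypothetical and stands in for any offload application; the launch is skipped if no such file exists.

```shell
# Report level: unset or 0 = no output, 1 = offload locations and times,
# 2 = level 1 plus information about data traffic.
export OFFLOAD_REPORT=2

# ./my_offload_app is a hypothetical binary name used for illustration only.
if [ -x ./my_offload_app ]; then
    ./my_offload_app
fi

# Show the value that the offload application inherits.
echo "OFFLOAD_REPORT=${OFFLOAD_REPORT}"
```

Unsetting the variable again (unset OFFLOAD_REPORT) returns to the default of no diagnostic output.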

Environment variables defined on the host are automatically forwarded to the coprocessor when an offload application is launched. By default, variable names are not changed on the coprocessor.

To avoid environment variable name collisions between the host and the coprocessor, set the environment variable MIC_ENV_PREFIX. When this variable is set, only environment variables with names beginning with the specified prefix are forwarded, and the prefix is stripped on the coprocessor. For example, with the prefix MIC, the host variable MIC_OMP_NUM_THREADS is seen on the coprocessor as OMP_NUM_THREADS.

Note that the variable MIC_LD_LIBRARY_PATH is an exception. It is never passed to the coprocessor with its prefix stripped, so it is not possible to change the value of LD_LIBRARY_PATH on the coprocessor using environment forwarding.
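
The forwarding rule can be imitated on the host to preview which names the coprocessor would see. This is only a local simulation with standard shell tools, not a call into the offload runtime. The variables MIC_MY_SETTING and HOST_ONLY_SETTING are hypothetical, and leaving MIC_ENV_PREFIX itself out of the listing is a choice made here for readability.

```shell
export MIC_ENV_PREFIX=MIC
export MIC_MY_SETTING=on      # hypothetical; would appear as MY_SETTING
export HOST_ONLY_SETTING=1    # hypothetical; lacks the prefix, so not forwarded

# Keep only names that start with "<prefix>_", drop MIC_ENV_PREFIX (for
# readability) and MIC_LD_LIBRARY_PATH (the documented exception), then
# strip the prefix to imitate what the coprocessor would see.
forwarded=$(env | grep "^${MIC_ENV_PREFIX}_" \
    | grep -v -e "^MIC_ENV_PREFIX=" -e "^MIC_LD_LIBRARY_PATH=" \
    | sed "s/^${MIC_ENV_PREFIX}_//")

echo "$forwarded"
```

Variables that already start with the prefix in your session will also appear in the listing.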

When an offload application is launched, the computation can be split between the processor and the coprocessor. By default, the environment variable that defines the number of threads, OMP_NUM_THREADS, has the same value on both. To run a different number of threads on the coprocessor, set the environment variables as follows:

•    export MIC_ENV_PREFIX=MIC
•    export MIC_OMP_NUM_THREADS=240
•    export OMP_NUM_THREADS=32

In this example, 240 threads run on the coprocessor and 32 threads run on the processor.

Thread affinity in OpenMP applications can be controlled at the application level by setting the environment variable KMP_AFFINITY.

The format of the variable is:

KMP_AFFINITY=[<modifier>,...]<type>[,<permute>][,<offset>]

Note that to use different affinity settings on the host and on the coprocessors with offload applications, the environment variable MIC_ENV_PREFIX may be set. For example, with MIC_ENV_PREFIX=MIC, setting KMP_AFFINITY=compact and MIC_KMP_AFFINITY=balanced results in affinity type compact on the host and balanced on the coprocessor.
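
A setup consistent with this description, reusing the MIC prefix from the thread-count example, might look like the following sketch. The affinity types are the ones named above; everything else is an assumption about how the session is configured.

```shell
# Forward only variables that start with MIC_ and strip that prefix
# on the coprocessor.
export MIC_ENV_PREFIX=MIC

# Host threads use the compact affinity type.
export KMP_AFFINITY=compact

# Forwarded to the coprocessor as KMP_AFFINITY=balanced.
export MIC_KMP_AFFINITY=balanced
```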

To learn more about thread affinity on Intel Xeon Phi, use this link:

Thread affinity on Intel Xeon-Phi