Programming Models for Intel Xeon Phi applications
Native Model
Intel Xeon Phi coprocessors run a Linux operating system and
support traditional Linux services, including SSH. This allows the user
to run applications directly on an Intel Xeon Phi
coprocessor by compiling a native executable for the MIC architecture
and transferring it to the coprocessor’s virtual filesystem. This
approach is called native execution, and the resulting program is a
native application.
Native code for the MIC architecture runs directly on the coprocessor without the involvement of the host.
In native mode, an application is compiled on the host (Xeon) using the compiler switch -mmic, which generates code for the MIC architecture. The binary then has to be copied to the coprocessor and run there.
* MIC - Many
Integrated Core architecture, used interchangeably with the terms
“coprocessor”, “device” and “target” to indicate the Intel Xeon Phi
coprocessor, as opposed to the Intel Xeon processor.
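The native workflow described above can be sketched with a small C++ source file. This is a minimal sketch, not code from the original text: the file name and the functions build_target and greeting are hypothetical, and it assumes the Intel compiler predefines the __MIC__ macro when it generates code for the MIC architecture. The exact copy and login commands depend on the installation, so they are only outlined in the comments.

```cpp
// native_hello.cpp (hypothetical file name, used only for this sketch).
//
// Build for the coprocessor with the Intel compiler, using the switch
// named in the text:
//   icpc -mmic native_hello.cpp -o native_hello.mic
// Then copy the binary to the coprocessor and start it there, for
// example over SSH. The device host name depends on the installation.
#include <string>

// Reports which architecture this file was compiled for.
// Assumption: __MIC__ is predefined by the Intel compiler only when it
// generates code for the MIC architecture.
std::string build_target() {
#ifdef __MIC__
    return "mic";
#else
    return "host";
#endif
}

// Message a native application might print at start-up.
std::string greeting() {
    return "Hello from a binary built for: " + build_target();
}
```

Calling greeting() from main and printing the result gives a quick check that the binary started on the coprocessor really was built with -mmic; the same source compiled without the switch reports the host instead.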
- Suitability for Native Execution
Native execution occurs when an application runs entirely on an Intel Xeon Phi coprocessor. Building a native application is
a fast way to get existing software running with minimal code changes.
First, ensure that the application is suitable for native execution. Data parallelism, the use of parallel algorithms, and application scalability are criteria for targeting Intel Xeon Phi coprocessors in general; they do not help to decide between offload and native mode. An application likely to benefit from the large number of cores available with native execution tends to have the following characteristics:
- A modest memory footprint, less than the available physical memory on the device.
- Very few serial segments.
- No extensive I/O.
- A complex code structure with no well-identified hot kernels that could be offloaded without substantial data transfer overhead.
Offload (heterogeneous) Model
It is also possible to develop applications so that they run on the host and employ the MIC architecture by transferring only some of the
data and functions to the coprocessors. The process of data and
code transfer to the coprocessor is generally called offload, and
applications using this procedure are known as offload
applications.
The C source code in Figure 1 demonstrates offloading a
section of the program to an Intel Xeon Phi coprocessor using #pragma
offload.
When the Intel compiler encounters a #pragma offload, it generates
code for both the Phi coprocessor and the Xeon processor. The compiler
automatically creates the code that transfers the data to the Phi
coprocessor; however, the programmer can influence the data transfer by
adding data clauses to the offload pragma.
Figure 1. Source code of the hello-offload.cpp example, with the offload segment to be executed on an Intel Xeon
Phi coprocessor.
The line #pragma offload target(mic)
indicates that the following segment of the code should be executed on an Intel
Xeon Phi coprocessor (i.e., “offloaded”). The application is
compiled as a usual host application: no additional compiler arguments
are necessary in order to compile offload applications.
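The effect of the pragma can be illustrated with a short sketch. This is not the original hello-offload.cpp listing: the function name ran_on_coprocessor is hypothetical, and the sketch assumes that the Intel compiler predefines __MIC__ only in the coprocessor version of the offloaded block and that a local scalar used inside the block is copied to the coprocessor and back by default.

```cpp
// Returns 1 if the marked block executed on the coprocessor and 0 if it
// executed on the host. The function name is hypothetical.
int ran_on_coprocessor() {
    int on_mic = 0;

    // With the Intel compiler, the block below is compiled twice: once
    // for the host and once for the coprocessor. A compiler that does
    // not know this pragma ignores it and runs the block on the host.
    #pragma offload target(mic)
    {
#ifdef __MIC__
        // Assumption: __MIC__ is defined only in the coprocessor version.
        on_mic = 1;
#endif
    }
    return on_mic;
}
```

The function is called like any other host function; the result only distinguishes where the block ran and says nothing about performance.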
If the code contains no #pragma offload, the application runs only on
the Xeon processor. In either case, an offload application is started
on the host and does not need to be copied to the Phi coprocessor.
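The data clauses mentioned earlier can be sketched as follows. This is a minimal sketch under stated assumptions, not the author's code: the function scale_by_two is hypothetical, and the in, out, and length clauses follow the Intel compiler's offload syntax as the sketch assumes it. Raw pointers are used inside the offloaded block because the clauses describe plain arrays of a given length.

```cpp
#include <vector>

// Doubles every element of `input`. The loop is marked for offload; the
// function name is hypothetical.
std::vector<double> scale_by_two(const std::vector<double>& input) {
    const int n = static_cast<int>(input.size());
    std::vector<double> output(n);
    if (n == 0) {
        return output;  // nothing to transfer or compute
    }

    const double* src = input.data();
    double* dst = output.data();

    // in(...)  : copy n elements of src to the coprocessor before the block.
    // out(...) : copy n elements of dst back to the host after the block.
    // Without offload support the pragma is ignored and the loop runs on
    // the host, which gives the same result.
    #pragma offload target(mic) in(src : length(n)) out(dst : length(n))
    {
        for (int i = 0; i < n; ++i) {
            dst[i] = 2.0 * src[i];
        }
    }
    return output;
}
```

Restricting src to in and dst to out avoids copying each array in both directions, which is one way the programmer can influence the data transfer.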
- Suitability for Offload Execution
- Better serial processing: serial segments stay on the host processor.
- Fuller use of available resources: both the host and the coprocessor do work.