Open an Intel® Compiler command line window.

On Windows*: under the Start menu item for your Intel product, select an icon under Compiler and Performance Libraries > Command Prompt with Intel Compiler.

On Linux* and OS X*: source an environment script such as [...] bin directory and use the attribute appropriate for the architecture.

This small application multiplies a vector by a matrix. A loop entry in the vectorization report begins with a line such as:

    LOOP BEGIN at C:\Projects\vec_samples\matvec.f90(38,6)

The source line number (38 in the above example) refers to either the beginning or the end of the loop. To get details about the type of loop transformations and optimizations that took place, use the opt-report-phase option by itself or along with the [...].

How significant is the performance enhancement? To evaluate performance enhancement yourself, run [...].

Some memory access patterns lead the compiler to question whether vectorization is worthwhile. One case is a loop that touches only every second element:

    do I=1,SIZE,2
      B(I)=B(I)+(A(I)*X(I))
    end do

Another case is an inner loop that accesses A with stride SIZE. A third is indirect addressing:

    ! indirect addressing of X using index array INDX
    do I=1,SIZE,2
      B(I)=B(I)+(A(I)*X(INDX(I)))
    end do

The typical message from the vectorization report is "vectorization possible but seems inefficient", although indirect addressing may also result in a different report.

Vectorization entails changes in the order of operations within a loop, since each SIMD instruction operates on several data elements at once.
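For readers working in C rather than Fortran, the stride-2 and indirect-addressing loops discussed above might look roughly as follows. This is only a sketch: the function names (`update_stride2`, `update_indirect`, `update_unit`) are hypothetical, the arrays are 0-based as usual in C, and a unit-stride version is added purely for comparison. Whether a given compiler vectorizes any of these, and what it reports, depends on the compiler version and options.

```c
#include <stddef.h>

/* Stride-2 access: only every second element is touched, so a packed
   load would also fetch elements that the loop never uses. */
void update_stride2(double *b, const double *a, const double *x, size_t size)
{
    for (size_t i = 0; i < size; i += 2)
        b[i] += a[i] * x[i];
}

/* Indirect addressing: x is read through the index array indx, so the
   elements needed by neighbouring iterations need not be adjacent. */
void update_indirect(double *b, const double *a, const double *x,
                     const size_t *indx, size_t size)
{
    for (size_t i = 0; i < size; i += 2)
        b[i] += a[i] * x[indx[i]];
}

/* Unit-stride access, for comparison: consecutive iterations touch
   consecutive elements, which maps naturally onto packed loads. */
void update_unit(double *b, const double *a, const double *x, size_t size)
{
    for (size_t i = 0; i < size; ++i)
        b[i] += a[i] * x[i];
}
```

Compiling each variant with the vectorization report enabled is one way to see how the access pattern changes what the compiler says about the loop.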
#INTEL C COMPILER VECTORIZATION OPTION CODE#
Vectorization is the process of converting an algorithm from a scalar implementation, which does an operation one pair of operands at a time, to a vector process, where a single instruction can refer to a vector (a series of adjacent values).

The automatic vectorizer (also called the auto-vectorizer) is a component of the Intel® Compiler that automatically uses SIMD instructions in the Intel® Streaming SIMD Extensions (Intel® SSE, Intel® SSE2, Intel® SSE3 and Intel® SSE4 Vectorizing Compiler and Media Accelerators), Intel® Supplemental Streaming SIMD Extensions (SSSE3) instruction sets, and Intel® Advanced Vector Extensions (Intel® AVX) instruction set. The vectorizer detects operations in the program that can be done in parallel, and then converts the sequential operations to parallel operations, such as one SIMD instruction that processes up to 16 elements, depending on the data type. Automatic vectorization is supported on IA-32 and Intel® 64 architectures.

SIMD instructions operate on multiple data elements in one instruction and make use of the 128-bit SIMD floating-point registers. Automatic vectorization occurs when the Intel® Compiler generates packed SIMD instructions to unroll a loop. Because the packed instructions operate on more than one data element at a time, the loop executes more efficiently. This process is referred to as auto-vectorization only to emphasize that the compiler identifies and optimizes suitable loops on its own, without requiring any special action by you. Note, however, that in some cases certain keywords or directives may need to be applied in the code for auto-vectorization to occur.
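To make the scalar-versus-packed distinction concrete, the sketch below writes out by hand, in plain C, the shape of the transformation an auto-vectorizer might apply to a simple element-wise addition. It is a conceptual illustration only. The function names and the `LANES` constant are hypothetical, the choice of four lanes assumes 32-bit floats in a 128-bit register, and the code the compiler actually generates uses packed SIMD instructions rather than four separate scalar additions.

```c
#include <stddef.h>

enum { LANES = 4 }; /* assumed: four 32-bit floats per 128-bit register */

/* Scalar form: one pair of operands per iteration. */
void add_scalar(float *c, const float *a, const float *b, size_t n)
{
    for (size_t i = 0; i < n; ++i)
        c[i] = a[i] + b[i];
}

/* Chunked form: each main-loop iteration handles LANES adjacent
   elements, standing in for a single packed SIMD addition. */
void add_chunked(float *c, const float *a, const float *b, size_t n)
{
    size_t i = 0;

    for (; i + LANES <= n; i += LANES) {
        c[i]     = a[i]     + b[i];
        c[i + 1] = a[i + 1] + b[i + 1];
        c[i + 2] = a[i + 2] + b[i + 2];
        c[i + 3] = a[i + 3] + b[i + 3];
    }

    /* Remainder loop: leftover elements when n is not a multiple of LANES. */
    for (; i < n; ++i)
        c[i] = a[i] + b[i];
}
```

Both functions produce the same result for this operation; the chunked form simply makes visible why a loop over adjacent values can finish in fewer instructions once each group of four additions becomes one packed instruction.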