Array alignment
Array alignment is necessary to get performance on x86 architectures. Indeed, vector instructions (SSE,AVX,AVX-512) require the data to be aligned on a 16-, 32- or 64-byte boundary. With the Intel compiler, it is possible to give the compiler a directive to align an array on a given boundary:
!DIR$ ATTRIBUTES ALIGN : 32 :: X
Doing this will force the first element of array X
to have an address
which is a multiple of 256 bits. Using aligned arrays for one-dimensional
array will remove the peeling loops produced by the compiler when
producing and auto-vectorized binary.
For two-dimensional arrays, it is possible to have all columns aligned if the array is aligned and the length of a column is a multiple of the alignment.
IRPF90 can set the alignment directive for all the IRP entities that are arrays using a command-line argument:
irpf90 --align=32
will use a 32 byte alignment for every array entity, but it will also
replace in the code all the $IRP_ALIGN
patterns with 32
.
In this way, it is possible to make a code which is valid for all
kind of array alignments.
Let's create a function that will calculate the length of the leading dimension such that it is a multiple of the alignment:
integer function align_double(i)
implicit none
integer, intent(in) :: i
integer :: j
j = mod(i,max($IRP_ALIGN,4)/4)
if (j==0) then
align_double = i
else
align_double = i+4-j
endif
end
We can now create a matrix with all columns aligned, using the
!DIR$ VECTOR ALIGNED
directive safely.
BEGIN_PROVIDER [ integer, n ]
&BEGIN_PROVIDER [ integer, n_aligned ]
integer :: align_double
n = 19
n_aligned = align_double(19)
END_PROVIDER
BEGIN_PROVIDER [ double precision, Matrix, (n_aligned,n) ]
implcit none
integer :: i,j
do j=1,n
!DIR$ VECTOR ALIGNED
do i=1,n_aligned
! do stuff to create Matrix(i,j)
enddo
enddo
END_PROVIDER