Codelet generation
When optimizing for performance, it is common to write a simple codelet that will just benchmark one provider. IRPF90 can write this codelet for you:
$ irpf90 --codelet <NAME>[:<PRECONDITION>]:<NMAX>
NAME
: Name of the IRP entity whose provider is to be tested
PRECONDITION
: A space-separated list of other entities to provide before running the benchmark
NMAX
: Number of repetitions, used to improve the accuracy of the measurement
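To make the argument syntax concrete, here is a minimal sketch that splits a codelet specification into its three fields using POSIX parameter expansion. The function name `parse_codelet_spec` is hypothetical and is not part of IRPF90; it only illustrates how the optional middle field behaves and does not reflect how IRPF90 parses the argument internally.

```shell
# Hypothetical helper (not part of IRPF90): split NAME[:PRECONDITION]:NMAX.
parse_codelet_spec() {
  spec=$1
  name=${spec%%:*}   # everything before the first colon
  nmax=${spec##*:}   # everything after the last colon
  rest=${spec#*:}    # the specification with the leading "NAME:" removed
  if [ "$rest" = "$nmax" ]; then
    precondition=""            # only two fields: no precondition was given
  else
    precondition=${rest%:*}    # the middle field, may contain spaces
  fi
  printf 'NAME=%s PRECONDITION=%s NMAX=%s\n' "$name" "$precondition" "$nmax"
}

parse_codelet_spec 'v:t:100000'
parse_codelet_spec 'v:100000'
```

Because the precondition list is space-separated, a specification with more than one precondition has to be quoted on the command line so that the shell passes it as a single argument.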
The following command generates a codelet for the uvwt example:
$ irpf90 --codelet v:t:100000
This generates the file codelet_v.irp.f, in which t is provided before the benchmark is run and v is built 100000 times:
program codelet_v
  implicit none
  integer :: i
  double precision :: ticks_0, ticks_1, cpu_0, cpu_1
  integer, parameter :: irp_imax = 100000
  PROVIDE t
  call provide_v
  double precision :: irp_rdtsc
  call cpu_time(cpu_0)
  ticks_0 = irp_rdtsc()
  do i=1,irp_imax
    call bld_v
  enddo
  ticks_1 = irp_rdtsc()
  call cpu_time(cpu_1)
  print *, 'v'
  print *, '-----------'
  print *, 'Cycles:'
  print *, (ticks_1-ticks_0)/dble(irp_imax)
  print *, 'Seconds:'
  print *, (cpu_1-cpu_0)/dble(irp_imax)
end
A new main program has now been generated; it can be built using make.
When the run is finished, the number of CPU cycles and the time in seconds are reported for a single execution of the provider, averaged over the NMAX repetitions:
Cycles:
17.6698700000000
Seconds:
7.740000000000000E-009
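As a quick consistency check on these two figures, dividing the cycles per call by the seconds per call gives the clock frequency they imply. The sketch below does this with awk; the function name `implied_ghz` is hypothetical, and the result is only meaningful under the assumption that the cycle counter ticks at a constant rate and that the measured CPU time is close to the elapsed time.

```shell
# Hypothetical helper: implied clock frequency in GHz, computed from the
# per-call cycle count ($1) and the per-call time in seconds ($2).
implied_ghz() {
  awk -v c="$1" -v s="$2" 'BEGIN { printf "%.2f\n", c / s / 1e9 }'
}

implied_ghz 17.6698700000000 7.740000000000000E-009
```

For the output shown above this gives about 2.28 GHz, a plausible processor frequency. A value far from the frequency of the machine would suggest that the run was too short for the timer resolution and that NMAX should be increased.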