Codelet generation
When optimizing for performance, it is common to write a simple codelet that will just benchmark one provider. IRPF90 can write this codelet for you:
$ irpf90 --codelet <NAME>[:<PRECONDITION>]:<NMAX>
NAME: Name of the IRP entity whose provider is to be benchmarked.
PRECONDITION: An optional space-separated list of other entities to provide before running the benchmark.
NMAX: Number of repetitions, used to improve the accuracy of the measurement.
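To make the role of the optional middle field concrete, the following Python sketch splits a codelet specification into its three parts. It is only an illustration of the syntax described above, not IRPF90's own argument parser; the function name parse_codelet_spec and the entity w in the last call are hypothetical.

```python
def parse_codelet_spec(spec):
    """Split 'NAME[:PRECONDITION]:NMAX' into (name, preconditions, nmax).

    Illustrative only: this mirrors the documented syntax and is not
    the parser used by irpf90 itself.
    """
    fields = spec.split(":")
    if len(fields) == 2:
        # No PRECONDITION field was given.
        name, nmax = fields
        precondition = ""
    elif len(fields) == 3:
        name, precondition, nmax = fields
    else:
        raise ValueError(f"expected NAME[:PRECONDITION]:NMAX, got {spec!r}")
    if not name:
        raise ValueError("NAME must not be empty")
    # PRECONDITION is a space-separated list of entity names.
    return name, precondition.split(), int(nmax)


print(parse_codelet_spec("v:t:100000"))    # ('v', ['t'], 100000)
print(parse_codelet_spec("v:100000"))      # ('v', [], 100000)
print(parse_codelet_spec("v:t w:100000"))  # ('v', ['t', 'w'], 100000)
```

Because the precondition list is separated by spaces, a specification with several entities presumably has to be quoted on the command line so that the shell passes it as a single argument.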
Here is an example using the uvwt example:
$ irpf90 --codelet v:t:100000
This generates the file codelet_v.irp.f, in which t is provided
before the benchmark is run and v is built 100000 times:
program codelet_v
  implicit none
  integer :: i
  double precision :: ticks_0, ticks_1, cpu_0, cpu_1
  integer, parameter :: irp_imax = 100000
  PROVIDE t
  call provide_v
  double precision :: irp_rdtsc
  call cpu_time(cpu_0)
  ticks_0 = irp_rdtsc()
  do i=1,irp_imax
    call bld_v
  enddo
  ticks_1 = irp_rdtsc()
  call cpu_time(cpu_1)
  print *, 'v'
  print *, '-----------'
  print *, 'Cycles:'
  print *, (ticks_1-ticks_0)/dble(irp_imax)
  print *, 'Seconds:'
  print *, (cpu_1-cpu_0)/dble(irp_imax)
end
A new main program has now been generated; it can be built using make.
When the run is finished, the average number of CPU cycles and the average
CPU time in seconds are printed for one execution of the provider:
Cycles:
17.6698700000000
Seconds:
7.740000000000000E-009
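As a quick plausibility check, the two reported averages can be combined. The Python sketch below uses the numbers from the sample output above and assumes that irp_rdtsc reads the processor's time-stamp counter, which its name suggests but which is not stated here. Under that assumption, the ratio of cycles to seconds is the counter's tick rate, which should be of the order of the machine's clock frequency.

```python
# Values copied from the sample output above.
cycles_per_call = 17.66987    # "Cycles:" average per execution of the provider
seconds_per_call = 7.74e-9    # "Seconds:" average per execution of the provider
nmax = 100000                 # number of repetitions requested


def implied_tick_rate_ghz(cycles, seconds):
    """Counter ticks per second implied by the two averages, in GHz."""
    return cycles / seconds / 1e9


rate = implied_tick_rate_ghz(cycles_per_call, seconds_per_call)
total_ms = seconds_per_call * nmax * 1e3  # CPU time of the whole loop

print(f"Implied tick rate: {rate:.2f} GHz")   # Implied tick rate: 2.28 GHz
print(f"Total loop time:   {total_ms:.3f} ms")  # Total loop time:   0.774 ms
```

A ratio far from the expected clock frequency, or a total loop time close to the resolution of the CPU timer, would suggest repeating the measurement with a larger NMAX.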