Codelet generation

When optimizing for performance, it is common to write a small codelet that benchmarks a single provider. IRPF90 can write this codelet for you:

$ irpf90 --codelet <NAME>[:<PRECONDITION>]:<NMAX>
  • NAME : Name of the IRP entity whose provider is to be tested
  • PRECONDITION : A space-separated list of other entities to provide before running the benchmark
  • NMAX : Number of times the provider is run; a larger value improves the accuracy of the measurement
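
To make the argument syntax concrete, here is a minimal shell sketch that splits a specification into its three fields. The function parse_codelet_spec is a hypothetical helper written for illustration only; it is not part of IRPF90 and does not claim to reflect how IRPF90 parses the argument internally.

```shell
#!/bin/sh
# Hypothetical helper (not part of IRPF90): split NAME[:PRECONDITION]:NMAX.
parse_codelet_spec() {
  spec=$1
  name=${spec%%:*}   # text before the first colon
  nmax=${spec##*:}   # text after the last colon
  rest=${spec#*:}    # everything after NAME and the first colon
  case $rest in
    *:*) precondition=${rest%:*} ;;  # a middle field is present
    *)   precondition= ;;            # only NAME:NMAX was given
  esac
  printf 'name=%s precondition=%s nmax=%s\n' "$name" "$precondition" "$nmax"
}

parse_codelet_spec 'v:t:100000'
parse_codelet_spec 'v:100000'
```

Because PRECONDITION is a space-separated list, a specification naming several entities would presumably have to be quoted on the command line so that the shell passes it as a single argument.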

Here is an example using the uvwt example program:

$ irpf90 --codelet v:t:100000

This generates the file codelet_v.irp.f, in which t is provided before the benchmark is run and v is built 100000 times:

program codelet_v
  implicit none
  integer :: i
  double precision :: ticks_0, ticks_1, cpu_0, cpu_1
  integer, parameter :: irp_imax = 100000

  PROVIDE t
  call provide_v
  double precision :: irp_rdtsc

  call cpu_time(cpu_0)
  ticks_0 = irp_rdtsc()
  do i=1,irp_imax
    call bld_v
  enddo
  ticks_1 = irp_rdtsc()
  call cpu_time(cpu_1)
  print *, 'v'
  print *, '-----------'
  print *, 'Cycles:'
  print *,  (ticks_1-ticks_0)/dble(irp_imax)
  print *, 'Seconds:'
  print *,  (cpu_1-cpu_0)/dble(irp_imax)
end

Now that a new main program has been generated, it can be built using make. When the run is finished, the number of CPU cycles and the time in seconds are given for one execution of the provider:

 Cycles:
   17.6698700000000     
 Seconds:
  7.740000000000000E-009
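
As a quick sanity check, the two figures can be divided by each other: cycles per call over seconds per call gives the number of counter ticks per second, which should be close to the clock rate of the machine. The sketch below uses awk for the arithmetic and assumes the sample values above; the variable names are only illustrative.

```shell
#!/bin/sh
# Values copied from the sample output above.
cycles=17.6698700000000
seconds=7.740000000000000E-009

# (cycles per call) / (seconds per call) = ticks per second, printed in GHz.
ghz=$(awk -v c="$cycles" -v s="$seconds" 'BEGIN { printf "%.2f", c / s / 1e9 }')
echo "implied tick rate: ${ghz} GHz"
```

For the sample run this gives about 2.28 GHz. A result far from the expected clock rate would suggest that the measurement was disturbed, for instance by other processes running on the machine.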