I have some code for which gfortran performs a tail call optimization, but ifort does not.
Here is a very simple snippet that reproduces the issue.
The code
function f(a, b)
real, intent(in) :: a, b
real :: f
f = a**b
end function f
is compiled using gfortran to:
f_:
movss xmm1, DWORD PTR [rsi]
movss xmm0, DWORD PTR [rdi]
jmp powf
and using ifort to:
f_:
push rsi #1.10
movss xmm0, DWORD PTR [rdi] #4.10
movss xmm1, DWORD PTR [rsi] #4.10
call powf #4.10
pop rcx #5.1
ret #5.1
I expected both compilers to perform the tail call optimization; I used the -O3 compiler flag with both.
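To make clear what I mean by "tail call", here is a minimal sketch. The module and function names (tail_demo, f_tail, f_not_tail) are made up for illustration and are not part of my original code. In f_tail the result of a**b is returned unchanged, so the call to powf is the last thing the function does and a compiler may replace call/ret with a single jmp. In f_not_tail an addition remains after powf returns, so a plain jmp is generally not possible.

```fortran
module tail_demo
  implicit none
contains

  ! Tail position: the value of a**b is returned as-is,
  ! so nothing is left to do after powf returns.
  function f_tail(a, b) result(r)
    real, intent(in) :: a, b
    real :: r
    r = a**b
  end function f_tail

  ! Not in tail position: the addition must run after powf returns,
  ! so the function still needs its own return sequence.
  function f_not_tail(a, b) result(r)
    real, intent(in) :: a, b
    real :: r
    r = a**b + 1.0
  end function f_not_tail

end module tail_demo

program demo
  use tail_demo
  implicit none
  real :: x, y
  x = 2.0
  y = 3.0
  ! Both functions are called only to show that they are valid Fortran.
  print *, f_tail(x, y), f_not_tail(x, y)
end program demo
```

Because these functions live in a module, the symbol names in the generated assembly will be mangled and will differ from the f_ shown above.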
I looked online but couldn't find a similar issue. The only related question I found was "Does gfortran support tail call elimination?", but there it is gfortran that fails to do the optimization, and the code is much more complicated.
Why doesn't ifort do the tail call optimization?
[EDIT]: VTune output of my original code (not the simplified version above)
[EDIT]: Here is a usage example of f:
program calling_func
implicit none
integer :: i
real :: res
real, external :: f  ! f is the external function defined below
res = 0.0
do i = 1, 10000
res = res + f(real(mod(i,3)), real(mod(i,3)))
end do
print *, res
end program calling_func
function f(a, b)
implicit none
real, intent(in) :: a, b
real :: f
f = a**b
end function f
With gfortran, the line f = a**b is compiled to jmp powf in the assembly, while with ifort it is compiled to call powf.