Why do I get a segmentation fault problem in this program when the matrix dimension is too large (in ifort)?

I'm making basic linear algebra computations with matrices and vectors in Fortran. I have changed compiler from gfortran to ifort and I've found that when my matrices get too large (specifically when they are of size 724 x 724, of type complex double) I get the following error (segmentation fault):

forrtl: severe (174): SIGSEGV, segmentation fault occurred
Image              PC                Routine            Line        Source             
dummy              000000000040C9DA  Unknown               Unknown  Unknown
libpthread-2.28.s  00007FB2CAB57CE0  Unknown               Unknown  Unknown
dummy              0000000000403D57  Unknown               Unknown  Unknown
dummy              0000000000402B22  Unknown               Unknown  Unknown
libc-2.28.so       00007FB2CA7BACF3  __libc_start_main     Unknown  Unknown
dummy              0000000000402A2E  Unknown               Unknown  Unknown

I've managed to boil the problem down to this minimal program:

program dummy

  use ifport
  use iso_c_binding, dp => c_double, ip => c_int, dcp => c_double_complex
  implicit none

  integer (ip)     ::  dim 
  complex(dcp), dimension(:,:), allocatable :: U

  write (*,*) "dim"
  read (*,*) dim 
  print*,""

  print *, "Allocating U."
  allocate(U(dim,dim))

  print *, "dim = ", dim, "dim^2 = ", dim**2, "size(U) = ", size(U)
  print *, "Building U..."
  U = 0 
  print *, "U initialized (set to zero)."

  print *, "Testing matrix multiplication matmul(U, U)"
  U = matmul(U,U)
  print *, "U built."


end program

which is compiled with either ifort dummy.f90 -o dummy or gfortran dummy.f90 -o dummy (ifport is not used with gfortran). Additional flags such as -fpp -check all, bounds -warn all -pedantic do not give any further information on the source of the error.

This stops working at dim = 724 with ifort, while it works for much larger sizes with gfortran (I've tested sizes of a few thousand with no problem). The error appears as soon as the matrix multiplication is performed. Indeed, even at dim = 10000 I have no problem allocating the matrix with either compiler, but with ifort I always get the segmentation fault at the multiplication. With gfortran the multiplication is very slow, which is expected at these sizes. I have not checked that the result of the matrix multiplication is correct in either case.

Also, the program has been run on two different machines, although I do not expect a memory problem on the one with ifort: from free and cat /proc/cpuinfo, I have 72 processors with ~10 GB of memory each (and no one else is using this cluster at the moment). Thus, by a hand-waving calculation, a matrix of type complex double (16 bytes per element) would need to have dimension sqrt(10 * 10^9 / 16) ~ 25000 to completely fill the memory of one processor, and I'm nowhere near that.

What is the source of the error? The message is quite generic, so I haven't managed to understand either the cause or the inconsistency between gfortran and ifort. A test with different machines and compiler versions would also be very much appreciated. Thanks.

PierU On BEST ANSWER

The statement U = matmul(U,U) requires a hidden temporary array of the same size as U. By default, the ifort compiler allocates all temporary arrays on the stack (which has a limited size), whereas the gfortran compiler allocates large arrays on the heap (which is limited only by the available memory on the machine).
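A rough size check suggests why the failure starts near dim = 724. This is only a sketch under two assumptions that are not stated in the question: each complex double element takes 16 bytes, and the soft stack limit is 8192 KiB, a common Linux default that you can confirm with ulimit -s.

```shell
# Bytes needed by one 724 x 724 complex double temporary (16 bytes per element).
temp_bytes=$((724 * 724 * 16))

# Assumed soft stack limit: 8192 KiB, a common Linux default (check with: ulimit -s).
stack_bytes=$((8192 * 1024))

# Space left on the stack for everything else (call frames, other locals).
headroom=$((stack_bytes - temp_bytes))

echo "temporary: $temp_bytes bytes"
echo "stack:     $stack_bytes bytes"
echo "headroom:  $headroom bytes"
```

Under these assumptions the temporary takes 8386816 of the 8388608 available bytes, leaving only 1792 bytes for the rest of the stack, which would be consistent with the crash first showing up at this size.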

So you have 2 solutions here:

  • increase the stack size. On Linux, run the command ulimit -s unlimited to remove any size limit on the stack for the current shell
  • use the ifort -heap-arrays option, which forces all temporary (and automatic) arrays to be allocated on the heap
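As a sketch, the two options could look like this in a shell session. The file names dummy.f90 and dummy are taken from the question. The guards are only there so the snippet does nothing harmful when the binary, the source file, or ifort is missing. Note that -s is the ulimit option that selects the stack size.

```shell
# Current soft stack limit for this shell, in KiB ("unlimited" if none is set).
soft_limit=$(ulimit -s)
echo "soft stack limit: $soft_limit"

# Option 1: lift the limit inside a subshell only, then run the existing binary.
# Raising the soft limit fails if the hard limit (ulimit -Hs) is lower.
if [ -x ./dummy ]; then
  ( ulimit -s unlimited && ./dummy )
fi

# Option 2: rebuild so that temporary and automatic arrays go on the heap.
# This only runs if ifort is on the PATH and the source file is present.
if command -v ifort >/dev/null 2>&1 && [ -f dummy.f90 ]; then
  ifort -heap-arrays dummy.f90 -o dummy
fi
```

The first option changes only the environment of the shell that launches the program, so it has to be repeated in every new shell or job script. The second option is baked into the executable.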