My model setup hangs right after initialization without producing any output that is useful for debugging.
The application runs on an HPC infrastructure, and I am using the GNU compiler and a serial build so that I can use the ROMS debugger.
The messages I get from the machine (Slurm error file) and from the log file are, respectively:
Code:
Program received signal SIGSEGV: Segmentation fault - invalid memory reference.
Backtrace for this error:
#0 0x2B317B308347
#1 0x2B317B30895E
#2 0x2B317BD2050F
#3 0x40A550 in inp_par_
#4 0x403A73 in __ocean_control_mod_MOD_roms_initialize
#5 0x402F98 in MAIN__ at master.f90:0
srun: error: node190: task 0: Segmentation fault
srun: Terminating job step 456240.0
Code:
Model Input Parameters: ROMS/TOMS version 3.7
Tuesday - January 23, 2018 - 6:20:09 PM
-----------------------------------------------------------------------------
^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@
Operating system : Linux
CPU/hardware : x86_64
Compiler system : gfortran
Compiler command : /apps/compilers/gnu/4.9.2/bin/gfortran
Compiler flags : -frepack-arrays -g -fbounds-check -ffree-form -ffree-line-length-none -ffree-form -ffree-line-length-none -ffree-form -ffree-line-length-none
SVN Root URL : ^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^$
SVN Revision : ^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@
Local Root : ^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^$
Header Dir : ^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^$
Header file : ^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^$
Analytical Dir: ^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^$
Code:
forrtl: severe (408): fort: (2): Subscript #1 of the array GRIDNUMBER has value 1 which is greater than the upper bound of 0
Image PC Routine Line Source
libnetcdff.so.6 00002B5FA9DBA773 Unknown Unknown Unknown
oceanG 0000000002104AF2 ntimesteps_ 80 ntimestep.f90
oceanG 0000000000740998 main3d_ 82 main3d.f90
oceanG 0000000000404DB8 ocean_control_mod 160 ocean_control.f90
oceanG 0000000000403320 MAIN__ 86 master.f90
oceanG 0000000000402C9E Unknown Unknown Unknown
libc.so.6 00002B5FAC8FAD1D Unknown Unknown Unknown
oceanG 0000000000402BA9 Unknown Unknown Unknown
srun: error: node204: task 0: Exited with exit code 152
srun: Terminating job step 455883.0
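The gfortran backtrace in the first listing names the routines for frames #3 to #5 but gives no usable line numbers. Since the build uses -g, one thing that might help is passing those addresses to addr2line from GNU binutils. The following is only a minimal sketch of that idea: the helper names parse_backtrace and addr2line_command are made up for illustration, and the executable name oceanG is an assumption taken from the third listing (the gfortran build may use a different name).

```python
import re

# Matches gfortran backtrace frames such as "#3 0x40A550 in inp_par_".
FRAME_RE = re.compile(r"^#(\d+)\s+(0x[0-9A-Fa-f]+)(?:\s+in\s+(\S+))?")


def parse_backtrace(text):
    """Return a list of (frame_number, address, symbol_or_None) tuples."""
    frames = []
    for line in text.splitlines():
        match = FRAME_RE.match(line.strip())
        if match:
            number, address, symbol = match.groups()
            frames.append((int(number), address, symbol))
    return frames


def addr2line_command(frames, executable):
    """Build an addr2line command for the frames that carry a symbol name.

    Frames without a symbol (here #0 to #2) are skipped, on the assumption
    that they belong to shared libraries rather than to the executable.
    """
    addresses = [addr for _, addr, symbol in frames if symbol is not None]
    # -f prints the function name, -e selects the executable to inspect.
    return ["addr2line", "-f", "-e", executable] + addresses


# Backtrace copied from the Slurm error file above.
backtrace = """\
#0 0x2B317B308347
#1 0x2B317B30895E
#2 0x2B317BD2050F
#3 0x40A550 in inp_par_
#4 0x403A73 in __ocean_control_mod_MOD_roms_initialize
#5 0x402F98 in MAIN__ at master.f90:0
"""

frames = parse_backtrace(backtrace)
command = addr2line_command(frames, "oceanG")  # "oceanG" is an assumed name
print(" ".join(command))
```

The printed command would then be run on the HPC system, in the directory that holds the executable. Whether it resolves to real source lines depends on the debug information that actually ended up in the binary.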
Facts:
1) The model runs on my local computer, but not on the HPC infrastructure.
2) Another ROMS setup of mine runs on the HPC infrastructure.
3) I have tried running the model using only analytical headers for forcing and initial conditions, together with closed boundaries, with no success.
From 1) and 2) I conclude that there is no problem with the setup of my model and no problem with the setup of ROMS on the HPC infrastructure.
From 3) I conclude that the input files are not causing the problem.
I am out of ideas and in great need of help.
Thank you.