Incorrect nesting of OpenMP directives

Bug reports, work arounds and fixes

drews
Posts: 35
Joined: Tue Jun 19, 2007 3:32 pm
Location: National Center for Atmospheric Research

#1 by drews

I get the following run-time errors on NCAR's bluefire supercomputer:

1587-114 Incorrect nesting of OpenMP directives.
1587-114 Incorrect nesting of OpenMP directives.

SVN Revision: 442M. ROMS/TOMS version 3.4. The last entry in the ROMS log file is:
Initial basin volumes: TotVolume = 5.0360477799E+10 m3
MinVolume = -2.2072826138E+05 m3
MaxVolume = 3.3109239207E+05 m3
Max/Min = -1.5000000000E+00

NL ROMS/TOMS: started time-stepping: (Grid: 01 TimeSteps: 00000001 - 00086400)

I am using 4 threads with 2x2 tiling. My build.bash file has:

export FORT=xlf
export USE_OpenMP=on
export USE_LARGE=on

I changed 'make' to 'gmake' in my build.bash. The environment variable OMP_NUM_THREADS is set to 4. The other MP_ environment variables are:
be1105en$ env | grep MP_
MP_EUIDEVICE=sn_all
MP_PROCS=1
MP_WAIT_MODE=poll
MP_COREFILE_FORMAT=core_lite
MP_POLLING_INTERVAL=2000000
MP_USE_BULK_XFER=no
MP_EUILIB=us
MP_RC_USE_LMC=yes
MP_EAGER_LIMIT=10240
MP_BULK_MIN_MSG_SIZE=12k
MP_TASKS_PER_NODE=4
MP_INSTANCES=2

Setting MP_PROCS=4 makes the error message appear only once, but does not fix the problem.

Here are the rules for OpenMP directive nesting:
https://computing.llnl.gov/tutorials/op ... ingNesting

I looked through the built code (especially Build/main2d.f90) and did not see any obvious violations of these rules. The problem goes away when I set the tiling to 1x1 and the environment variable OMP_NUM_THREADS=1, but of course I want to run the model in parallel because it is faster.

Work-around:

Here is line 132 of ROMS/Nonlinear/main2d.F:

Code:

!-----------------------------------------------------------------------
!  If applicable, process input data: time interpolate between data
!  snapshots.
!-----------------------------------------------------------------------
!
!$OMP PARALLEL DO PRIVATE(thread,subs,tile) SHARED(ng,numthreads)
      DO thread=0,numthreads-1
        subs=NtileX(ng)*NtileE(ng)/numthreads
        DO tile=subs*thread,subs*(thread+1)-1,+1
          CALL set_data (ng, TILE)
        END DO
      END DO
!$OMP END PARALLEL DO

Commenting out those two !$OMP directives causes the problem to go away. Removing any other !$OMP directives in main2d.F does not affect the problem. So something odd is happening inside set_data(), which is surprising because I don't have any input data. I have topography in a NetCDF file but no forcing files; the wind stress is analytical.
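
The loop bounds in the main2d.F excerpt give each thread a contiguous block of tiles. A minimal shell sketch of that arithmetic, assuming the tile count divides evenly by the thread count; the function name tile_ranges and its arguments are made up for illustration and are not part of ROMS:

```shell
#!/bin/sh
# Hypothetical helper (not part of ROMS): print the tile range each thread
# would process, mirroring the Fortran loop bounds
#   subs = NtileX*NtileE/numthreads
#   DO tile = subs*thread, subs*(thread+1)-1
tile_ranges() {
  ntilex=$1
  ntilee=$2
  numthreads=$3
  # Integer division, as in the Fortran expression
  subs=$(( ntilex * ntilee / numthreads ))
  thread=0
  while [ "$thread" -lt "$numthreads" ]; do
    first=$(( subs * thread ))
    last=$(( subs * (thread + 1) - 1 ))
    echo "thread $thread: tiles $first..$last"
    thread=$(( thread + 1 ))
  done
}

# 2x2 tiling with 4 threads, as in the run described here
tile_ranges 2 2 4
```

With 2x2 tiling and 4 threads, subs is 1, so each thread handles exactly one tile and makes exactly one call to set_data.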

Even with that workaround in place, the ROMS model still runs considerably faster than the single-threaded version. I suspect that this is a bug, but perhaps I am not building ROMS correctly for parallel operation?

Carl

arango
Site Admin
Posts: 1361
Joined: Wed Feb 26, 2003 4:41 pm
Location: DMCS, Rutgers University

#2 by arango

I don't know what is going on here. All the OMP parallel loop directives in ROMS have the syntax:

Code:

!$OMP PARALLEL DO PRIVATE (...) SHARED (...)
   ...
!$OMP END PARALLEL DO

I don't know why you are missing the leading comment symbol ! in main2d.f90. It looks as if something is wrong with the C-preprocessing. When the option _OPENMP is not activated, the symbol OMP is replaced by ! in globaldefs.h. Then, the lines starting with !$! are removed by the ROMS/Bin/cpp_clean Perl script.

Are you combining both shared-memory and distributed-memory compiling options? This cannot be done in ROMS. Your make configuration file (Compilers/*.mk) may be the problem here for this particular computer.

drews

#3 by drews

I believe that I am using shared memory alone and not distributed memory. Here is the entire USE_ section of build.bash:

Code:

#export           USE_MPI=on
#export        USE_MPIF90=on
 export              FORT=xlf

 export        USE_OpenMP=on

#export         USE_DEBUG=on
 export         USE_LARGE=on
#export       USE_NETCDF4=on

I am not using any customized makefiles:
#export COMPILERS=${MY_ROMS_SRC}/Compilers

I assume the make process takes the name of the operating system from 'uname', so I should be using the standard ROMS makefile for AIX:
5210 Feb 02 14:05 AIX-xlf.mk

Is there something I can check in that makefile? I have not modified it.

Carl

drews

#4 by drews

More information:

My model domain has one analytical point source of river inflow. ana_psource.h contains three !$OMP BARRIER directives (svn $Id: ana_psource.h 429 2009-12-20 17:30:26Z). One of those directives makes it into Build/analytical.f90, near line 174, just before the Qbar values are set:

Code:

!$OMP BARRIER
      IF ((Istr.eq.1).and.(Jstr.eq.1)) THEN
        DO is=1,Nsrc
          Qbar(is)=150.0_r8
        END DO
      END IF
      RETURN
      END SUBROUTINE ana_psource_tile
      SUBROUTINE ana_smflux (ng, tile, model)

This code is called from the subroutine ana_psource.

set_data calls ana_psource. Therefore the BARRIER directive is nested within the PARALLEL DO directives around the set_data loop in main2d.f90 (shown in a previous post). This configuration appears to violate one of the rules for nesting of OMP directives:

"BARRIER directives are not permitted in the dynamic extent of DO/for, ORDERED, SECTIONS, SINGLE, MASTER and CRITICAL regions."

When I restore the directives around set_data, and remove the BARRIER directive in ana_psource.h, the nesting problem again goes away.
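
For anyone chasing the same error, one way to list every BARRIER directive that survives preprocessing is to search the generated sources. This is only a sketch: the function name find_omp_barriers is made up for illustration, and it assumes the preprocessed files sit in Build/ as they do in my setup.

```shell
#!/bin/sh
# Hypothetical helper: print "line:text" (or "file:line:text" when several
# files are given) for every OpenMP BARRIER directive in the named files.
find_omp_barriers() {
  grep -n -i -E '^[[:space:]]*!\$OMP[[:space:]]+BARRIER' "$@"
}

# Assumed layout: preprocessed sources in ./Build. Skipped if it is absent.
if [ -d Build ]; then
  find_omp_barriers Build/*.f90 || true
fi
```

Each hit then has to be traced back by hand to see whether its caller sits inside a PARALLEL DO loop; grep alone cannot tell that.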

Can someone advise me as to the wisdom of removing the BARRIER directive from ana_psource.h? Why do the threads need to synchronize there before setting the Qbar values?

Carl

arango

#5 by arango

I have mentioned several times in this forum that point sources specified via analytical expressions are extremely difficult in both shared- and distributed-memory applications. I recommend using ana_psource.h only for very simple serial applications, not parallel ones. Use a NetCDF river runoff forcing file instead... all the time! It is much easier and quicker.

Check the following post for more details.
