Although NetCDF4/PARALLEL_IO has been available for a long time, I only recently finished compiling it.
I am still having some problems when I run it.
I used the UPWELLING case as a test and compiled two versions of it.
One was compiled with the NetCDF4/HDF5 libraries with PARALLEL_IO turned on; the other was compiled without PARALLEL_IO.
Both compiled successfully.
After running both versions of the UPWELLING case, I got some results that confuse me.
(I set Lm and Mm in the ocean_upwelling.in file to 400 and 800.)
First, the two versions take very different times to finish the job: the PARALLEL_IO version takes 2 hours and 36 minutes,
while the NO_PARALLEL_IO version takes only 18 minutes.
Second, the sizes of the output NC files are very different: the PARALLEL_IO history file is 999M, while
the NO_PARALLEL_IO history file is 2.6G. The same is true of the avg/dia files.
Third, the output log file reports that the PARALLEL_IO version has two more CPP flags than the NO_PARALLEL_IO
version: NETCDF4 and PARALLEL_IO. However, the header of the output NC files shows that the PARALLEL_IO version
has different additional CPP flags: NETCDF4 and PERFECT_RESTART.
BTW, the compiler is Intel Fortran and the MPI is openmpi-intel.
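To put the numbers above in proportion, here is a minimal Python sketch of the arithmetic. It assumes the reported sizes use binary units (1G = 1024M), which the post does not state, and the variable names are purely illustrative.

```python
# Wall-clock times reported above, converted to minutes.
parallel_io_minutes = 2 * 60 + 36   # PARALLEL_IO run: 2 h 36 min
serial_io_minutes = 18              # run without PARALLEL_IO

# History file sizes reported above, expressed in M.
# Assumption: 1G = 1024M (the post does not say which units are meant).
parallel_his_m = 999.0
serial_his_m = 2.6 * 1024

slowdown = parallel_io_minutes / serial_io_minutes
size_ratio = serial_his_m / parallel_his_m

print(f"PARALLEL_IO run is about {slowdown:.1f}x slower")
print(f"serial history file is about {size_ratio:.1f}x larger")
```

Under that units assumption, the PARALLEL_IO run is roughly 8.7 times slower and its history file is roughly 2.7 times smaller.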
problems on NETCDF4/PARALLEL_IO
Re: problems on NETCDF4/PARALLEL_IO
Can you restart from the restart file? PERFECT_RESTART could explain the different file sizes you see - you need to save more to have it. Did you ask for PERFECT_RESTART?
Do the output files look reasonable? How many processes did you run with?
Re: problems on NETCDF4/PARALLEL_IO
I haven't tried restarting it yet; I'll do that ASAP and post the result here.
I didn't define PERFECT_RESTART in either case.
I used NtileI, NtileJ = 8 and 16, for a total of 128 processors.
The output looks reasonable.
Even if PERFECT_RESTART can explain the size of the ocean_rst.nc file, can it also explain the size of the ocean_his.nc file?
I hoped PARALLEL_IO would speed up the computation for big problems (say 1000x1000x40 or more),
but the test didn't give me the result I wanted. PARALLEL_IO seems much slower.
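For reference, the tiling described in this thread can be sketched in a few lines of Python. This assumes that Lm is split by NtileI and Mm by NtileJ, and that the interior points divide evenly among tiles; `tile_shape` is a hypothetical helper written for illustration, not part of the model code.

```python
def tile_shape(lm, mm, ntile_i, ntile_j):
    """Return (points per tile in I, points per tile in J, total tiles).

    Assumption: interior points split evenly across tiles; raises if not.
    """
    if lm % ntile_i or mm % ntile_j:
        raise ValueError("grid does not divide evenly into tiles")
    return lm // ntile_i, mm // ntile_j, ntile_i * ntile_j


# Values from this thread: Lm=400, Mm=800, NtileI=8, NtileJ=16.
ni, nj, ntiles = tile_shape(400, 800, 8, 16)
print(ni, nj, ntiles)  # 50 50 128
```

With these settings each of the 128 tiles covers about 50 x 50 horizontal points.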
- arango
Re: problems on NETCDF4/PARALLEL_IO
Make sure that you use a recent version of the code. Last May, I made some corrections to improve parallel I/O efficiency; see ticket.
Parallel I/O needs special computer architecture and communications. If data is written to disk over network cables, serial I/O is usually more efficient. See my simple tests in the above ticket. As you increase the number of nodes, you will be penalized substantially for the communications involved in parallel I/O. In my opinion, parallel I/O is not for cluster computers that write data frequently to an external disk.