I've read the previous comments about ulimit and googled my way around the bash and Fedora archives, but have yet to find a proper solution to what I believe might be a stack size issue.
To clarify:
1. Serial BENCHMARK1 runs fine.
2. OPENMP BENCHMARK1 will only run if ntiles>=16 (I have 2 dual-core 64-bit Xeons, 2 GB of memory, and 4 MB of L2 cache). Otherwise it segfaults.
3. MPICH2 BENCHMARK1 will run with ntiles>=12. The only reported error is that one of the threads died.
I'd like to get the OPENMP runs down to one thread per core. Current runs aren't even getting 50% efficiency: 16 threads on 4 cores only cut the runtime to 0.6 t_serial, where I'd ideally hope for 0.25 t_serial. Does anyone know of a compiler, kernel, or shell option for any of MPICH2, Linux 2.6, bash, or ifort that will solve this problem?
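
To spell out the efficiency figure, here is a quick sketch of the arithmetic. The function name is just for illustration, and I'm assuming efficiency is measured against the 4 physical cores rather than the 16 threads:

```python
def parallel_efficiency(runtime_fraction, cores):
    """Efficiency = speedup / cores, where speedup = t_serial / t_parallel.

    runtime_fraction is t_parallel / t_serial (e.g. 0.6 for the OPENMP run).
    """
    speedup = 1.0 / runtime_fraction
    return speedup / cores


# Observed: 16 threads on 4 cores bring the runtime down to 0.6 t_serial.
observed = parallel_efficiency(0.6, 4)

# Hoped for: 0.25 t_serial on 4 cores, i.e. perfect scaling.
ideal = parallel_efficiency(0.25, 4)

print(f"observed efficiency: {observed:.1%}")  # 41.7%
print(f"ideal efficiency:    {ideal:.1%}")     # 100.0%
```

So the current runs sit at roughly 42% of what the four cores could deliver in the ideal case.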

gfortran worked fine with this configuration, though I've been too absorbed in the ifort builds to see which one is faster. I'll try that after Christmas, though it's annoying not to have the ifort option open.
