OpenMP in FORTRAN does not run expected number of threads

I am new to parallel programming, and am having trouble getting a simple parallel Fortran program to use multiple threads in OpenMP. The following program:
Program Hello
Use omp_lib
Implicit None
INTEGER nthreads
nthreads = 4
CALL OMP_SET_NUM_THREADS(nthreads)
write(*,*) omp_get_num_procs()
write(*,*) omp_get_max_threads()
write(*,*) omp_get_num_threads()
!$OMP PARALLEL
Write(*,*) 'Hello'
Write(*,*) omp_get_num_threads()
!%OMP END PARALLEL
End Program Hello
Produces the result:
32
4
1
Hello
1
What is the reason that the number of threads inside the parallel region is not the same as nthreads that I set above? I am compiling the program using gfortran -f openmp Hello.f on a Windows machine running cygwin.

I tried compiling it on Linux with gfortran and got an error because of the OMP directives. I changed the parallel region to:
!$OMP PARALLEL
Write(*,*) 'Hello'
Write(*,*) omp_get_num_threads()
!$OMP END PARALLEL
(Notice the !$OMP on the last line, where the original has !%OMP.) Now it works. The output:
$ ./a.out
16
4
1
Hello
4
Hello
4
Hello
4
Hello
4

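The sentinel must be spelled exactly and placed correctly. In fixed-form source it is !$omp, c$omp or *$omp, written as a single word starting in column 1; in free-form source it is !$omp only, preceded by nothing but white space.
An ill-formed sentinel is treated as an ordinary comment, so the compiler may not complain and the code simply runs on a single thread.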
!$OMP PARALLEL
Write(*,*) 'Hello'
Write(*,*) omp_get_num_threads()
!$OMP END PARALLEL
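
For reference, here is a minimal sketch of the whole program with both sentinels spelled correctly. It assumes free-form source (a hypothetical file hello.f90) built with gfortran -fopenmp hello.f90; the OP's Hello.f is fixed form, where statements start in column 7 and the sentinel in column 1. The call to omp_get_thread_num is an addition for illustration and is not in the original code.

```fortran
! Sketch only: free-form source, e.g. gfortran -fopenmp hello.f90
program hello
   use omp_lib
   implicit none
   integer, parameter :: nthreads = 4   ! same request as in the question

   call omp_set_num_threads(nthreads)

   ! Outside a parallel region the team size is always 1.
   write(*,*) 'threads in serial part: ', omp_get_num_threads()

   !$omp parallel
   ! Every thread of the team executes this block once.
   write(*,*) 'Hello from thread ', omp_get_thread_num(), ' of ', omp_get_num_threads()
   !$omp end parallel
end program hello
```

The order of the Hello lines varies from run to run, since the threads write independently.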

I don't know if it's the issue or not, but the last directive in the OP's code has a % instead of a $. It may be just a typo, but I recently posted code in which a silly typo like that caused me trouble.

Related

Run only part of fortran coarray code parallel rest serial

I have a main program that calls a subroutine written using coarrays. The problem is that when I run the code, it treats the entire program (main + subroutine) as parallel code and runs it on all specified processors. For example, the program below prints "Hello from main" four times when run on 4 CPUs. I want the main program to run on one CPU and only the subroutine that uses coarrays to run on all specified CPUs.
Main program that calls subroutine coarray_test
program main
implicit none
write (*,*) "Hello from main "
call coarray_test
end program
Subroutine coarray_test
subroutine coarray_test
implicit none
write (*,*) "Hello from Subroutine coarray_test "
return
end subroutine coarray_test
Command used for compile and execution
export FOR_COARRAY_NUM_IMAGES=4
ifort -g -coarray -o test.out main.f90 coarray_test.f90
./test.out
Output
Hello from main
Hello from Subroutine coarray_test
Hello from main
Hello from Subroutine coarray_test
Hello from main
Hello from Subroutine coarray_test
Hello from main
Hello from Subroutine coarray_test
Expected Output
Hello from main
Hello from Subroutine coarray_test
Hello from Subroutine coarray_test
Hello from Subroutine coarray_test
Hello from Subroutine coarray_test
You are misunderstanding the coarray (CAF) model. Like MPI, it is based on independent processes (images), so the whole program is executed by every image. If you want some part executed by only one image, guard it with if (this_image() == 1); note that image numbers start at 1, not 0.
But think:
if only one process computes something, how are the other ones going to get that result?
and what do you gain from only one process being active? The other ones are idle while they could be doing useful work.
Really, the bits that you want executed only once probably need to be done redundantly on all processes, with the exception of print statements and other I/O.
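
As a rough sketch of that guard applied to the program in the question (the sync all and the internal subroutine are my additions; build with ifort -coarray as above, or with another coarray-capable compiler):

```fortran
program main
   implicit none

   ! Image numbers start at 1, so image 1 does the "serial" printing.
   if (this_image() == 1) then
      write(*,*) 'Hello from main (image 1 only)'
   end if

   ! Hold the other images back until image 1 has executed its part.
   sync all

   call coarray_test()

contains

   subroutine coarray_test()
      ! Executed by every image.
      write(*,*) 'Hello from coarray_test on image ', this_image(), ' of ', num_images()
   end subroutine coarray_test

end program main
```

The sync all orders execution, not necessarily the appearance of the output, so the lines may still be interleaved differently on the terminal.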

How to measure elapsed time on a processor executing code in a Fortran OpenMP thread

I used cpu_time, but apparently that gives the total time for all threads. I used omp_get_wtime, but get an output in the negatives which is not correct, and also mpi_wtime for which I am now getting a core dump (and for which earlier I was getting just 0.000000000). The relevant code is as follows:
real*8 tbeg, tend
....
!$omp sections private (ie, tbeg, tend)
!$omp section
tbeg = omp_get_wtime()
do ie=1, E
call rmul(u, A, B, dudr, duds, dudt, ie)
enddo
tend = omp_get_wtime()
!Step 4: Print results
print *, tend-tbeg
!$omp end section
!$omp section
....
!$omp end section
!$omp end sections
My compile option is:
gfortran -Ofast -c mult.f -o mult.o -mcmodel=large -I/usr/lib/openmpi/include -fopenmp
gfortran -o baseline ../lib/performance_test.o mult.o ../lib/rose.o -lcuda -lcudart -L/usr/local/cuda-5.0/lib64 -lcublas -lgomp -lmpi_f77
I finally managed to reproduce your issue (with some difficulty). I'm pretty sure the bottom line is that you forgot two things in your code:
Including either the OpenMP header (include 'omp_lib.h') or, better, the OpenMP module (use omp_lib).
Forbidding implicit variable declarations (implicit none).
Although the latter isn't strictly speaking an error, it's definitely a good habit to adopt, and it would have spared you the actual issue caused by the former, since the compiler would have given you the following message:
tbeg = omp_get_wtime()
1 Error: Function 'omp_get_wtime' at (1) has no IMPLICIT type
So what happened is that omp_get_wtime was implicitly typed as a function returning a single precision real, whereas it actually returns a double precision one. The caller therefore misinterpreted the return value and you got garbage.
Just add the module (or the header) and use omp_get_wtime() as you have in your code snippet, and everything should be all right.
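
A minimal sketch of the corrected pattern follows. The loop is a hypothetical stand-in for the calls to rmul, and the program name and the value of n are made up for illustration; compile with gfortran -fopenmp.

```fortran
program time_loop
   use omp_lib          ! declares omp_get_wtime as double precision
   implicit none        ! an undeclared function is now a compile-time error
   integer, parameter :: n = 1000000   ! hypothetical amount of work
   double precision :: tbeg, tend, s
   integer :: i

   s = 0.0d0
   tbeg = omp_get_wtime()
   !$omp parallel do reduction(+:s)
   do i = 1, n
      s = s + sqrt(dble(i))   ! stand-in for the real computation
   end do
   !$omp end parallel do
   tend = omp_get_wtime()

   print *, 'sum = ', s
   print *, 'elapsed wall-clock seconds: ', tend - tbeg
end program time_loop
```

Because omp_get_wtime returns wall-clock time, the difference tend - tbeg is not summed over threads the way cpu_time was in the question.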

Gfortran exhibits a weird behaviour, is this a bug?

I noticed a weird behaviour with gfortran. The version I am using is
GNU Fortran (MacPorts gcc5 5.2.0_0) 5.2.0
My OS is OS X Yosemite 10.10.3 (14D136).
I run the following code:
program test
implicit none
type :: mytype
real(kind=8),dimension(:,:,:),allocatable :: f
end type
type(mytype),dimension(:,:),allocatable :: tab
integer i,j
allocate(tab(3,8))
do i=1,3
do j=1,8
allocate(tab(i,j)%f(i,i,i))
enddo
enddo
call check_shapes(tab(:,1))
contains
subroutine check_shapes(arg)
integer :: n,k
type(mytype),dimension(:) :: arg
n=size(arg)
do k=1,n
print*,shape(arg(k)%f)
enddo
end subroutine
end program
The output is as expected
1 1 1
2 2 2
3 3 3
However, if I change the declaration of the dummy argument in the subroutine from
type(mytype),dimension(:) :: arg
to
class(mytype),dimension(:) :: arg
that is, using class instead of type for the dummy argument, I get the following output
2 2 2
3 3 3
1 1 1
Is this a bug, or am I missing something?
Note that it works fine with ifort
version Intel(R) 64, Version 15.0.3.187 Build 20150408
I have checked the already reported bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61337
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=58043
and both of them are (almost completely) fixed on the GCC trunk by a recent commit (probably https://gcc.gnu.org/bugzilla/show_bug.cgi?id=58043 ). Your bug appears to be just a variant of these reports.
I have added the information about the recent change to the existing reports. You can expect GCC 6 to contain the fix.

gfortran compiling error for openmp on windows

I am trying to run a very simple openmp fortran 90 code, using windows 7 and gfortran. Here is my code
PROGRAM HELLO
! USE omp_lib
IMPLICIT NONE
INTEGER OMP_GET_MAX_THREADS
INTEGER OMP_GET_NUM_THREADS
INTEGER OMP_GET_THREAD_NUM
write(6,"(a, i3)") " OpenMP max threads: ", OMP_GET_MAX_THREADS()
!$OMP PARALLEL
write(6,"(2(a,i3))") " OpenMP: N_threads = ", &
& OMP_GET_NUM_THREADS()," thread = ", OMP_GET_THREAD_NUM()
!$OMP END PARALLEL
END PROGRAM
And this is what I'm compiling with
gfortran -fopenmp -g -J"bin" test.f90 -o test
And this is what happens:
gfortran -fopenmp -g -J"bin" test.f90 -o test
gfortran.exe: error: libgomp.spec: No such file or directory
I believe I haven't set up my environment variables correctly, and I wanted to know if there is a way I can do this programmatically, or if anyone knows how they are supposed to be set up, assuming that is the problem. Any help is greatly appreciated.

Confusing debugging error in Fortran program

I've been sitting here for a while quite baffled as to why my debugger keeps displaying an error in my code when the program runs fine. There are three parts to a very simple program that is just reading in information from a file.
My code is broken into three Fortran files given below and compiled via
ifort -o test global.f90 read.f90 test.f90
global.f90:
module global
implicit none
integer(4), parameter :: jsz = 904
end module global
read.f90:
subroutine read(kp,q,wt,swt)
implicit none
integer(4) :: i, j
integer(4), intent(in) :: kp
real(8), intent(out) :: swt, q(kp,3), wt(kp)
swt = 0.0d0; q(:,:) = 0.0d0; wt(:) = 0.0d0
open(7,file='test.dat')
read(7,*) ! Skipping a line
do i = 1, kp
read(7,1000)(q(i,j),j=1,3), wt(i)
swt = swt + wt(i)
end do
close(7)
return
1000 format(3F10.6,1X,1F10.6)
end subroutine read
test.f90:
program test
use global
integer(4) :: i, j
real(8) :: tot, qq(jsz,3), wts(jsz)
call read(jsz,qq,wts,tot)
stop
end program test
The error I keep receiving is
Breakpoint 1, read (kp=904,
q=<error reading variable: Cannot access memory at address 0x69bb80>,
wt=..., swt=6.9531436082559572e-310) at read.f90:6
This error appears right when the subroutine of read is called. In other words, I'm adding a breakpoint at the read subroutine and running the code in gdb after the breakpoint is added. The program will continue to run as expected and give the correct outputs when I include write statements in the 'test' program. However, if I use the gdb print options I receive an error of 'Cannot access memory at address 0x69bb80' for array q only. All other arrays and variables can be displayed with no problems.
As I would like the read subroutine to be a stand alone subroutine and not necessarily use any global parameters, I have not used the global module and instead called the variable kp into the subroutine. I decided to test whether using the global module would help, and if I use jsz in place of kp, I do indeed remove the error. However, since this isn't my overall goal with the subroutine, I would hopefully like to figure out how to fix this without the use of the global module. (I also tried not using the global at all and setting the parameter variable of kp in the test.f90 program directly, but this also gives the error.)
Any insight on possible reasons for this error, or suggestions to try and fix the memory addressing issue would be greatly appreciated.
I think this is an issue specific to the ifort+gdb combination that is fixed with newer gdb versions. Here's a smaller example to reproduce the issue:
$ cat test.f90
subroutine bar(arg)
integer, intent(inout):: arg
print *, 'bar argument is', arg
arg = 42
end subroutine bar
program test
integer:: param
param = 3
call bar(param)
print *, 'post-bar param:', param
end program test
$ ifort -g -O0 -o test test.f90
$ gdb --quiet test
Reading symbols from /home/nrath/tmp/test...done.
(gdb) b 4
Breakpoint 1 at 0x402bd0: file test.f90, line 4.
(gdb) r
Starting program: /home/nrath/tmp/test
[Thread debugging using libthread_db enabled]
Breakpoint 1, bar (arg=#0x2aaa00000003) at test.f90:4
4 print *, 'bar argument is', arg
(gdb) p arg
$1 = (REF TO -> ( INTEGER(4) )) #0x2aaa00000003: <error reading variable>
(gdb) quit
$ gdb --version | head -1
GNU gdb (GDB) Red Hat Enterprise Linux (7.2-60.el6_4.1)
However, if you compile with gfortran instead of ifort, or if you use GDB 7.7.1, it works fine.
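Did you provide an explicit interface for the subroutine, for example with an INTERFACE block in the declaration part of your program?
One is required only when an external procedure uses features such as assumed-shape or optional arguments. Your read subroutine uses explicit-shape arrays, so the call is legal without one, although an explicit interface would let the compiler check the arguments.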
