Code stuck when solving Poisson equation with MPI

I implemented a simple 1D Poisson equation parallel solver with MPI to familiarize myself with the MPI library. I designed the code to run with an unspecified number of processes (including just 1).
The code runs and yields good results when run on 1 or 2 processes. However, it gets stuck in the mpi_send and mpi_recv calls with 4 processes, so I suspect my implementation of the ghost-point exchange is wrong.
As the code is too large to include here, I've only included the Jacobi scheme and the exchange of data:
do iter=1,max_iter
!~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
! Initial guess, on interior points only
Ujacob(min_x+1:max_x-1) = 0._dp
Ujacob_all(0:grid_nx-1) = 0._dp
!~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
! Store solution vector from last iteration
Uold (:) = Ujacob (:)
Uold_all(:) = Ujacob_all(:)
!~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
! Jacobi scheme
do ii=min_x+1,max_x-1
!Ujacob(ii) = 0.5_dp * (Uold (ii-1) + Uold (ii+1) - grid_delta_x**2 * Urhs(ii))
end do
!~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
! Gather Ujacob vector
call mpi_allgather(Ujacob(0:proc_nx-1), proc_nx, mpi_float, &
& Ujacob_all, proc_nx, mpi_float, mpi_comm_world, ierror)
!~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
! Compute error and check if less than tolerance value
error = sqrt((sum(Ujacob_all - Uold_all)**2) / dble(grid_nx))
if(error < error_tol) return
!~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
! Exchange data points
! Interior processors
if(Xsrc /= -1 .AND. Xdes /= -1) then
call mpi_send(Ujacob( 0), 1, mpi_float, Xsrc, 200, mpi_comm_world, ierror)
call mpi_send(Ujacob(proc_nx-1), 1, mpi_float, Xdes, 100, mpi_comm_world, ierror)
call mpi_recv(Ujacob( -1), 1, mpi_float, Xsrc, 100, mpi_comm_world, stat, ierror)
call mpi_recv(Ujacob(proc_nx), 1, mpi_float, Xdes, 200, mpi_comm_world, stat, ierror)
! First processor
elseif(Xsrc == -1) then
call mpi_send(Ujacob(proc_nx-1), 1, mpi_float, Xdes, 100, mpi_comm_world, ierror)
call mpi_recv(Ujacob(proc_nx ), 1, mpi_float, Xdes, 200, mpi_comm_world, stat, ierror)
! Last processor
elseif(Xdes == -1) then
call mpi_send(Ujacob( 0), 1, mpi_float, Xsrc, 200, mpi_comm_world, ierror)
call mpi_recv(Ujacob(-1), 1, mpi_float, Xsrc, 100, mpi_comm_world, stat, ierror)
end if
end do
Xsrc and Xdes are set the following way:
!~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
! Setting the source and destination neighbors of each processor
if(myid == 0) then
Xsrc = -1
Xdes = myid + 1
elseif(myid == nprocs-1) then
Xsrc = myid -1
Xdes = -1
else
Xsrc = myid - 1
Xsrc = myid + 1
end if
Also, I have checked that processor ranks 0 and nprocs-1 do correspond to the left and right boundary processes, and that the tags are set correctly. Feel free to comment on anything else you feel could be improved.

@Hristo is correct that your code is conceptually flawed. However, almost every MPI implementation will buffer MPI_Send for a message containing a single real (though of course this is not guaranteed), so this is not the issue with your code.
I think you have mismatched your tags - the edge cases should have the tags reversed:
elseif(Xsrc == -1) then
call mpi_send(Ujacob(proc_nx-1), 1, mpi_float, Xdes, 200, mpi_comm_world, ierror)
call mpi_recv(Ujacob(proc_nx ), 1, mpi_float, Xdes, 100, mpi_comm_world, stat, ierror)
! Last processor
elseif(Xdes == -1) then
call mpi_send(Ujacob( 0), 1, mpi_float, Xsrc, 100, mpi_comm_world, ierror)
call mpi_recv(Ujacob(-1), 1, mpi_float, Xsrc, 200, mpi_comm_world, stat, ierror)
end if
A few other comments on the code:
it is very inefficient to compute the error term with allgather: you should sum up over the local elements only and then compute the global error with MPI_Allreduce;
you should use MPI_REAL not MPI_FLOAT for a Fortran code;
I do not see how your code can run on a single process - here the process will execute the first elseif clause and then try to send to a non-existent rank.
Once you have checked that your tags are now correct, you should then fix the issues pointed out by @Hristo.
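On the first point, a minimal sketch of the Allreduce-based error check, reusing the variable names from the question and assuming two new real(dp) scalars local_err and global_err and that dp corresponds to double precision (otherwise use MPI_REAL), might look like this:
! Each rank sums squared differences over its own interior points only,
! then MPI_Allreduce forms the global sum on every rank.
local_err = sum((Ujacob(min_x+1:max_x-1) - Uold(min_x+1:max_x-1))**2)
call mpi_allreduce(local_err, global_err, 1, mpi_double_precision, &
   &               mpi_sum, mpi_comm_world, ierror)
error = sqrt(global_err / dble(grid_nx))
if(error < error_tol) return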

Related

Segfault when passing data using MPI Windows (MPI_Put)

I am trying to figure out how to use MPI to work with matrices.
I have a 3x6 matrix filled with zeros and am running the code with 3 processes. Rank 0 is the main one, rank 1 writes ones to columns 1-3 of the first row, and rank 2 writes twos to columns 4-6 of the second row.
I pass these parts to the main process (rank 0) and get the correct result, but after that a memory error is printed to the console.
I can't figure out what I'm doing wrong. Can you please tell me what my mistake is?
program test
Use mpi
Implicit None
integer :: process_Rank, size_Of_Cluster, ierror = 0, win, size_s, n = 6
integer:: i , j
integer:: start, target_count = 9
integer :: mtx(3,6)
integer(kind = MPI_ADDRESS_KIND) :: nbytes = 4
!input matrix
do i = 1,3
do j =1,6
mtx(i,j) = 0
end do
end do
Call mpi_sizeof( mtx, size_s, ierror ) !Get the size of a matrix element
call MPI_INIT(ierror)
call MPI_COMM_SIZE(MPI_COMM_WORLD, size_Of_Cluster, ierror)
call MPI_COMM_RANK(MPI_COMM_WORLD, process_Rank, ierror)
!create windows
if(process_Rank == 0) then
call MPI_WIN_CREATE(mtx, size_s *6 * 3 * nbytes, 1, MPI_INFO_NULL, MPI_COMM_WORLD, win, ierror)
else
call MPI_WIN_CREATE(mtx, size_s * 6* 3*nbytes,1, MPI_INFO_NULL, MPI_COMM_WORLD, win, ierror)
end if
CALL MPI_Win_fence(0,win,ierror)
if(process_Rank == 1) then
!fill 3 columns of the first row with ones
start = 0
do i = 0,3
mtx(process_Rank,i+start) = process_Rank
end do
CALL MPI_PUT(mtx, size_s*3*6, MPI_INTEGER, 0, start * nbytes, target_count, MPI_INTEGER, win, ierror)
!print mtx
print *, process_Rank, ' put = '
do i = 1,3
print *, ''
do j = 1,3
write(*,fmt='(g0)', advance = 'no') mtx(i,j)
write(*,fmt='(g0)', advance = 'no') ' '
end do
end do
end if
CALL MPI_Win_fence(0, win,ierror)
if(process_Rank == 2) then
!fill the last 3 columns of the second row with twos
start = 3
do i = 1,3
mtx(process_Rank,i+start) = process_Rank
end do
CALL MPI_PUT(mtx(1:3,4:6), size_s* 3 *6, MPI_INTEGER, 0, 3 * 3 * nbytes, target_count, MPI_INTEGER, win, ierror)
!print mtx
print *, process_Rank, ' put = '
do i = 1,3
print *, ''
do j = 4,6
write(*,fmt='(g0)', advance = 'no') mtx(i,j)
write(*,fmt='(g0)', advance = 'no') ' '
end do
end do
end if
CALL MPI_Win_fence(0, win,ierror)
! print result
if(process_Rank == 0) then
print *, 'result = '
do i = 1,3
print *, ''
do j = 1,6
write(*,fmt='(g0)', advance = 'no') mtx(i,j)
write(*,fmt='(g0)', advance = 'no') ' '
end do
end do
end if
CALL MPI_Win_fence(0, win,ierror)
CALL MPI_WIN_FREE(win, ierror)
call MPI_FINALIZE(ierror)
end program test
Console:
1 put =
1 1 1
0 0 0
0 0 0
2 put =
0 0 0
2 2 2
0 0 0
result =
1 1 1 0 0 0
0 0 0 2 2 2
0 0 0 0 0 0
Program received signal SIGSEGV: Segmentation fault - invalid memory reference.
Backtrace for this error:
#0 0x7fd4447bcd01 in ???
#1 0x7fd4447bbed5 in ???
#2 0x7fd4445f020f in ???
--------------------------------------------------------------------------
Primary job terminated normally, but 1 process returned
a non-zero exit code. Per user-direction, the job has been aborted.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
mpirun noticed that process rank 0 with PID 0 on node alm-VirtualBox exited on signal 11 (Segmentation fault).
--------------------------------------------------------------------------
If you use -fcheck=all, which Ian Bush suggested to you in the first comment under your question, you will get the reason for the error immediately and do not have to wait many hours for feedback on the internet. I got:
At line 38 of file mpi_wins.f90 Fortran runtime error:
Index '0' of dimension 2 of array 'mtx' below lower bound of 1
Error termination. Backtrace:
#0 0x7f7ed3e75640 in ???
#1 0x7f7ed3e76185 in ???
#2 0x7f7ed3e7652a in ???
#3 0x4010e4 in test
at /home/lada/f/testy/stackoverflow/mpi_wins.f90:38
#4 0x401e78 in main
at /home/lada/f/testy/stackoverflow/mpi_wins.f90:3
You are indexing your mtx array using the process rank, but the array is defined to start from 1.
integer :: mtx(3,6)
However, MPI ranks start from 0, not from 1.
Also notice that the backtrace now contains a better code location thanks to the -g compiler option.
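For example, a minimal sketch of the rank-1 fill loop with 1-based indexing (keeping the question's variable names; an illustration, not a complete fix) could be:
if(process_Rank == 1) then
   ! rank 1 fills row 1, columns 1 to 3 with ones;
   ! Fortran array indices start at 1, so neither index may become 0
   start = 0
   do i = 1, 3
      mtx(1, i + start) = 1
   end do
end if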

How to pass information to the main process using MPI PUT?

I want to pass information to the main process using MPI_PUT, but I get an error that the process exits without calling finalize.
For example, the error for two processes looks like this:
job aborted:
[ranks] message
[0] terminated
[1] process exited without calling finalize
How can I put my value 's' to the main process? Can I do it with MPI_ACCUMULATE?
I want to get the modified s variable in the main process. What would the code look like if I wanted to calculate the sum of all received variables from all processes?
integer :: process_Rank, size_Of_Cluster, ierror = 0, win, s = 1
call MPI_INIT(ierror)
call MPI_COMM_SIZE(MPI_COMM_WORLD, size_Of_Cluster, ierror)
call MPI_COMM_RANK(MPI_COMM_WORLD, process_Rank, ierror)
!create windows
if(process_Rank == 0) then
call MPI_WIN_CREATE(s, sizeof(s), 1, MPI_INFO_NULL, MPI_COMM_WORLD, win, ierror)
else
call MPI_WIN_CREATE(0, 0, 1, MPI_INFO_NULL, MPI_COMM_WORLD, win, ierror)
end if
print *, process_Rank, ' create window'
CALL MPI_Win_fence(0,win,ierror)
!get s from main process (rank = 0)
if(process_Rank <> 0) then
CALL MPI_Get(s, sizeof(s) , MPI_INT, 0, 0, 20, MPI_INT, win, ierror)
print *, process_Rank, ' get data and s = ', s
end if
CALL MPI_Win_fence(0,win,ierror)
if(process_Rank <> 0) then
s = s + process_Rank
print *, process_Rank, ' s = ', s
CALL MPI_PUT(s, sizeof(s), MPI_INT, 0 , 1 , 1, MPI_INT, win, ierror)
end if
print *, 'result s = ', s
CALL MPI_Win_fence(0, win,ierror)
CALL MPI_WIN_FREE(win, ierror)
call MPI_FINALIZE(ierror)
I'm sorry, but there was a lot wrong with your code and I don't have time to explain it all now, but below is a very quickly hacked-together version that I think works. The main errors were incorrect kinds for actual arguments of MPI routines, and the use of non-standard features (sizeof, <> [does that really work?], MPI_INT, maybe others). Please study it and try to work out why it works and yours does not; I shall try to come back and put in a longer explanation at some point. But if you learn nothing else, stop using Include 'mpif.h' and start using the module as I have - this immediately caught one of your more serious errors, namely the wrong kind of integers for the displacement arguments in a number of calls.
If you are learning, I would also recommend against the Portland Group compiler. My experiences with it over the years have not been good. Try to use a recent version of gfortran or the Intel or NAG compilers instead.
Program onesided
Use mpi
Implicit None
integer :: process_Rank, size_Of_Cluster, ierror = 0, win, s = 1, size_s
Call mpi_sizeof( s, size_s, ierror )
call MPI_INIT(ierror)
call MPI_COMM_SIZE(MPI_COMM_WORLD, size_Of_Cluster, ierror)
call MPI_COMM_RANK(MPI_COMM_WORLD, process_Rank, ierror)
!create windows
if(process_Rank == 0) then
call MPI_WIN_CREATE(s, Int( size_s, mpi_address_kind ), 1, MPI_INFO_NULL, MPI_COMM_WORLD, win, ierror)
else
call MPI_WIN_CREATE(0, 0_mpi_address_kind, 1, MPI_INFO_NULL, MPI_COMM_WORLD, win, ierror)
end if
print *, process_Rank, ' create window'
CALL MPI_Win_fence(0,win,ierror)
!get s from main process (rank = 0)
if(process_Rank /= 0) then
CALL MPI_Get(s, 1, MPI_INTEGER, &
0, 0_mpi_address_kind, 1, MPI_INTEGER, win, ierror)
print *, process_Rank, ' get data and s = ', s
end if
CALL MPI_Win_fence(0,win,ierror)
if(process_Rank /= 0) then
s = s + process_Rank
print *, process_Rank, ' s = ', s
CALL MPI_PUT(s, 1, MPI_INTEGER, 0 , 0_mpi_address_kind, 1, MPI_INTEGER, win, ierror)
end if
print *, 'result s = ', s
CALL MPI_Win_fence(0, win,ierror)
CALL MPI_WIN_FREE(win, ierror)
call MPI_FINALIZE(ierror)
End Program onesided
ijb@ijb-Latitude-5410:~/work/stack$ mpif90 --version
GNU Fortran (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0
Copyright (C) 2019 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
ijb@ijb-Latitude-5410:~/work/stack$ mpif90 -Wall -Wextra -fcheck=all -O -g -std=f2018 one_side.f90
ijb@ijb-Latitude-5410:~/work/stack$ mpirun -np 2 ./a.out
0 create window
result s = 1
1 create window
1 get data and s = 1
1 s = 2
result s = 2
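As for the follow-up question about MPI_ACCUMULATE and summing contributions from all processes: that is not covered by the program above, but a minimal sketch, reusing the same window, variables and fence-based synchronization, might look like this:
! Sketch: every non-zero rank adds its own rank number to s on rank 0.
! MPI_SUM makes concurrent accumulates to the same location well defined.
CALL MPI_Win_fence(0, win, ierror)
if(process_Rank /= 0) then
   CALL MPI_ACCUMULATE(process_Rank, 1, MPI_INTEGER, &
        0, 0_mpi_address_kind, 1, MPI_INTEGER, MPI_SUM, win, ierror)
end if
CALL MPI_Win_fence(0, win, ierror)
! After the second fence, s on rank 0 holds its original value plus
! the sum of all the other ranks.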

Problem in splitting array with MPI_ISEND with fortran

I wrote the following lines of code with the aim of dividing the 1D array Jp (which lives on the master process only) over different processes (including the master one).
Each process has to receive a block of non-contiguous, modified data (the values are changed in the inner loop), and I created a new derived datatype newF with the MPI_TYPE_INDEXED function to select the correct portion of data to send.
I used both MPI_RECV and MPI_IRECV to receive the data.
The problem is that this part of the code works fine, with any number of tasks (from 1 to 8), as long as the number of elements of Jp is small; when I increase that number (e.g. n = 5000), not all the processes receive the data and the split array JpS shows the values I used to initialize it (i.e. -10000).
The commented lines show all the changes I tried in order to solve this problem. Does anyone have an idea?
program test_send
use mpi
implicit none
integer :: rank, nproc, mpi_stat
integer :: n, m, k, io, i, j
integer, allocatable :: Jp(:), JpS(:), JpAux(:)
integer :: count, n_distro,newF
integer, allocatable :: sendcounts(:), displ(:), &
blocklens(:), blockdisp(:), &
request(:)
integer :: ARRAY_OF_STATUS(MPI_STATUS_SIZE), error
data count /3/
call mpi_init(mpi_stat)
call mpi_comm_size(mpi_comm_world, nproc, mpi_stat)
call mpi_comm_rank(mpi_comm_world, rank, mpi_stat)
n = 400*count
allocate(sendcounts(nproc), displ(nproc), &
blocklens(count), blockdisp(count), request(nproc))
if (rank.eq.0) then
allocate(Jp(n+1),JpAux(n+1))
Jp = 0
do i = 1,n+1
Jp(i) = i
enddo
endif
call mpi_barrier(mpi_comm_world, mpi_stat)
m = n/count
n_distro = (m+1)/nproc
k = 0
do i = 1,nproc
if (i<nproc) then
sendcounts(i) = n_distro
else
sendcounts(i) = m - (nproc-1)*n_distro
endif
displ(i) = k
k = k + sendcounts(i)
enddo
call mpi_barrier(mpi_comm_world, mpi_stat)
allocate(JpS(count*sendcounts(rank+1)+1))
call mpi_barrier(mpi_comm_world, mpi_stat)
! call mpi_irecv(JpS, (sendcounts(rank+1))*count+1,mpi_int,0,0,mpi_comm_world, request(rank+1), mpi_stat)
! call mpi_recv(JpS, (sendcounts(rank+1))*count+1,mpi_int,0,0,mpi_comm_world, MPI_STATUS_IGNORE,mpi_stat)
!call mpi_waitall(1,request,ARRAY_OF_STATUS,error)
! call mpi_barrier(mpi_comm_world, mpi_stat)
if (rank.eq.0) then
do i = 0,nproc-1
JpAux = -100000
blocklens = spread(sendcounts(i+1),1,count)
blockdisp = spread(displ(i+1),1,count) + (/ (k*m, k=0,count-1) /)
blocklens(count) = blocklens(count)+1
do j = 1,count
if (j.eq.1) then
JpAux(blockdisp(j)+1:blockdisp(j)+blocklens(j)) = Jp(blockdisp(j)+1:blockdisp(j)+blocklens(j))&
-Jp(blockdisp(j)+1)
else
JpAux(blockdisp(j)+1:blockdisp(j)+blocklens(j)) = Jp( blockdisp(j) + 1 : blockdisp(j) + blocklens(j) )&
-Jp( blockdisp(j)+1 ) + JpAux( blockdisp(j-1) + blocklens(j-1))&
+(Jp( blockdisp(j-1)+blocklens(j-1)+1 )-Jp( blockdisp(j-1)+blocklens(j-1)))
endif
enddo
call mpi_type_indexed(count, blocklens, blockdisp, mpi_int, newF, mpi_stat)
call mpi_type_commit(newF, mpi_stat)
call mpi_isend(JpAux, 1, newF, i, i, mpi_comm_world, request(i+1), mpi_stat)
call mpi_type_free(newF, mpi_stat)
enddo
endif
! call mpi_wait(request(rank+1), ARRAY_OF_STATUS, mpi_stat)
call mpi_barrier(mpi_comm_world, mpi_stat)
!call mpi_waitall(1,request,ARRAY_OF_STATUS,error)
call mpi_recv(JpS, (sendcounts(rank+1))*count+1,mpi_int,0,MPI_ANY_TAG,mpi_comm_world, MPI_STATUS_IGNORE,mpi_stat)
! print*, request
print*, 'rank: ', rank, ', size: ', size(JpS), ', Jp: ', JpS
call mpi_barrier(mpi_comm_world, mpi_stat)
call mpi_finalize(mpi_stat)
end program test_send

Problem with using MPI_SENDRECV with 4 threads

As a minimal problem, I'm trying to send an integer between 4 processors: 0 -> 3 (rank 0 sends to and receives from rank 3), 2 -> 1, 1 -> 2, 3 -> 0. It never finishes execution and hangs, probably waiting for the response from other threads.
I'm compiling the code with mpif90 ... and running with mpiexec -np 4 .... Below is the minimal snippet:
program sendrecv
implicit none
include "mpif.h"
integer :: foo, bar
integer :: mpi_rank, mpi_size, ierr
integer :: mpi_sendto, mpi_recvfrom
integer :: istat(MPI_STATUS_SIZE), status, i
call MPI_INIT(ierr)
call MPI_COMM_SIZE(MPI_COMM_WORLD, mpi_size, ierr)
call MPI_COMM_RANK(MPI_COMM_WORLD, mpi_rank, ierr)
print *, "SENDING..."
if (mpi_rank .eq. 0) then
mpi_sendto = 3; mpi_recvfrom = 3
else if (mpi_rank .eq. 1) then
mpi_sendto = 2; mpi_recvfrom = 2
else if (mpi_rank .eq. 2) then
mpi_sendto = 1; mpi_recvfrom = 1
else
mpi_sendto = 0; mpi_recvfrom = 0
end if
foo = mpi_rank
do i = 1, 5
foo = mpi_rank
call MPI_SENDRECV(foo, 1,&
& MPI_INTEGER, mpi_sendto, mpi_rank * 10 + i,&
& bar, 1,&
& MPI_INTEGER, mpi_recvfrom, mpi_rank * 10 + i,&
& MPI_COMM_WORLD, istat, ierr)
end do
print *, "...DONE"
call MPI_FINALIZE(ierr)
end
I don't really understand why this program hangs, maybe I'm missing something or doing something really wrong. If I understand correctly, MPI_SENDRECV is just non-blocking send and recv with two wait-s. In that case, say, if rank=0 sends to rank=3 it shouldn't have any problem receiving from it, right?
I tried sending/receiving from different threads, i.e., doing this:
if (mpi_rank .eq. 0) then
mpi_sendto = 1; mpi_recvfrom = 3
else if (mpi_rank .eq. 1) then
mpi_sendto = 2; mpi_recvfrom = 0
else if (mpi_rank .eq. 2) then
mpi_sendto = 3; mpi_recvfrom = 1
else
mpi_sendto = 0; mpi_recvfrom = 2
end if
still not working.
UPD As was pointed out, the send and receive tags should match when doing SENDRECV; however, when the call is made inside a loop, making the two tags of each call the same expression (as in the modified code above) does not help much. Old version:
call MPI_SENDRECV(foo, 1,&
& MPI_INTEGER, mpi_sendto, 200,&
& bar, 1,&
& MPI_INTEGER, mpi_recvfrom, 100,&
& MPI_COMM_WORLD, status, ierr)
UPD#2 Actually, if anyone is interested, I found a discussion about exactly the problem I have, on why SENDRECVs may sometimes deadlock.
The term "thread" is misleading here; you should talk about MPI tasks or MPI processes (both terms are equivalent).
The root cause is a tag mismatch. You send with tag 200 but receive with tag 100.
Also, you should use istat instead of status as the status argument of MPI_Sendrecv().
Here is how you can fix your program:
call MPI_SENDRECV(foo, 1,&
& MPI_INTEGER, mpi_sendto, 200,&
& bar, 1,&
& MPI_INTEGER, mpi_recvfrom, 200,&
& MPI_COMM_WORLD, istat, ierr)
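If the tag really has to vary with the loop iteration, both partners still have to post the same tag; a minimal sketch that uses only the loop index as the tag (an illustrative choice, not from the original post) would be:
! Both partners are in the same iteration i, so send and receive tags agree.
call MPI_SENDRECV(foo, 1,&
     & MPI_INTEGER, mpi_sendto, i,&
     & bar, 1,&
     & MPI_INTEGER, mpi_recvfrom, i,&
     & MPI_COMM_WORLD, istat, ierr)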

Fortran MPI doesn't run on all of the given number of Processors

I'm currently running a program in which a model grid must be processed. When I want to run the program using e.g. 10 processes as workers (mpirun -np 11 -machinefile host civil_mpi.exe), only 3 processes run the program and the rest stop at the beginning of the program without any error!
If I decrease the size of the model grid, everything works correctly. The total RAM of the machine is over 30 GB, and the memory needed by each process (based on the model grid size) is less than 1 GB, so theoretically there should be no problem with the RAM. Could anyone help me with this?
The OS is openSUSE Linux, and I'm running MPI on a machine with 16 dual-core CPUs. The code is:
call MPI_INIT(ierror)
call mpi_comm_rank(MPI_COMM_WORLD, procid, ierror)
call mpi_comm_size(MPI_COMM_WORLD, nproc, ierror)
nworker = nproc - 1
call mpi_get_processor_name (procname, len, ierror)
n_slice = 280
ny0(1) = 1
ny(1) = 2
do i = 2,n_slice
ny0(i) = ny0(i-1) + 2
ny(i) = ny(i-1) + 2
end do
nx = 461
nx0 = 1
nz = 421
nz0 = 1
nwork = 1
do i = 1,280
if(nworker*nwork .lt. n_slice) then
nwork = nwork + 1
end if
end do
if (procid .eq. masterid) then
worker_job = 1
do q = 1,nworker
iwork = q
call mpi_send(worker_job, 1, MPI_INTEGER, iwork, tag,
$ MPI_COMM_WORLD,ierror)
call mpi_send(nx0, 1, MPI_INTEGER, iwork, tag,
$ MPI_COMM_WORLD,ierror)
call mpi_send(ny0, 280, MPI_INTEGER, iwork, tag,
$ MPI_COMM_WORLD, ierror)
call mpi_send(nz0, 1, MPI_INTEGER, iwork, tag,
$ MPI_COMM_WORLD,ierror)
call mpi_send(nx, 1, MPI_INTEGER, iwork, tag,
$ MPI_COMM_WORLD, ierror)
call mpi_send(ny, 280, MPI_INTEGER, iwork, tag,
$ MPI_COMM_WORLD,ierror)
call mpi_send(nz, 1, MPI_INTEGER, iwork, tag,
$ MPI_COMM_WORLD, ierror)
worker_job = worker_job + nwork
end do
end if
c ------------------ worker task -----------
if (procid .gt. masterid) then
c write(*,*)'processor',procid,'is working....'
call mpi_recv(worker_job, 1, MPI_INTEGER, masterid, tag,
$ MPI_COMM_WORLD, status, ierror)
call mpi_recv(nx0, 1, MPI_INTEGER, masterid, tag,
$ MPI_COMM_WORLD, status, ierror)
call mpi_recv(ny0, 280, MPI_INTEGER, masterid, tag,
$ MPI_COMM_WORLD, status, ierror)
call mpi_recv(nz0, 1, MPI_INTEGER, masterid, tag,
$ MPI_COMM_WORLD, status, ierror)
call mpi_recv(nx, 1, MPI_INTEGER, masterid, tag,
$ MPI_COMM_WORLD, status, ierror)
call mpi_recv(ny, 280, MPI_INTEGER, masterid, tag,
$ MPI_COMM_WORLD, status, ierror)
call mpi_recv(nz, 1, MPI_INTEGER, masterid, tag,
$ MPI_COMM_WORLD, status, ierror)
do j = worker_job, worker_job + nwork - 1
if (j .le. 280) then
write(*,*) '****************processor',procid,'is working'
call rawmig(j,nx0,ny0(j),nz0,nx,ny(j),nz)
end if
end do
end if
call mpi_finalize(ierror)
end
Problem solved! Thank you all for the comments. I finally realized that one of the matrices in the main program had to be synchronized with the new dimensions coming from the processes! Gilles Gouaillardet, I was trying to make a shortened and readable version of the program to post after your suggestion, and while doing that I saw that this matrix is used to build an output with iy=ny0,ny (the varying dimension), which should have been iy=1,2. But first of all, the matrix dimensions in the declarations had to be corrected; because they were declared using the variables coming directly from each process, some of the processes stopped without any error message!
