MPI_Put does not work when use MPI_Win_create_dynamic, why? - mpic++

I use MPI_Put together with MPI_Win_create_dynamic, but it does not work, just stuck before MPI_Win_fence, could not go through, I do not know why?
But when I add MPI_Win_flush, I just got the following errors:
[susans-MacBook-Pro:05235] *** An error occurred in MPI_Win_flush
[susans-MacBook-Pro:05235] *** reported by process [3117416449,1]
[susans-MacBook-Pro:05235] *** on win pt2pt window 3
[susans-MacBook-Pro:05235] *** MPI_ERR_RMA_SYNC: error executing rma sync
[susans-MacBook-Pro:05235] *** MPI_ERRORS_ARE_FATAL (processes in this win will now abort,
[susans-MacBook-Pro:05235] *** and potentially your MPI job)
Is there anything wrong with MPI_Put call?
The source coe is as follows:
```mpi-c++
#include <stdio.h>
#include <iostream>
#include "mpi.h"
using namespace std;
#define NROWS 10
#define NCOLS 10
int main(int argc, char *argv[])
{
int rank, nprocs, A[NROWS][NCOLS], i, j;
MPI_Win win;
MPI_Datatype column, xpose;
int errs = 0;
MPI_Init(&argc,&argv);
MPI_Comm_size(MPI_COMM_WORLD,&nprocs);
MPI_Comm_rank(MPI_COMM_WORLD,&rank);
MPI_Win_create_dynamic(MPI_INFO_NULL, MPI_COMM_WORLD, &win);
if(rank==0)
{ /* rank = 0*/
for (i=0; i<NROWS; i++)
for (j=0; j<NCOLS; j++)
A[i][j] = i*NCOLS + j;
MPI_Win_attach(win, A, NROWS*NCOLS*sizeof(int));
}
MPI_Win_fence(0, win);
if (rank > 0)
{
for (i=0; i<NROWS; i++)
for (j=0; j<NCOLS; j++)
A[i][j] = -1;
int target=0,disp=0;
MPI_Get(A, NROWS*NCOLS, MPI_INT, target, disp, NROWS*NCOLS, MPI_INT, win)!=MPI_SUCCESS)
MPI_Win_flush(target,win);
MPI_Win_fence(0, win);
}
else if(rank==0)
{ /* rank = 0 */
MPI_Win_fence(0, win);
MPI_Win_detach(win,A);
}
MPI_Win_free(&win);
MPI_Finalize();
return errs;
}
```

It seems that if I use MPI_Get_address to get the disp of the shared memory, then bcast it ,then it works well.
MPI_Get_address(buf_shared, &disp);
MPI_Bcast(&disp, 1, MPI_AINT, 0, comm);

Related

MPI - scattering filepaths to processes

I have 4 filepaths in the global_filetable and I am trying to scatter 2 pilepaths to each process.
The process 0 have proper 2 paths, but there is something strange in the process 1 (null)...
EDIT:
Here's the full code:
#include <stdio.h>
#include <limits.h> // PATH_MAX
#include <mpi.h>
int main(int argc, char *argv[])
{
char** global_filetable = (char**)malloc(4 * PATH_MAX * sizeof(char));
for(int i = 0; i < 4; ++i) {
global_filetable[i] = (char*)malloc(PATH_MAX *sizeof(char));
strncpy (filetable[i], "/path/", PATH_MAX);
}
/*for(int i = 0; i < 4; ++i) {
printf("%s\n", global_filetable[i]);
}*/
int rank, size;
MPI_Init(&argc, &argv);
MPI_Comm_rank(MPI_COMM_WORLD, &rank);
MPI_Comm_size(MPI_COMM_WORLD, &size);
char** local_filetable = (char**)malloc(2 * PATH_MAX * sizeof(char));
MPI_Scatter(global_filetable, 2*PATH_MAX, MPI_CHAR, local_filetable, 2*PATH_MAX , MPI_CHAR, 0, MPI_COMM_WORLD);
{
/* now all processors print their local data: */
for (int p = 0; p < size; ++p) {
if (rank == p) {
printf("Local process on rank %d is:\n", rank);
for (int i = 0; i < 2; i++) {
printf("path: %s\n", local_filetable[i]);
}
}
MPI_Barrier(MPI_COMM_WORLD);
}
}
MPI_Finalize();
return 0;
}
Output:
Local process on rank 0 is:
path: /path/
path: /path/
Local process on rank 1 is:
path: (null)
path: (null)
Do you have any idea why I am having those nulls?
First, your allocation is inconsistent:
char** local_filetable = (char**)malloc(2 * PATH_MAX * sizeof(char));
The type char** indicates an array of char*, but you allocate a contiguous memory block, which would indicate a char*.
The easiest way would be to use the contiguous memory as char* for both global and local filetables. Depending on what get_filetable() actually does, you may have to convert. You can then index it like this:
char* entry = &filetable[i * PATH_MAX]
You can then simply scatter like this:
MPI_Scatter(global_filetable, 2 * PATH_MAX, MPI_CHAR,
local_filetable, 2 * PATH_MAX, MPI_CHAR, 0, MPI_COMM_WORLD);
Note that there is no more displacement, every rank just gets an equal sized chunk of the contiguous memory.
The next step would be to define a C and MPI struct encapsulating PATH_MAX characters so you can get rid of the constant usage of PATH_MAX and crude indexing.
I think this is much nicer (less complex, less memory management) than using actual char**. You would only need that if memory waste or redundant data transfer becomes an issue.
P.S. Make sure to never put in more than PATH_MAX - 1 characters in an filetable entry to keep space for the tailing \0.
Okay, I'm stupid.
char global_filetable[NUMBER_OF_STRINGS][PATH_MAX];
for(int i = 0; i < 4; ++i) {
strcpy (filetable[i], "/path/");
}
char local_filetable[2][PATH_MAX];
Now it works!

Error when using MPI_SEND and MPI_RECV on Windows

I am following a source code on my documents, but I encounter an error when I try to use MPI_Send() and MPI_Recv() from Open MPI library.
I have googled and read some threads in this site but I can not find the solution to resolve my error.
This is my error:
mca_oob_tcp_msg_recv: readv faled : Unknown error (108)
Here is details image:
And this is the code that I'm following:
#include <stdio.h>
#include <string.h>
#include <conio.h>
#include <mpi.h>
int main(int argc, char **argv) {
int rank, size, mesg, tag = 123;
MPI_Status status;
MPI_Init(&argv, &argv);
MPI_Comm_size(MPI_COMM_WORLD, &size);
MPI_Comm_rank(MPI_COMM_WORLD, &rank);
if (size < 2) {
printf("Need at least 2 processes!\n");
} else if (rank == 0) {
mesg = 11;
MPI_Send(&mesg,1,MPI_INT,1,tag,MPI_COMM_WORLD);
MPI_Recv(&mesg,1,MPI_INT,1,tag,MPI_COMM_WORLD,&status);
printf("Rank 0 received %d from rank 1\n",mesg);
} else if (rank == 1) {
MPI_Recv(&mesg,1,MPI_INT,0,tag,MPI_COMM_WORLD,&status);
printf("Rank 1 received %d from rank 0/n",mesg);
mesg = 42;
MPI_Send(&mesg,1,MPI_INT,0,tag,MPI_COMM_WORLD);
}
MPI_Finalize();
return 0;
}
I commented all of MPI_Send(), and MPI_Recv(), and my program worked. In other hand, I commented either MPI_Send() or MPI_Recv(), and I still got that error. So I think the problem are MPI_Send() and MPI_Recv() functions.
P.S.: I'm using Open MPI v1.6 on Windows 8.1 OS.
You pass in the wrong arguments to MPI_Init (two times argv, instead of argc and argv once each).
The sends and receives actually look fine, I think. But there is also one typo in one of your prints with a /n instead of \n.
Here is what works for me (on MacOSX, though):
int main(int argc, char **argv) {
int rank, size, mesg, tag = 123;
MPI_Status status;
MPI_Init(&argc, &argv);
MPI_Comm_size(MPI_COMM_WORLD, &size);
MPI_Comm_rank(MPI_COMM_WORLD, &rank);
if (size < 2) {
printf("Need at least 2 processes!\n");
} else if (rank == 0) {
mesg = 11;
MPI_Send(&mesg,1,MPI_INT,1,tag,MPI_COMM_WORLD);
MPI_Recv(&mesg,1,MPI_INT,1,tag,MPI_COMM_WORLD,&status);
printf("Rank 0 received %d from rank 1\n",mesg);
} else if (rank == 1) {
MPI_Recv(&mesg,1,MPI_INT,0,tag,MPI_COMM_WORLD,&status);
printf("Rank 1 received %d from rank 0\n",mesg);
mesg = 42;
MPI_Send(&mesg,1,MPI_INT,0,tag,MPI_COMM_WORLD);
}
MPI_Finalize();
return 0;
}
If this does not work, I'd guess your OS does not let the processes communicate with each other via the method chosen by OpenMPI.
Set MPI_STATUS_IGNORED instead of &status in MPI_Recv in both places.

_CrtlsValidHeapPointer(PuserData), Debug Assertion Failed Visual C++ (MPI)

I am using MPI_Reduce and MPI_Scatter function to scatter an array of integers in "N" processors and printing the partial and accumulate sum of the array. I am using Microsoft MPI (MSMPI) on visual studio 2010. but each time at execution it gives an exception " _CrtlsValidHeapPointer(PuserData)" along the title "Debug Assertion Failed" The code is as under
enter code here
#include <mpi.h>
#include<iostream>
using namespace std;
int main(int argc, char *argv[]) {
int size;
int rank;
int partialsum=0;
int root =0;
int accum=0;
MPI_Init(&argc, &argv);
MPI_Comm_size(MPI_COMM_WORLD, &size);
MPI_Comm_rank(MPI_COMM_WORLD, &rank);
int *globaldata = NULL;
int *localdata = new int(4);
if (rank == root) {
globaldata = new int(size*4);
for (int i=0; i<(size*4); i++)
globaldata[i] = 2*i+1;
cout<<"Processor"<<rank<<" has global data: ";
for (int i=0; i<(size*4); i++)
cout<<globaldata[i]<<" ";
cout<<"\n";
}
MPI_Scatter(globaldata, 4, MPI_INT, localdata, 4, MPI_INT, root, MPI_COMM_WORLD);
cout<<"Processor "<<rank<<"has local data";
for(int i=0; i<4;i++)
cout<<" "<<localdata[i];
cout<<endl;
for(int k=0;k<4;k++)
partialsum += localdata[k];
cout<<"Processor "<<rank<<" Partial Sum = "<<partialsum<<"\n";
MPI_Reduce(&partialsum,&accum,1,MPI_INT,MPI_SUM, root,MPI_COMM_WORLD);
if (rank == 0) {
cout<<"Processor "<<rank<<" Accumulated Sum = "<<accum;
}
MPI_Finalize();
return 0;
}
The error is very simple and lies here:
globaldata = new int(size*4);
The syntax to allocate dynamic arrays with the new operator is new type[size]:
globaldata = new int[size*4];
In your case a space for a single int is allocated and set to size*4 instead and the initialisation code that immediately follows the allocation of memory at the root overwrites past the end of the allocated memory, thus destroying the heap structure.

Single-Sided communications with MPI-2

Consider the following fragment of OpenMP code which transfers private data between two threads using an intermediate shared variable
#pragma omp parallel shared(x) private(a,b)
{
...
a = somefunction(b);
if (omp_get_thread_num() == 0) {
x = a;
}
}
#pragma omp parallel shared(x) private(a,b)
{
if (omp_get_thread_num() == 1) {
a = x;
}
b = anotherfunction(a);
...
}
I would (in pseudocode ) need to transfer of private data from one process to another using a single-sided message-passing library.
Any ideas?
This is possible, but there's a lot more "scaffolding" involved -- after all, you are communicating data between potentially completely different computers.
The coordination for this sort of thing is done between windows of data which are accessible from other processors, and with lock/unlock operations which coordinate the access of this data. The locks aren't really locks in the sense of being mutexes, but they are more like synchronization points coordinating data access to the window.
I don't have time right now to explain this in the detail I'd like, but below is an example of using MPI2 to do something like shared memory flagging in a system that doesn't have shared memory:
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include "mpi.h"
int main(int argc, char** argv)
{
int rank, size, *a, geta;
int x;
int ierr;
MPI_Win win;
const int RCVR=0;
const int SENDER=1;
ierr = MPI_Init(&argc, &argv);
ierr |= MPI_Comm_rank(MPI_COMM_WORLD, &rank);
ierr |= MPI_Comm_size(MPI_COMM_WORLD, &size);
if (ierr) {
fprintf(stderr,"Error initializing MPI library; failing.\n");
exit(-1);
}
if (rank == RCVR) {
MPI_Alloc_mem(sizeof(int), MPI_INFO_NULL, &a);
*a = 0;
} else {
a = NULL;
}
MPI_Win_create(a, 1, sizeof(int), MPI_INFO_NULL, MPI_COMM_WORLD, &win);
if (rank == SENDER) {
/* Lock recievers window */
MPI_Win_lock(MPI_LOCK_EXCLUSIVE, RCVR, 0, win);
x = 5;
/* put 1 int (from &x) to 1 int rank RCVR, at address 0 in window "win"*/
MPI_Put(&x, 1, MPI_INT, RCVR, 0, 1, MPI_INT, win);
/* Unlock */
MPI_Win_unlock(0, win);
printf("%d: My job here is done.\n", rank);
}
if (rank == RCVR) {
for (;;) {
MPI_Win_lock(MPI_LOCK_EXCLUSIVE, RCVR, 0, win);
MPI_Get(&geta, 1, MPI_INT, RCVR, 0, 1, MPI_INT, win);
MPI_Win_unlock(0, win);
if (geta == 0) {
printf("%d: a still zero; sleeping.\n",rank);
sleep(2);
} else
break;
}
printf("%d: a now %d!\n",rank,geta);
printf("a = %d\n", *a);
MPI_Win_free(&win);
if (rank == RCVR) MPI_Free_mem(a);
MPI_Finalize();
return 0;
}

MPI matrix multification compile err: undeclared with code

I coded a mpi matrix multification program, which use scanf("%d", &size), designate matrix size, then I defined int matrix[size*size], but when I complied it, it reported that matrix is undeclared. Please tell me why, or what my problem is!
According Ed's suggestion, I changed the matrix definition to if(myid == 0) block, but got the same err! Now I post my code, please help me find out where I made mistakes! thank you!
int size;
int main(int argc, char* argv[]) {
int myid, numprocs;
int *p;
MPI_Status status;
int i,j,k;
MPI_Init(&argc, &argv);
MPI_Comm_rank(MPI_COMM_WORLD,&myid);
MPI_Comm_size(MPI_COMM_WORLD, &numprocs);
if(myid == 0)
{
scanf("%d", &size);
int matrix1[size*size];
int matrix2[size*size];
int matrix3[size*size];
int section = size/numprocs;
int tail = size % numprocs;
srand((unsigned)time(NULL));
for( i=0; i<size; i++)
for( j=0; j<size; j++)
{
matrix1[i*size+j]=rand()%9;
matrix3[i*size+j]= 0;
matrix2[i*size+j]=rand()%9;
}
printf("Matrix1 is: \n");
for( i=0; i<size; i++)
{
for( j=0; j<size; j++)
{
printf("%3d", matrix1[i*size+j]);
}
printf("\n");
}
printf("\n");
printf("Matrix2 is: \n");
Reformatted code would be nice...
One problem is that you haven't declared the size variable. Another problem is that the [size] notation for declaring arrays is only good for sizes that are known at compile time. You want to use malloc() instead.
You don't actually need to define a MAX_SIZE if you use dynamic memory allocation.
#include <stdio.h>
#include <stdlib.h>
...
scanf("%d", &size);
int *matrix1 = (int *) malloc(size*size*sizeof(int));
int *matrix2 = (int *) malloc(size*size*sizeof(int));
int *matrix3 = (int *) malloc(size*size*sizeof(int));
...

Resources