Should MPI_Irecv/MPI_Isend have the same `count`?

Should a pair of MPI_Irecv/MPI_Isend calls get the same count?
int MPI_Irecv(void *buf, int count, MPI_Datatype datatype, int source,
              int tag, MPI_Comm comm, MPI_Request *request)
...
count
    number of elements in receive buffer (integer)
The documentation seems to suggest that it does not have to be, but I am confused by the wording, which is formulated a bit differently than for MPI_Recv. I am attaching an example that works as I expect when I pass different counts.
isend.c
#include <stdio.h>
#include <mpi.h>

#define send_cnt 1
#define recv_cnt 10

#define SEND 0 /* who sends and who receives? */
#define RECV 1
#define TAG  0
#define COMM MPI_COMM_WORLD

MPI_Status status;
MPI_Request request;

void send() {
    int dest = RECV;
    int buf[] = {42};
    MPI_Isend(buf, send_cnt, MPI_INT, dest, TAG, COMM, &request);
    MPI_Wait(&request, &status);
}

void recv() {
    int dest = SEND;
    int buf[123];
    MPI_Irecv(buf, recv_cnt, MPI_INT, dest, TAG, COMM, &request);
    MPI_Wait(&request, &status);
    printf("recv: %d\n", buf[0]);
}

int main(int argc, char *argv[]) {
    int rank;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(COMM, &rank);
    if (rank == SEND) send();
    else recv();
    MPI_Finalize();
    return 0;
}

Both MPI_Recv and MPI_Irecv take as an argument the maximum number of data elements that are allowed to be written into the buffer. This number does not necessarily have to equal the count passed to MPI_Send - it can be larger or smaller. When there is not enough space in the receive buffer to accommodate the message, MPI will signal a truncation error. When there is more space than the message needs, only part of the buffer will be overwritten. In the latter case one can use MPI_Get_count on the MPI status object returned by MPI_Recv / MPI_Test* / MPI_Wait* to find out how many elements actually arrived.
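For instance, a minimal sketch of that check (source rank 0 and tag 0 are placeholders):

    MPI_Status status;
    int buf[123];      /* room for up to 123 ints */
    int received;

    MPI_Recv(buf, 123, MPI_INT, 0 /*src*/, 0 /*tag*/, MPI_COMM_WORLD, &status);
    MPI_Get_count(&status, MPI_INT, &received);  /* how many MPI_INTs actually arrived */
    printf("received %d ints\n", received);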
There is a legitimate case of the count passed to MPI_(I)Recv being smaller than the count passed to MPI_Send - different datatypes. For example, one can send 10 MPI_INTs and receive a single element of a contiguous datatype that consists of 10 MPI_INTs. In your case both the send and the receive operations use the same datatype, therefore the receive count must be at least as big as the send count.
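A minimal sketch of that datatype case (the handle ten_ints and the data array are made up for illustration; rank is as in the example above): the sender passes a count of 10 with MPI_INT, while the receiver passes a count of 1 with a contiguous type covering 10 ints.

    int data[10];
    MPI_Datatype ten_ints;
    MPI_Type_contiguous(10, MPI_INT, &ten_ints);
    MPI_Type_commit(&ten_ints);

    if (rank == 0)
        MPI_Send(data, 10, MPI_INT, 1, 0, MPI_COMM_WORLD);   /* 10 elements of MPI_INT */
    else if (rank == 1)
        MPI_Recv(data, 1, ten_ints, 0, 0, MPI_COMM_WORLD,
                 MPI_STATUS_IGNORE);                          /* 1 element of the contiguous type */

    MPI_Type_free(&ten_ints);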
By the way, your code is missing a call to wait/test the request created by MPI_Isend, which is erroneous.

Related

Packing Arrays using MPI_Pack

I am trying to pack an array and send it from one process to another. I am doing the operation on only 2 processes. I have written the following code.
#include <string.h>
#include <stdio.h>
#include <stdlib.h>
#include "mpi.h"

int main(int argc, char *argv[])
{
    MPI_Init(&argc, &argv);

    int myrank, size; // size will take care of number of processes
    MPI_Comm_rank(MPI_COMM_WORLD, &myrank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    // declaring the matrix
    double mat[4] = {1, 2, 3, 4};
    int r = 4;

    // Now we will send from one matrix to another using MPI_Pack
    // For that we will need buffers which will have same number of rows as number of columns
    double snd_buf[r];
    double recv_buf[r];
    double buf[r];

    // total size of the data that is being sent
    int outs = r * 8;
    int position = 0;
    MPI_Status status[r];

    // packing and sending the data
    if (myrank == 0)
    {
        for (int i = 0; i < r; i++)
        {
            MPI_Pack(&mat[i], 1, MPI_DOUBLE, snd_buf, outs, &position, MPI_COMM_WORLD);
        }
        MPI_Send(&snd_buf, r, MPI_PACKED, 1 /*dest*/, 100 /*tag*/, MPI_COMM_WORLD);
    }

    // receiving the data
    if (myrank == 1)
    {
        MPI_Recv(recv_buf, r, MPI_PACKED, 0 /*src*/, 100 /*tag*/, MPI_COMM_WORLD, &status[0]);
        position = 0;
        for (int i = 0; i < r; i++)
        {
            MPI_Unpack(recv_buf, outs, &position, &buf[i], 1, MPI_DOUBLE, MPI_COMM_WORLD);
        }
    }

    // checking whether the packing in snd_buf is taking place correctly or not
    if (myrank == 1)
    {
        for (int i = 0; i < r; i++)
        {
            printf("%lf ", buf[i]);
        }
        printf("\n");
    }

    MPI_Finalize();
    return 0;
}
I am expecting the output 1 2 3 4, but I am only getting 0 0 0 0.
I suspected a problem with snd_buf, but it seems to be fine: it contains the elements 1 2 3 4 correctly.
I have also tried to send and receive like this:
// packing and sending the data
if (myrank == 0)
{
    MPI_Pack(&mat[0], 4, MPI_DOUBLE, snd_buf, outs, &position, MPI_COMM_WORLD);
    MPI_Send(snd_buf, r, MPI_PACKED, 1 /*dest*/, 100 /*tag*/, MPI_COMM_WORLD);
}

// receiving the data
if (myrank == 1)
{
    MPI_Recv(recv_buf, r, MPI_PACKED, 0 /*src*/, 100 /*tag*/, MPI_COMM_WORLD, &status[0]);
    position = 0;
    MPI_Unpack(recv_buf, outs, &position, &buf[0], 4, MPI_DOUBLE, MPI_COMM_WORLD);
}
Still, this was of no help and the output was only 0s.
I cannot figure out why I am facing this error. Any help will be appreciated. Thank you.
Answering my own question. The mistake I made was pointed out by Gilles in the comments.
This is the solution to the problem I faced: send/receive outs elements of type MPI_PACKED (instead of r).
PS: consider declaring snd_buf and recv_buf as char[] in order to avoid this kind of confusion. char[] won't solve the issue by itself, but it makes the code more readable (and makes it more obvious that sending/receiving r MPI_PACKED is not the right thing to do). A sketch of the corrected pattern is below.
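A minimal sketch of that corrected pattern (reusing mat, buf and myrank from the code above), with char packing buffers; position, the number of bytes actually packed, is used as the send count and here equals outs:

    char snd_buf[32], recv_buf[32];  /* 4 doubles * 8 bytes; MPI_Pack_size gives the exact bound in general */
    int position = 0;

    if (myrank == 0) {
        MPI_Pack(mat, 4, MPI_DOUBLE, snd_buf, sizeof(snd_buf), &position, MPI_COMM_WORLD);
        /* send the packed bytes as MPI_PACKED, counting bytes, not doubles */
        MPI_Send(snd_buf, position, MPI_PACKED, 1 /*dest*/, 100 /*tag*/, MPI_COMM_WORLD);
    } else if (myrank == 1) {
        MPI_Recv(recv_buf, sizeof(recv_buf), MPI_PACKED, 0 /*src*/, 100 /*tag*/,
                 MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        position = 0;
        MPI_Unpack(recv_buf, sizeof(recv_buf), &position, buf, 4, MPI_DOUBLE, MPI_COMM_WORLD);
    }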

What's the meaning of the return value of the write_packet()/seek() callback functions in the AVIOContext struct?

I'm writing a muxer DirectShow filter using libav. I need to redirect the muxer's output to the filter's output pin, so I use avio_alloc_context() to create an AVIOContext with my write_packet and seek callback functions. These two functions are declared as follows:
int (*write_packet)(void *opaque, uint8_t *buf, int buf_size)
int64_t (*seek)(void *opaque, int64_t offset, int whence)
I can understand the meaning of these functions' input parameters, but what do their return values mean? Do they mean the number of bytes actually written?
int (*write_packet)(void *opaque, uint8_t *buf, int buf_size)
    Returns the number of bytes written. Negative values indicate an error.
int64_t (*seek)(void *opaque, int64_t offset, int whence)
    Returns the position, in bytes, reached by the seek call, measured from the start of the output file. Negative values indicate an error.
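For illustration, a minimal sketch of two callbacks that follow this contract, assuming the opaque pointer is simply a FILE* (in your case it would be whatever object writes to the output pin):

    #include <stdio.h>
    #include <stdint.h>

    /* Return the number of bytes written; a negative value signals an error. */
    static int write_packet(void *opaque, uint8_t *buf, int buf_size)
    {
        FILE *f = (FILE *)opaque;
        if (fwrite(buf, 1, (size_t)buf_size, f) != (size_t)buf_size)
            return -1;
        return buf_size;
    }

    /* Return the new absolute position in bytes; a negative value signals an error.
       A real callback may also need to handle the AVSEEK_SIZE flag in whence. */
    static int64_t seek(void *opaque, int64_t offset, int whence)
    {
        FILE *f = (FILE *)opaque;
        if (fseek(f, (long)offset, whence) != 0)   /* use a 64-bit-capable seek in production code */
            return -1;
        return (int64_t)ftell(f);
    }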

MPI Inconsistent Receiver

Could anyone explain what is going on with this point-to-point communication?
I expected that process #1 would always receive from #0, but most of the time the following happened.
Rank[0] Sent
Rank[132802] Got 0.456700
[132802]Source:0 Tag:999999 Error:0
Note: sometimes it works properly though. See below.
Rank[0] Sent
Rank[1] Got 0.456700
[1]Source:0 Tag:999999 Error:0
Compile with mpicc MPITu.c -o out -Wall
Execute with mpiexec -n 5 out 0 1 // I assigned sender and receiver ranks here.
#include <stdio.h>
#include <stdlib.h>
#include "mpi.h"

int main(int argc, char *argv[])
{
    int rank, nprocs;
    double data, bag;
    int sender = atoi(argv[1]);
    int target = atoi(argv[2]);
    int Max_Count = 100;

    MPI_Init(&argc, &argv);
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Status status;

    data = .4567;
    int count = 10;

    if (rank == sender) {
        MPI_Send(&data, count, MPI_DOUBLE, target, 999999, MPI_COMM_WORLD);
        printf("Rank[%d] Sent\n", rank);
    }
    else if (rank == target) {
        MPI_Recv(&bag, Max_Count, MPI_DOUBLE, sender, MPI_ANY_TAG, MPI_COMM_WORLD, &status);
        printf("Rank[%d] Got %f\n", rank, bag);
        printf("[%d]Source:%d\tTag:%d\tError:%d\n", rank, status.MPI_SOURCE, status.MPI_TAG, status.MPI_ERROR);
    }

    MPI_Finalize();
    return 0;
}

Inverting an image using MPI

I am trying to invert a PGM image using MPI. The grayscale (PGM) image should be loaded on the root processor and then sent to each of the s^2 processors. Each processor inverts a block of the given image, and the inverted blocks are gathered back on the root processor, which assembles the blocks into the final image and writes it to a PGM file. When I run the following code, the image is read, but there is no indication that the resulting image is ever written. Could you please let me know what could be wrong with it?
#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>
#include <time.h>
#include <string.h>
#include <math.h>
#include <memory.h>

#define max(x, y) ((x>y) ? (x):(y))
#define min(x, y) ((x<y) ? (x):(y))

int xdim;
int ydim;
int maxraw;
unsigned char *image;

void ReadPGM(FILE*);
void WritePGM(FILE*);

#define s 2

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);
    int p, rank;
    MPI_Comm_size(MPI_COMM_WORLD, &p);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    const int NPROWS = s;              /* number of rows in _decomposition_ */
    const int NPCOLS = s;              /* number of cols in _decomposition_ */
    const int BLOCKROWS = xdim/NPROWS; /* number of rows in _block_ */
    const int BLOCKCOLS = ydim/NPCOLS; /* number of cols in _block_ */
    int i, j;
    FILE *fp;

    float BLimage[BLOCKROWS*BLOCKCOLS];
    for (int ii=0; ii<BLOCKROWS*BLOCKCOLS; ii++)
        BLimage[ii] = 0;
    float BLfilteredMat[BLOCKROWS*BLOCKCOLS];
    for (int ii=0; ii<BLOCKROWS*BLOCKCOLS; ii++)
        BLfilteredMat[ii] = 0;

    if (rank == 0) {
        /* begin reading PGM.... */
        ReadPGM(fp);
    }

    MPI_Datatype blocktype;
    MPI_Datatype blocktype2;
    MPI_Type_vector(BLOCKROWS, BLOCKCOLS, ydim, MPI_FLOAT, &blocktype2);
    MPI_Type_create_resized(blocktype2, 0, sizeof(float), &blocktype);
    MPI_Type_commit(&blocktype);

    int disps[NPROWS*NPCOLS];
    int counts[NPROWS*NPCOLS];
    for (int ii=0; ii<NPROWS; ii++) {
        for (int jj=0; jj<NPCOLS; jj++) {
            disps[ii*NPCOLS+jj] = ii*ydim*BLOCKROWS + jj*BLOCKCOLS;
            counts[ii*NPCOLS+jj] = 1;
        }
    }

    MPI_Scatterv(image, counts, disps, blocktype, BLimage, BLOCKROWS*BLOCKCOLS, MPI_FLOAT, 0, MPI_COMM_WORLD);

    //************** Invert the block **************//
    for (int proc=0; proc<p; proc++) {
        if (proc == rank) {
            for (int j = 0; j < BLOCKCOLS; j++) {
                for (int i = 0; i < BLOCKROWS; i++) {
                    BLfilteredMat[j*BLOCKROWS+i] = 255 - image[j*BLOCKROWS+i];
                }
            }
        } // close if (proc == rank)
        MPI_Barrier(MPI_COMM_WORLD);
    } // close for (int proc=0; proc<p; proc++)

    MPI_Gatherv(BLfilteredMat, BLOCKROWS*BLOCKCOLS, MPI_FLOAT, image, counts, disps, blocktype, 0, MPI_COMM_WORLD);

    if (rank == 0) {
        /* Begin writing PGM.... */
        WritePGM(fp);
        free(image);
    }

    MPI_Finalize();
    return (1);
}
It is very likely that MPI is not the right tool for the job. The reason is that your job is inherently bandwidth limited.
Think of it this way: you have a coloring book with images, all of which you want to color in.
Method 1: you take your time and color them in one by one.
Method 2: you copy each page onto a new sheet of paper and mail it to a friend, who then colors it in for you. He mails it back to you, and in the end you glue all the pages you received from all of your friends together to make one colored-in book.
Note that method two involves copying the whole book, which is arguably about as much work as coloring in the whole book. So method two is less time-efficient even before considering the overhead of shoving the pages into envelopes, licking the stamps, going to the post office and waiting for the letters to be delivered.
If you look at your code, every transmitted byte is only touched once throughout the whole program, in this line:
BLfilteredMat[j*BLOCKROWS+i] = 255 - image[j*BLOCKROWS+i];
A single processor is much faster at subtracting two integers than it is at sending an integer over the wire, so one must advise against using MPI for this particular problem.
My suggestion: try to avoid unnecessary communication whenever possible. Do all processes have access to the filesystem on which the files are located? If so, you could try reading them directly from the filesystem.
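A minimal sketch of that direction, reusing rank, p, the globals and ReadPGM from the question (the input path argv[1] and the row partitioning are illustrative): every rank reads the image itself from the shared filesystem and inverts only its own slice of rows, so no pixel data travels over MPI.

    /* Each rank opens and parses the PGM itself, then inverts only its own rows. */
    FILE *fp = fopen(argv[1], "rb");               /* assumed: the input path is argv[1] */
    ReadPGM(fp);

    int rows_per_rank = xdim / p;
    int first = rank * rows_per_rank;
    int last  = (rank == p - 1) ? xdim : first + rows_per_rank;

    for (int row = first; row < last; row++)
        for (int col = 0; col < ydim; col++)
            image[row * ydim + col] = 255 - image[row * ydim + col];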

How to fix this MPI program

This program demonstrates an unsafe pattern: sometimes it executes fine, and other times it fails. The reason the program fails or hangs is buffer exhaustion on the receiving task side, a consequence of the way the MPI library implements an eager protocol for messages below a certain size. One possible solution is to include an MPI_Barrier call in both the send and receive loops.
How can this program be made correct?
#include "mpi.h"
#include <stdio.h>
#include <stdlib.h>
#define MSGSIZE 2000
int main (int argc, char *argv[])
{
int numtasks, rank, i, tag=111, dest=1, source=0, count=0;
char data[MSGSIZE];
double start, end, result;
MPI_Status status;
MPI_Init(&argc,&argv);
MPI_Comm_size(MPI_COMM_WORLD, &numtasks);
MPI_Comm_rank(MPI_COMM_WORLD, &rank);
if (rank == 0) {
printf ("mpi_bug5 has started...\n");
if (numtasks > 2)
printf("INFO: Number of tasks= %d. Only using 2 tasks.\n", numtasks);
}
/******************************* Send task **********************************/
if (rank == 0) {
/* Initialize send data */
for(i=0; i<MSGSIZE; i++)
data[i] = 'x';
start = MPI_Wtime();
while (1) {
MPI_Send(data, MSGSIZE, MPI_BYTE, dest, tag, MPI_COMM_WORLD);
count++;
if (count % 10 == 0) {
end = MPI_Wtime();
printf("Count= %d Time= %f sec.\n", count, end-start);
start = MPI_Wtime();
}
}
}
/****************************** Receive task ********************************/
if (rank == 1) {
while (1) {
MPI_Recv(data, MSGSIZE, MPI_BYTE, source, tag, MPI_COMM_WORLD, &status);
/* Do some work - at least more than the send task */
result = 0.0;
for (i=0; i < 1000000; i++)
result = result + (double)random();
}
}
MPI_Finalize();
}
Ways to improve this code so that the receiver doesn't end up with an unlimited number of unexpected messages include:
Synchronization - you mentioned MPI_Barrier, but even using MPI_Ssend instead of MPI_Send would work (see the sketch after this list).
Explicit buffering - using MPI_Bsend, with a buffer attached via MPI_Buffer_attach, to ensure adequate buffering exists on the sender side.
Posted receives - the receiving process posts MPI_Irecv calls before starting work, to ensure that messages are received into the buffers meant to hold the data rather than into system buffers.
In this pedagogical case, since the number of messages is unlimited, only the first (synchronization) would reliably work.
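A minimal sketch of that synchronization fix, reusing the variables from the program above: replacing MPI_Send with MPI_Ssend makes each iteration wait until the receiver has started receiving the message, so unmatched messages can no longer pile up in system buffers.

    /* Sender loop with a synchronous send: MPI_Ssend does not complete
       until the matching receive has started, so the sender can never
       run arbitrarily far ahead of the receiver. */
    while (1) {
        MPI_Ssend(data, MSGSIZE, MPI_BYTE, dest, tag, MPI_COMM_WORLD);
        count++;
        if (count % 10 == 0) {
            end = MPI_Wtime();
            printf("Count= %d Time= %f sec.\n", count, end - start);
            start = MPI_Wtime();
        }
    }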
