MPI: load balancing algorithm (Master-Slave model)

I'm using MPI to parallelize a loop over [0, max]. I want a master process (say, process 0) to first divide the loop into small sets of n tasks (n iterations each), and then hand out one set at a time to the x slave processes, giving a process a new set whenever it finishes its previous one. In other words, I would like to implement a load-balancing algorithm using MPI's blocking and/or non-blocking sends and receives, but I have no idea how to proceed. Also, is there a way to find the optimal size of one set of tasks (the parameter n) as a function of max and x?
Thanks a lot for your help.

I finally found the following code from here, which is basically a skeleton for dynamic load balancing based on the MPI master/slave model, exactly what I was looking for. I still can't see how to divide the initial work set optimally, though.
#include <mpi.h>
#define WORKTAG 1
#define DIETAG 2
int main(int argc, char *argv[])
{
int myrank;
MPI_Init(&argc, &argv); /* initialize MPI */
MPI_Comm_rank(
MPI_COMM_WORLD, /* always use this */
&myrank); /* process rank, 0 thru N-1 */
if (myrank == 0) {
master();
} else {
slave();
}
MPI_Finalize(); /* cleanup MPI */
}
master()
{
int ntasks, rank, work;
double result;
MPI_Status status;
MPI_Comm_size(
MPI_COMM_WORLD, /* always use this */
&ntasks); /* #processes in application */
/*
* Seed the slaves.
*/
for (rank = 1; rank < ntasks; ++rank) {
work = /* get_next_work_request */;
MPI_Send(&work, /* message buffer */
1, /* one data item */
MPI_INT, /* data item is an integer */
rank, /* destination process rank */
WORKTAG, /* user chosen message tag */
MPI_COMM_WORLD);/* always use this */
}
/*
* Receive a result from any slave and dispatch a new work
* request until work requests have been exhausted.
*/
work = /* get_next_work_request */;
while (/* valid new work request */) {
MPI_Recv(&result, /* message buffer */
1, /* one data item */
MPI_DOUBLE, /* of type double real */
MPI_ANY_SOURCE, /* receive from any sender */
MPI_ANY_TAG, /* any type of message */
MPI_COMM_WORLD, /* always use this */
&status); /* received message info */
MPI_Send(&work, 1, MPI_INT, status.MPI_SOURCE,
WORKTAG, MPI_COMM_WORLD);
work = /* get_next_work_request */;
}
/*
* Receive results for outstanding work requests.
*/
for (rank = 1; rank < ntasks; ++rank) {
MPI_Recv(&result, 1, MPI_DOUBLE, MPI_ANY_SOURCE,
MPI_ANY_TAG, MPI_COMM_WORLD, &status);
}
/*
* Tell all the slaves to exit.
*/
for (rank = 1; rank < ntasks; ++rank) {
MPI_Send(0, 0, MPI_INT, rank, DIETAG, MPI_COMM_WORLD);
}
}
slave()
{
double result;
int work;
MPI_Status status;
for (;;) {
MPI_Recv(&work, 1, MPI_INT, 0, MPI_ANY_TAG,
MPI_COMM_WORLD, &status);
/*
* Check the tag of the received message.
*/
if (status.MPI_TAG == DIETAG) {
return;
}
result = /* do the work */;
MPI_Send(&result, 1, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD);
}
}
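A minimal sketch of what the `/* get_next_work_request */` placeholder could look like for a loop over [0, max], assuming work is handed out as contiguous chunks of n iterations. The names `chunk_t`, `work_queue_t` and `next_chunk` are hypothetical and are not part of the skeleton or of MPI. The sketch does not answer the question of the optimal n: smaller chunks tend to balance the load better but cost more messages, so n is probably best tuned by measurement.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper type: a half-open range [begin, end) of loop iterations. */
typedef struct {
    int begin;
    int end;
} chunk_t;

/* Hypothetical bookkeeping kept by the master only. */
typedef struct {
    int next;   /* first iteration not yet handed out */
    int max;    /* last iteration of the loop, inclusive (assumed < INT_MAX) */
    int n;      /* iterations per chunk, must be > 0 */
} work_queue_t;

/* Writes the next chunk into *out and returns true, or returns false when
 * every iteration in [0, max] has already been handed out. */
bool next_chunk(work_queue_t *q, chunk_t *out)
{
    assert(q->n > 0);
    if (q->next > q->max) {
        return false;
    }
    /* The last chunk may be shorter than n. */
    int remaining = q->max - q->next + 1;
    int len = remaining < q->n ? remaining : q->n;
    out->begin = q->next;
    out->end = q->next + len;
    q->next = out->end;
    return true;
}
```

In the skeleton, the master could send `begin` and `end` as two MPI_INT values instead of the single `work` integer, and the `/* valid new work request */` condition would become the return value of `next_chunk`.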

Related

MPI hangs during execution

I'm trying to write a simple MPI program that finds all numbers less than 514 that are equal to a power of the sum of their digits (for example, 512 = (5+1+2)^3). The problem is with the main loop: it works fine for a few iterations (c=10), but when I increase the number of iterations (c=x), mpiexec.exe just hangs, seemingly in the middle of a printf call.
I'm pretty sure that deadlocks are to blame, but I couldn't find any.
The source code:
#include <stdlib.h>
#include <stdio.h>
#include <iostream>
#include "mpi.h"
int main(int argc, char* argv[])
{
//our number
int x=514;
//amount of iterations
int c = 10;
//tags for message identification
int tag = 42;
int tagnumber = 43;
int np, me, y1, y2;
MPI_Status status;
/* Initialize MPI */
MPI_Init(&argc, &argv);
MPI_Comm_size(MPI_COMM_WORLD, &np);
MPI_Comm_rank(MPI_COMM_WORLD, &me);
/* Check that we run on more than two processors */
if (np < 2)
{
printf("You have to use at least 2 processes to run this program\n");
MPI_Finalize();
exit(0);
}
//begin iterations
while(c>0)
{
//if main thread, then send messages to all created threads
if (me == 0)
{
printf("Amount of threads: %d\n", np);
int b = 1;
while(b<np)
{
int q = x-b;
//sends a number to a secondary thread
MPI_Send(&q, 1, MPI_INT, b, tagnumber, MPI_COMM_WORLD);
printf("Process %d sending to process %d, value: %d\n", me, b, q);
//get a number from secondary thread
MPI_Recv(&y2, 1, MPI_INT, b, tag, MPI_COMM_WORLD, &status);
printf ("Process %d received value %d\n", me, y2);
//compare it with the sent one
if (q==y2)
{
//if they're equal, then print the result
printf("\nValue found: %d\n", q);
}
b++;
}
x = x-b+1;
b = 1;
}
else
{
//if not a main thread, then process the message sent and send the result back.
MPI_Recv (&y1, 1, MPI_INT, 0, tagnumber, MPI_COMM_WORLD, &status);
int sum = 0;
int y2 = y1;
while (y1!=0)
{
//find the number's sum of digits
sum += y1%10;
y1 /= 10;
}
int sum2 = sum;
while(sum2<y2)
{
//calculate the exponentiation
sum2 = sum2*sum;
}
MPI_Send (&sum2, 1, MPI_INT, 0, tag, MPI_COMM_WORLD);
}
c--;
}
MPI_Finalize();
exit(0);
}
And I run the compiled exe-file as "mpiexec.exe -n 4 lab2.exe". I use HPC Pack 2008 SDK, if that's of any use to you guys.
Is there any way to fix it? Or maybe some way to debug that situation properly?
Thanks a lot in advance!
Not sure if you have already found where the problem is, but your infinite run happens in this loop:
while(sum2<y2)
{
//calculate the exponentiation
sum2 = sum2*sum;
}
You can confirm this by setting c to about 300 or above and then adding a printf call inside this while loop. I haven't completely pinpointed the error in your logic, but I have added three comments below at the places in your code that look strange to me:
while(c>0)
{
if (me == 0)
{
...
while(b<np)
{
int q = x-b; //<-- you subtract b from x here
...
b++;
}
x = x-b+1; //<-- you subtract b again. sure this is what you want?
b = 1; //<-- this is useless
}
Hope this helps.
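A likely cause of the endless loop: when the digit sum is 1 (for example for 100 or 10), `sum2 = sum2*sum` leaves sum2 at 1, so `while(sum2<y2)` never ends, the worker never replies, and the master blocks in MPI_Recv. The sketch below shows one way to guard against that case. The function name `is_power_of_digit_sum` is hypothetical, and the code assumes non-negative input.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper: returns true when y equals s^k for some k >= 1,
 * where s is the sum of the decimal digits of y. Assumes y >= 0. */
bool is_power_of_digit_sum(int y)
{
    assert(y >= 0);
    int sum = 0;
    for (int t = y; t != 0; t /= 10) {
        sum += t % 10;
    }
    /* With sum == 0 or sum == 1, repeated multiplication never grows,
     * so these cases are handled before entering the loop. */
    if (sum <= 1) {
        return y == sum;
    }
    long long power = sum;   /* wider type so the product cannot overflow */
    while (power < y) {
        power *= sum;
    }
    return power == y;
}
```

A worker could call this on the received number and send back the boolean result, instead of sending the computed power for the master to compare.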

MPI parallel program to find prime numbers. Please help me debug

I wrote the following program to find the prime numbers up to the #defined value N. It is a parallel program using MPI. Can anyone help me find the error in it? It compiles fine but crashes while executing.
#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>
#define N 65
int rank, size;
double start_time;
double end_time;
int y, x, i, port1, port2, port3;
int check =0; // prime number checker, if a number is prime it always remains 0 through out calculation. for a number which is not prime it is turns to value 1 at some point
int signal =0; // has no important use. just to check if slave process work is done.
MPI_Status status;
MPI_Request request;
int main(int argc, char *argv[]){
MPI_Init(&argc, &argv); //initialize MPI operations
MPI_Comm_rank(MPI_COMM_WORLD, &rank); //get the rank
MPI_Comm_size(MPI_COMM_WORLD, &size); //get number of processes
if(rank == 0){ // master process divides work and also does initial work itself
start_time = MPI_Wtime();
printf("2\n"); //print prime number 2 first because the algorithm for finding the prime number in this program is just for odd number
port1 = (N/(size-1)); // calculating the suitable amount of work per process
for(i=1;i<size-1;i++){ // master sending the portion of work to each slave
port2 = port1 * i; // lower bound of work for i th process
port3 = ((i+1)*port1)-1; // upper bound of work for i th process
MPI_Isend(&port2, 1, MPI_INT, i, 100, MPI_COMM_WORLD, &request);
MPI_Isend(&port3, 1, MPI_INT, i, 101, MPI_COMM_WORLD, &request);
}
port2 = (size-1)*port1; port3= N; // the last process takes the remaining work
MPI_Isend(&port2, 1, MPI_INT, (size-1), 100, MPI_COMM_WORLD, &request);
MPI_Isend(&port3, 1, MPI_INT, (size-1), 101, MPI_COMM_WORLD, &request);
for(x = 3; x < port1; x=x+2){ // master doing initial work by itself
check = 0;
for(y = 3; y <= x/2; y=y+2){
if(x%y == 0) {check =1; break;}
}
if(check==0) printf("%d\n", x);
}
}
if (rank > 0){ // slave working part
MPI_Recv(&port2,1,MPI_INT, 0, 100, MPI_COMM_WORLD, &status);
MPI_Recv(&port3,1,MPI_INT, 0, 101, MPI_COMM_WORLD, &status);
if (port2%2 == 0) port2++; // changing the even argument to odd to make the calculation fast because even number is never a prime except 2.
for(x=port2; x<=port3; x=x+2){
check = 0;
for(y = 3; y <= x/2; y=y+2){
if(x%y == 0) {check =1; break;}
}
if (check==0) printf("%d\n",x);
}
signal= rank;
MPI_Isend(&signal, 1, MPI_INT, 0, 103, MPI_COMM_WORLD, &request); // just informing master that the work is finished
}
if (rank == 0){ // master concluding the work and printing the time taken to do the work
for(i = 1; i < size; i++){
MPI_Recv(&signal,1,MPI_INT, i, 103, MPI_COMM_WORLD, &status); // master confirming that all slaves finished their work
}
end_time = MPI_Wtime();
printf("\nRunning Time = %f \n\n", end_time - start_time);
}
MPI_Finalize();
return 0;
}
I got the following error:
mpirun -np 2 ./a.exe
Exception: STATUS_ACCESS_VIOLATION at eip=0051401C
End of stack trace
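I found what was wrong with my program.
It was the use of signal as a variable name. signal is also the name of a standard C library function (declared in <signal.h>), so a global variable with that name can clash with it. Rename the variable everywhere it is used to any other valid name and the program works.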

A warning when debugging a parallel processing in MPI?

I have the code below:
#include <stdio.h>
#include "mpi.h"
#define NRA 512 /* number of rows in matrix A */
#define NCA 512 /* number of columns in matrix A */
#define NCB 512 /* number of columns in matrix B */
#define MASTER 0 /* taskid of first task */
#define FROM_MASTER 1 /* setting a message type */
#define FROM_WORKER 2 /* setting a message type */
MPI_Status status;
double a[NRA][NCA], /* matrix A to be multiplied */
b[NCA][NCB], /* matrix B to be multiplied */
c[NRA][NCB]; /* result matrix C */
main(int argc, char **argv)
{
int numtasks, /* number of tasks in partition */
taskid, /* a task identifier */
numworkers, /* number of worker tasks */
source, /* task id of message source */
dest, /* task id of message destination */
nbytes, /* number of bytes in message */
mtype, /* message type */
intsize, /* size of an integer in bytes */
dbsize, /* size of a double float in bytes */
rows, /* rows of matrix A sent to each worker */
averow, extra, offset, /* used to determine rows sent to each worker */
i, j, k, /* misc */
count;
double t1,t2;
intsize = sizeof(int);
dbsize = sizeof(double);
MPI_Init(&argc, &argv);
MPI_Comm_rank(MPI_COMM_WORLD, &taskid);
MPI_Comm_size(MPI_COMM_WORLD, &numtasks);
numworkers = numtasks-1;
//printf(" size of matrix A = %d by %d\n",NRA,NCA);
//printf(" size of matrix B = %d by %d\n",NRA,NCB);
/*---------------------------- master ----------------------------*/
if (taskid == MASTER) {
printf("Number of worker tasks = %d\n",numworkers);
for (i=0; i<NRA; i++)
for (j=0; j<NCA; j++)
a[i][j]= i+j;
for (i=0; i<NCA; i++)
for (j=0; j<NCB; j++)
b[i][j]= i*j;
t1 = MPI_Wtime();
/* send matrix data to the worker tasks */
averow = NRA/numworkers;
extra = NRA%numworkers;
offset = 0;
mtype = FROM_MASTER;
for (dest=1; dest<=numworkers; dest++) {
rows = (dest <= extra) ? averow+1 : averow;
//printf(" Sending %d rows to task %d\n",rows,dest);
MPI_Send(&offset, 1, MPI_INT, dest, mtype, MPI_COMM_WORLD);
MPI_Send(&rows, 1, MPI_INT, dest, mtype, MPI_COMM_WORLD);
count = rows*NCA;
MPI_Send(&a[offset][0], count, MPI_DOUBLE, dest, mtype, MPI_COMM_WORLD);
count = NCA*NCB;
MPI_Send(&b, count, MPI_DOUBLE, dest, mtype, MPI_COMM_WORLD);
offset = offset + rows;
}
/* wait for results from all worker tasks */
mtype = FROM_WORKER;
for (i=1; i<=numworkers; i++) {
source = i;
MPI_Recv(&offset, 1, MPI_INT, source, mtype, MPI_COMM_WORLD, &status);
MPI_Recv(&rows, 1, MPI_INT, source, mtype, MPI_COMM_WORLD, &status);
count = rows*NCB;
MPI_Recv(&c[offset][0], count, MPI_DOUBLE, source, mtype, MPI_COMM_WORLD,
&status);
}
#ifdef PRINT
printf("Here is the result matrix\n");
for (i=0; i<NRA; i++) {
printf("\n");
for (j=0; j<NCB; j++)
printf("%6.2f ", c[i][j]);
}
printf ("\n");
#endif
t2 = MPI_Wtime();
fprintf(stdout,"Time = %.6f\n\n",
t2-t1);
} /* end of master section */
/*---------------------------- worker (slave)----------------------------*/
if (taskid > MASTER) {
mtype = FROM_MASTER;
source = MASTER;
#ifdef PRINT
printf ("Master =%d, mtype=%d\n", source, mtype);
#endif
MPI_Recv(&offset, 1, MPI_INT, source, mtype, MPI_COMM_WORLD, &status);
#ifdef PRINT
printf ("offset =%d\n", offset);
#endif
MPI_Recv(&rows, 1, MPI_INT, source, mtype, MPI_COMM_WORLD, &status);
#ifdef PRINT
printf ("row =%d\n", rows);
#endif
count = rows*NCA;
MPI_Recv(&a, count, MPI_DOUBLE, source, mtype, MPI_COMM_WORLD, &status);
#ifdef PRINT
printf ("a[0][0] =%e\n", a[0][0]);
#endif
count = NCA*NCB;
MPI_Recv(&b, count, MPI_DOUBLE, source, mtype, MPI_COMM_WORLD, &status);
#ifdef PRINT
printf ("b=\n");
#endif
for (k=0; k<NCB; k++)
for (i=0; i<rows; i++) {
c[i][k] = 0.0;
for (j=0; j<NCA; j++)
c[i][k] = c[i][k] + a[i][j] * b[j][k];
}
//mtype = FROM_WORKER;
#ifdef PRINT
printf ("after computer\n");
#endif
//MPI_Send(&offset, 1, MPI_INT, MASTER, mtype, MPI_COMM_WORLD);
MPI_Send(&offset, 1, MPI_INT, MASTER, FROM_WORKER, MPI_COMM_WORLD);
//MPI_Send(&rows, 1, MPI_INT, MASTER, mtype, MPI_COMM_WORLD);
MPI_Send(&rows, 1, MPI_INT, MASTER, FROM_WORKER, MPI_COMM_WORLD);
//MPI_Send(&c, rows*NCB, MPI_DOUBLE, MASTER, mtype, MPI_COMM_WORLD);
MPI_Send(&c, rows*NCB, MPI_DOUBLE, MASTER, FROM_WORKER, MPI_COMM_WORLD);
#ifdef PRINT
printf ("after send\n");
#endif
} /* end of worker */
MPI_Finalize();
} /* end of main */
The code performs matrix multiplication using MPI. When I try to debug it using Visual Studio 2010 Express, it displays a warning.
Where is the problem in this code? Can anyone help me?
The error is at this line:
averow = NRA/numworkers;
numworkers is 0, presumably because you haven't configured Visual Studio to launch the MPI job with more than one process, so this line divides by zero. It's pretty fiddly to get it to do the right thing here, especially when debugging.
Make sure the MPI Cluster Debugger is installed correctly - this is the most likely culprit.
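Independently of the debugger setup, the program could check the worker count before dividing. The sketch below reproduces the averow/extra split from the question in a hypothetical helper, `rows_for_worker`, that reports failure instead of dividing by zero; the name and signature are illustrative assumptions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: computes how many rows worker `dest` (numbered from 1,
 * as in the program above) receives and at which row its block starts.
 * Returns false when there are no workers or dest is out of range. */
bool rows_for_worker(int nrows, int numworkers, int dest, int *offset, int *rows)
{
    assert(offset != NULL && rows != NULL);
    if (numworkers < 1 || dest < 1 || dest > numworkers) {
        return false;
    }
    int averow = nrows / numworkers;
    int extra = nrows % numworkers;
    /* The first `extra` workers take one additional row each. */
    *rows = (dest <= extra) ? averow + 1 : averow;
    /* Workers before `dest` hold averow rows each, plus one more row for
     * each of them that is among the first `extra` workers. */
    int before = dest - 1;
    *offset = before * averow + (before < extra ? before : extra);
    return true;
}
```

When the function returns false on the master, the program could print a message asking for at least two processes and shut MPI down cleanly rather than crash.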

how to fix this MPI code program

This program demonstrates an unsafe program, because sometimes it will execute fine, and other times it will fail. The reason why the program fails or hangs is buffer exhaustion on the receiving task side, a consequence of the way an MPI library has implemented an eager protocol for messages of a certain size. One possible solution is to include an MPI_Barrier call in both the send and receive loops.
How can this program be corrected?
#include "mpi.h"
#include <stdio.h>
#include <stdlib.h>
#define MSGSIZE 2000
int main (int argc, char *argv[])
{
int numtasks, rank, i, tag=111, dest=1, source=0, count=0;
char data[MSGSIZE];
double start, end, result;
MPI_Status status;
MPI_Init(&argc,&argv);
MPI_Comm_size(MPI_COMM_WORLD, &numtasks);
MPI_Comm_rank(MPI_COMM_WORLD, &rank);
if (rank == 0) {
printf ("mpi_bug5 has started...\n");
if (numtasks > 2)
printf("INFO: Number of tasks= %d. Only using 2 tasks.\n", numtasks);
}
/******************************* Send task **********************************/
if (rank == 0) {
/* Initialize send data */
for(i=0; i<MSGSIZE; i++)
data[i] = 'x';
start = MPI_Wtime();
while (1) {
MPI_Send(data, MSGSIZE, MPI_BYTE, dest, tag, MPI_COMM_WORLD);
count++;
if (count % 10 == 0) {
end = MPI_Wtime();
printf("Count= %d Time= %f sec.\n", count, end-start);
start = MPI_Wtime();
}
}
}
/****************************** Receive task ********************************/
if (rank == 1) {
while (1) {
MPI_Recv(data, MSGSIZE, MPI_BYTE, source, tag, MPI_COMM_WORLD, &status);
/* Do some work - at least more than the send task */
result = 0.0;
for (i=0; i < 1000000; i++)
result = result + (double)random();
}
}
MPI_Finalize();
}
Ways to improve this code so that the receiver doesn't end up with an unlimited number of unexpected messages include:
Synchronization - you mentioned MPI_Barrier, but even using MPI_Ssend instead of MPI_Send would work.
Explicit buffering - the use of MPI_Bsend, with a buffer attached through MPI_Buffer_attach, to ensure adequate buffering exists.
Posted receives - the receiving process posts MPI_Irecv calls before starting work to ensure that the messages are received into the buffers meant to hold the data, rather than system buffers.
In this pedagogical case, since the number of messages is unlimited, only the first (synchronization) would reliably work.

Single-Sided communications with MPI-2

Consider the following fragment of OpenMP code, which transfers private data between two threads using an intermediate shared variable:
#pragma omp parallel shared(x) private(a,b)
{
...
a = somefunction(b);
if (omp_get_thread_num() == 0) {
x = a;
}
}
#pragma omp parallel shared(x) private(a,b)
{
if (omp_get_thread_num() == 1) {
a = x;
}
b = anotherfunction(a);
...
}
I need (in pseudocode) to transfer private data from one process to another in the same way, using a single-sided message-passing library.
Any ideas?
This is possible, but there's a lot more "scaffolding" involved -- after all, you are communicating data between potentially completely different computers.
The coordination for this sort of thing is done between windows of data which are accessible from other processors, and with lock/unlock operations which coordinate the access of this data. The locks aren't really locks in the sense of being mutexes, but they are more like synchronization points coordinating data access to the window.
I don't have time right now to explain this in the detail I'd like, but below is an example of using MPI2 to do something like shared memory flagging in a system that doesn't have shared memory:
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include "mpi.h"
int main(int argc, char** argv)
{
int rank, size, *a, geta;
int x;
int ierr;
MPI_Win win;
const int RCVR=0;
const int SENDER=1;
ierr = MPI_Init(&argc, &argv);
ierr |= MPI_Comm_rank(MPI_COMM_WORLD, &rank);
ierr |= MPI_Comm_size(MPI_COMM_WORLD, &size);
if (ierr) {
fprintf(stderr,"Error initializing MPI library; failing.\n");
exit(-1);
}
if (rank == RCVR) {
MPI_Alloc_mem(sizeof(int), MPI_INFO_NULL, &a);
*a = 0;
} else {
a = NULL;
}
MPI_Win_create(a, 1, sizeof(int), MPI_INFO_NULL, MPI_COMM_WORLD, &win);
if (rank == SENDER) {
/* Lock receiver's window */
MPI_Win_lock(MPI_LOCK_EXCLUSIVE, RCVR, 0, win);
x = 5;
/* put 1 int (from &x) to 1 int rank RCVR, at address 0 in window "win"*/
MPI_Put(&x, 1, MPI_INT, RCVR, 0, 1, MPI_INT, win);
/* Unlock */
MPI_Win_unlock(0, win);
printf("%d: My job here is done.\n", rank);
}
if (rank == RCVR) {
for (;;) {
MPI_Win_lock(MPI_LOCK_EXCLUSIVE, RCVR, 0, win);
MPI_Get(&geta, 1, MPI_INT, RCVR, 0, 1, MPI_INT, win);
MPI_Win_unlock(0, win);
if (geta == 0) {
printf("%d: a still zero; sleeping.\n",rank);
sleep(2);
} else
break;
}
printf("%d: a now %d!\n",rank,geta);
printf("a = %d\n", *a);
}
MPI_Win_free(&win);
if (rank == RCVR) MPI_Free_mem(a);
MPI_Finalize();
return 0;
}
