OpenMP - parallelize a loop

I have been trying to parallelize this loop with OpenMP:
#define AX(i,j,k) (Ax[((k)*n+(j))*n+(i)])
for (int k = k1; k < k2; ++k) {
for (int j = j1; j < j2; ++j) {
for (int i = i1; i < i2; ++i) {
double xx = AX(i,j,k);
double xn = (i > 0) ? AX(i-1,j,k) : 0;
double xe = (j > 0) ? AX(i,j-1,k) : 0;
double xu = (k > 0) ? AX(i,j,k-1) : 0;
AX(i,j,k) = (xx+xn+xe+xu)/6*w;
}
}
}
#undef AX
I put this directly above the outer loop:
#pragma omp parallel for private (k,j,i) shared(Ax)
However, the #pragma is not working correctly: the function runs faster, but its results are now inconsistent (probably because of data dependencies).
I probably have to add another clause or change something in the code, but I have no idea what.
EDIT :
Okay, thank you. I understand why it is not working, but I tried what you said and unfortunately it is still not working. I know what the problem is, but I don't know how to solve it.
void ssor_forward_sweep(int n, int i1, int i2, int j1, int j2, int k1, int k2,
double* restrict Ax, double w)
{
int k,j,i;
double* AxL=malloc(n*sizeof(double));
for (int a=0; a < n;a++){
AxL[a]=Ax[a];
}
#define AX(i,j,k) (Ax[((k)*n+(j))*n+(i)])
#define AXL(i,j,k) (AxL[((k)*n+(j))*n+(i)])
#pragma omp parallel for private (k,j,i) shared(Ax)
for (k = k1; k < k2; ++k) {
for (j = j1; j < j2; ++j) {
for (i = i1; i < i2; ++i) {
double xx = AXL(i,j,k);
double xn = (i > 0) ? AXL(i-1,j,k) : 0;
double xe = (j > 0) ? AXL(i,j-1,k) : 0;
double xu = (k > 0) ? AXL(i,j,k-1) : 0;
AX(i,j,k) = (xx+xn+xe+xu)/6*w;
//AXL(i,j,k) = (xx+xn+xe+xu)/6*w;
}
}
}
#undef AX
#undef AXL
I know that there is still a problem with data dependencies, but I don't know how to solve it: the updated values are not taken into account when computing the new ones. There may also be a problem with how I copy the data.
When I say it is not working, I mean that I get no output at all (no error message either); the program just crashes immediately.
Hope someone can help me!
Thank you so much for the help!
Best regards,
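Two observations on the edited code, followed by a hedged sketch. First, AxL is allocated with n doubles, but the AXL macro indexes it as an n*n*n array, so the reads run past the end of the buffer; that alone could explain the crash. Second, reading only from a copy changes the algorithm, because each cell is meant to see the already updated values of its three neighbours. One common way to keep that behaviour is a wavefront ordering: every cell with the same i+j+k depends only on cells from the previous diagonal plane, so the cells of one plane can be updated in parallel. The function name ssor_forward_sweep_wavefront is an assumption made for this illustration, and the sketch has not been tuned for speed.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical wavefront variant of the sweep from the question.
 * Planes d = i + j + k are processed one after another; inside a plane
 * no cell reads a value that another cell of the same plane writes. */
void ssor_forward_sweep_wavefront(int n, int i1, int i2, int j1, int j2,
                                  int k1, int k2, double *Ax, double w)
{
    assert(Ax != NULL && n > 0);
    assert(0 <= i1 && i2 <= n && 0 <= j1 && j2 <= n && 0 <= k1 && k2 <= n);
    #define AX(i,j,k) (Ax[((size_t)(k)*n+(j))*n+(i)])
    for (int d = i1 + j1 + k1; d <= (i2 - 1) + (j2 - 1) + (k2 - 1); ++d) {
        /* The pragma is ignored when compiling without OpenMP,
         * which leaves a correct sequential sweep. */
        #pragma omp parallel for
        for (int k = k1; k < k2; ++k) {
            for (int j = j1; j < j2; ++j) {
                int i = d - k - j;              /* the only i on plane d */
                if (i < i1 || i >= i2) continue;
                double xx = AX(i,j,k);
                double xn = (i > 0) ? AX(i-1,j,k) : 0;  /* plane d-1 */
                double xe = (j > 0) ? AX(i,j-1,k) : 0;  /* plane d-1 */
                double xu = (k > 0) ? AX(i,j,k-1) : 0;  /* plane d-1 */
                AX(i,j,k) = (xx+xn+xe+xu)/6*w;
            }
        }
    }
    #undef AX
}
```

The planes near the corners contain few cells and the memory accesses are strided, so the speedup may be modest on small grids; whether it pays off has to be measured.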

Related

OpenMP if statement in for loop

I tried to use OpenMP to parallelize my code. However, the code didn't speed up; it was actually 10 times slower.
code:
N=10000;
int i, count=0,d;
double x, y;
#pragma omp parallel for shared(N) private(i,x,y) reduction(+:count)
for( i = 0; i < N; i++ ){
x = rand()/((double)RAND_MAX+1);
y = rand()/((double)RAND_MAX+1);
if(x*x + y*y < 1){
++count;
}
}
double pi= 4.0 * count / N;
I think it was because of the if statement?
thanks for any help!!
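A likely cause other than the if statement is rand(): it keeps hidden state shared by all threads, it is not required to be thread-safe, and some C libraries serialize calls to it with a lock. A minimal sketch, assuming that is the bottleneck, gives every thread its own generator state. The helpers xorshift64, next_unit and estimate_pi are names invented for this illustration, and the small xorshift generator stands in for rand() only to keep the example self-contained.

```c
#include <assert.h>
#include <stdint.h>
#ifdef _OPENMP
#include <omp.h>
#endif

/* Hypothetical helper: one xorshift64 step; all state lives in *s. */
static uint64_t xorshift64(uint64_t *s)
{
    uint64_t x = *s;
    x ^= x << 13;
    x ^= x >> 7;
    x ^= x << 17;
    return *s = x;
}

/* Hypothetical helper: uniform double in [0, 1) from the top 53 bits. */
static double next_unit(uint64_t *s)
{
    return (double)(xorshift64(s) >> 11) / 9007199254740992.0; /* 2^53 */
}

double estimate_pi(long n)
{
    assert(n > 0);
    long count = 0;
    #pragma omp parallel reduction(+:count)
    {
        int tid = 0;
#ifdef _OPENMP
        tid = omp_get_thread_num();
#endif
        /* Private, nonzero seed per thread, so no state is shared. */
        uint64_t state = 0x9E3779B97F4A7C15ULL * (uint64_t)(tid + 1);
        #pragma omp for
        for (long i = 0; i < n; ++i) {
            double x = next_unit(&state);
            double y = next_unit(&state);
            if (x * x + y * y < 1.0)
                ++count;
        }
    }
    return 4.0 * (double)count / (double)n;
}
```

With N=10000 as in the question, the cost of starting the threads can still outweigh the work, so a much larger N is needed before any speedup shows.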

Coin making problem in DP - getting wrong answer using 2dimensional memo table

When I pass this input, I get the wrong answer:
coin[] = {5,6}
Amount(W) = 10
my answer = 1
Correct Answer should be 2 i.e {5,5}
void coin_make(int W, vector<int> coin){
int n = coin.size();
int dp[n+1][W+1];
for(int i = 0; i <=W; i++){
dp[0][i] = INT_MAX;
}
for(int i = 1; i <= n; i++){
for(int j = 1; j <= W; j++){
if(coin[i-1] == j){
dp[i][j] = 1;
}
else if(coin[i-1] > j){
dp[i][j] = dp[i-1][j];
}
else {
dp[i][j] = min(dp[i-1][j],
1 + dp[i][j-coin[i-1]]);
}
}
}
cout<<dp[n][W];}
You're overflowing on dp[1][6], since you try to calculate 1 + INT_MAX there. The error then propagates through the table, so the final answer is wrong. When I ran it on my machine, I got -2147483648. You should use some other constant as "infinity" to prevent the overflow, e.g. 2e9, or -1 (which would require some additional if statements). Then the code works fine on the test case you provided.
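The "other constant as infinity" idea can be sketched as follows. This is an illustration rather than a patch of the original function: it is written in C, it swaps the two-dimensional memo table for a one-dimensional one, and it uses W + 1 as infinity, because no valid answer can need more than W coins. The name min_coins and the -1 return value for "no solution" are assumptions made for this example.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: fewest coins that sum to W, or -1 if impossible. */
int min_coins(int W, const int *coin, int n)
{
    assert(W >= 0 && coin != NULL && n > 0);
    const int INF = W + 1;      /* larger than any real answer, far from INT_MAX */
    int *dp = malloc(sizeof(int) * ((size_t)W + 1));
    if (dp == NULL)
        return -1;
    dp[0] = 0;                  /* zero coins make amount 0 */
    for (int j = 1; j <= W; j++) {
        dp[j] = INF;
        for (int i = 0; i < n; i++) {
            /* dp[j - coin[i]] + 1 is at most INF + 1, so it cannot overflow */
            if (coin[i] > 0 && coin[i] <= j && dp[j - coin[i]] + 1 < dp[j])
                dp[j] = dp[j - coin[i]] + 1;
        }
    }
    int answer = (dp[W] >= INF) ? -1 : dp[W];
    free(dp);
    return answer;
}
```

For the input from the question, coins {5,6} and W = 10, this returns 2.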

what's wrong in this heapifying algorithm

I'm not sure what is wrong with my code; it is not converting the array into a heap. Please help!
a is the pointer to the array passed to the function (you must have figured that out by now) and z is the length of the array.
Please explain why I'm wrong.
I'm a noob at coding (you must have figured that out from my code as well).
Thank you for your precious time.
int heapy(int *a,int z)
{
for(i = 0; i<z ;i++)
{ c[i] = a[i];
for(j = i; j >= 0; --j)
{ y = (j-1)/2;
if(c[j] > c[y])
{ temp = c[y];
c[y] = c[j];
c[j] = temp;
j = y;}
else
break;
}
}
}
First point: you don't need the for loop over j, and that is where your problem is. It is true that you should assign y to j, but right after that the loop decrements j, so you end up with y - 1.
You should either change the line j = y; to j = y + 1;, or change the loop to
y = (j - 1) / 2;
while (c[j] > c[y]){
temp = c[y];
c[y] = c[j];
c[j] = temp;
j = y;
y = (j - 1) / 2;
}
Second point: please do not compress your code like this. A new line after each brace is much more readable.
EDIT:
Full implementation in C++ looks like this:
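// Builds a max-heap in c from the z elements of a.
// Declared void, since nothing is returned.
void heapy(int *a, int *c, int z)
{
    for (int i = 0; i < z; i++){
        c[i] = a[i];           // append the next element at the end of the heap
        int j = i;
        int y = (j - 1) / 2;   // parent of j; for j == 0 this is 0, which ends the loop
        // sift up: swap with the parent while the parent is smaller
        while(c[j] > c[y]){
            int temp = c[y];
            c[y] = c[j];
            c[j] = temp;
            j = y;
            y = (j - 1) / 2;
        }
    }
}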
If an array of i elements is already a heap, you add the new element at its end and swap it with its parent for as long as the parent is smaller than it.
In short: your program is too long by three characters. Just remove --j from it.

openmp, for loop parallelization and critical zone error

I am new to OpenMP and I am using it to implement the Sieve of Eratosthenes. My code is:
int check_eratothenes(int *p, int pn, int n)
{
int count = 0;
bool* out = new bool[int(pow(pn, 2))];
memset(out, 0, pow(pn, 2));
#pragma omp parallel
for (int i = 0; i < n; i ++)
{
int j = floor((pn + 1) / p[i]) * p[i];
#pragma omp critical
while (j <= pow(pn, 2))
{
out[j] = 1;
j += p[i];
}
}
#pragma omp parallel
for (int i = pn+1; i < pow(pn, 2); i ++)
{
#pragma omp critical
if (out[i] == 0)
{
//cout << i << " ";
count ++;
}
}
return count;
}
But the OpenMP pragmas above are wrong. The code compiles, but when it runs it takes so long to produce a result that I press Ctrl+C to stop it. I am at a loss as to how to fix it, since there are several loops and if statements.
Thanks in advance.
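A few things stand out in the code above. #pragma omp parallel without for makes every thread execute the whole loop instead of sharing its iterations. The critical sections then force the threads to run the loop bodies one at a time. Finally, out has pn*pn elements, but the while loop also writes index pn*pn. A hedged sketch of one way to restructure it, written in C: the loops use parallel for, the counter uses a reduction, and marking uses an atomic write because several primes may mark the same index. The name count_primes_above is an assumption, and the sketch assumes every p[i] is a prime no larger than pn.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical rewrite: counts the primes strictly between pn and pn*pn,
 * given the n primes p[0..n-1] up to pn. Returns -1 if allocation fails. */
int count_primes_above(const int *p, int n, int pn)
{
    assert(p != NULL && n > 0 && pn > 1);
    long limit = (long)pn * pn;
    /* limit + 1 entries, so index limit itself is valid */
    unsigned char *out = calloc((size_t)limit + 1, 1);
    if (out == NULL)
        return -1;

    /* Each iteration handles one prime; iterations are shared among threads. */
    #pragma omp parallel for schedule(dynamic)
    for (int i = 0; i < n; ++i) {
        /* first multiple of p[i] that is >= pn + 1 */
        long j = ((long)pn + p[i]) / p[i] * p[i];
        for (; j <= limit; j += p[i]) {
            #pragma omp atomic write
            out[j] = 1;
        }
    }

    int count = 0;
    /* Each thread counts privately; the partial sums are combined at the end. */
    #pragma omp parallel for reduction(+:count)
    for (long i = (long)pn + 1; i < limit; ++i) {
        if (!out[i])
            ++count;
    }
    free(out);
    return count;
}
```

For example, with p = {2, 3, 5, 7} and pn = 10 this counts the 21 primes between 11 and 99.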

Broken Merge Sort

Good morning, Stack Overflow. You guys helped me out on an earlier assignment, and I'm hoping to get a little help on this one.
It's a programming assignment relating to sorts, one part of which is to write a working implementation of merge sort.
I adapted my solution from the pseudocode the professor used in class, but I'm getting an annoying segfault at the indicated location.
This method is sorting an array of structs, with data_t defined as struct pointers.
The struct definition:
typedef struct {
int id;
int salary;
} employee_t;
typedef employee_t* data_t;
They're being sorted by salary, which is a randomly generated number from 40,000 to 90,000.
Here's the actual method
void merge_sort(data_t items[], size_t n)
{
if (n < 2)
return;
size_t mid = (n / 2);
data_t *left = malloc(sizeof(data_t) * mid);
data_t *right = malloc(sizeof(data_t) * (n - mid));
for (int y = 0; y < mid; y++)
{
left[y] = items[y];
}
for (int z = mid; z < n; z++)
{
right[z] = items[z];
}
merge_sort(left, mid);
merge_sort(right, (n - mid));
size_t l, r, i;
l = 0;
r = 0;
for (i = 0; i < (n - 1); i++)
{
if ((l < mid) && ((r >= (n - mid)) || ((left[l]->salary) <= (right[r]->salary))))
{
items[i] = left[l++];
}
else
{
items[i] = right[r++];
}
}
free(left);
free(right);
}
Note that I haven't made it as far as the end, so the array frees might be incorrectly located.
The segfault always occurs when I try to access right[r]->salary, so I'm assuming this is related to a null pointer, or similar. However, I'm extremely new to sorting, and I don't know exactly where to properly implement a check.
Any advice is appreciated greatly.
At first glance there's this fix:
for (int z = mid; z < n; z++)
{
right[z-mid] = items[z];
}
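Putting that fix into the whole function gives the sketch below. It makes one further change beyond the answer: the merge loop runs to i < n instead of i < (n - 1), because the original bound never writes the last slot of items. The assert on the malloc results is a simplification for this illustration, not a recommendation for production code.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct {
    int id;
    int salary;
} employee_t;
typedef employee_t* data_t;

void merge_sort(data_t items[], size_t n)
{
    if (n < 2)
        return;
    size_t mid = n / 2;
    data_t *left = malloc(sizeof(data_t) * mid);
    data_t *right = malloc(sizeof(data_t) * (n - mid));
    assert(left != NULL && right != NULL);   /* simplification for the sketch */

    for (size_t y = 0; y < mid; y++)
        left[y] = items[y];
    for (size_t z = mid; z < n; z++)
        right[z - mid] = items[z];           /* right is indexed from 0 */

    merge_sort(left, mid);
    merge_sort(right, n - mid);

    size_t l = 0, r = 0;
    for (size_t i = 0; i < n; i++) {         /* fill all n slots */
        /* take from left if right is exhausted or left's salary is not larger */
        if (l < mid && (r >= n - mid || left[l]->salary <= right[r]->salary))
            items[i] = left[l++];
        else
            items[i] = right[r++];
    }
    free(left);
    free(right);
}
```

Because ties are taken from the left half first, employees with equal salaries keep their original relative order.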
