MPI prime numbers [closed] - parallel-processing

I wanted to find a parallel algorithm that finds prime numbers using the MPI library. I found this one, but when I run it in Code::Blocks I always get
Sorry - this exercise requires an even number of tasks.
evenly divisible into 2500000. Try 4 or 8.
What does it mean? How can I set the number of tasks?
https://computing.llnl.gov/tutorials/mpi/samples/C/mpi_prime.c

What does it mean?
It means that you probably have to take a look at the source code and try to understand how it works. High Performance Mark has already pointed to the right MPI call, and if you look at the beginning of the main function, you'll see these lines:
MPI_Comm_size(MPI_COMM_WORLD, &ntasks);
if (((ntasks % 2) != 0) || ((LIMIT % ntasks) != 0)) {
   printf("Sorry - this exercise requires an even number of tasks.\n");
   printf("evenly divisible into %d. Try 4 or 8.\n", LIMIT);
   MPI_Finalize();
   exit(0);
}
Obviously it requires an even number of MPI processes (otherwise ntasks % 2 != 0), and this number should also divide LIMIT (which equals 2500000 in this case). MPI programs should be executed through the MPI launcher, which in most cases is called mpiexec or mpirun, and which takes the number of processes as a parameter. If you do not run the code through mpiexec, most MPI implementations behave as if the program was started using
mpiexec -np 1 ./program
1 is not even, hence the first part of the if condition evaluates to true and the abort code gets executed.
What you should do is run the program from a terminal using mpiexec -np <# of procs> executable, where <# of procs> is the desired number of MPI processes and executable is the name of the executable file. <# of procs> should be even and should divide 2500000. I would suggest going with 2, 4 or 8; 10 would also do. You won't see any improvement in speed unless your development system has a multicore CPU and/or several CPUs.
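For example, assuming the compiled executable is named mpi_prime (the name here is just an assumption), a valid invocation would be
mpiexec -np 4 ./mpi_prime
since 4 is even and divides 2500000, so the check at the top of main passes.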
You mention Code::Blocks. See here for some ideas on how to make it run MPI programs through mpiexec.

The usual way to get the number of processes during the execution of an MPI program is to call the MPI_COMM_SIZE subroutine, like this
call MPI_COMM_SIZE(MPI_COMM_WORLD, num_procs, ierr)
where num_procs is an integer which will equal the number of processes after the call has completed. I expect that what you call a task is the same as what I call a process.
Note that I've written the call in Fortran; C and C++ bindings are also available, though the latter seem to be going out of favour.
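For reference, the C binding (which the linked mpi_prime.c sample itself uses) looks like this; the surrounding boilerplate is a minimal sketch:
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int num_procs;
    MPI_Init(&argc, &argv);
    MPI_Comm_size(MPI_COMM_WORLD, &num_procs);  /* fills num_procs with the process count */
    printf("running with %d processes\n", num_procs);
    MPI_Finalize();
    return 0;
}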

Related

Prove that we can decide whether a Turing machine takes at least 100 steps on some input

We know that the problem "Does this Turing machine take at least this finite number of steps on that input?" is decidable, because a decider will always answer yes or no: yes if the machine reaches the given number of steps, and no if it halts before that.
Now here is my doubt. Suppose it halts before reaching that many steps, i.e. the input either (1) got accepted or (2) got rejected, or maybe (3) it doesn't halt but instead goes into an infinite loop. When we are in case (3), how can we be sure that it will always stay in that loop?
What I mean is that if it doesn't loop forever but comes out of the loop at some point, it might cross the asked number of steps, and the decision could be made then, which was not possible earlier. If so, how can we conclude that the problem is decidable, when, stuck in a loop, we can't say anything about the outcome?
(I already more or less answered your question when I edited it.)
The thing is, the decision system (a Turing machine, an algorithm, or any other equivalent formalism) that takes as inputs a Turing machine M, a number N and a value X, and returns yes or no, has total control over how it executes M on X. It simulates it step by step. So it can run one step of M(X), increment an instruction counter, compare the counter to N, and, as soon as the given number of steps is reached, stop and return yes. At that point the simulated machine M need not be in a final state, and the full computation M(X) could very well diverge. We don't care, because we only run the first N steps.
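Here is a minimal sketch of that decider in C. The machine representation is hypothetical (struct machine and machine_step are stand-ins, not a real API); the point is only that the loop is bounded by N and therefore always terminates, whether or not M(X) does:
#include <stdbool.h>

struct machine;                          /* opaque TM + tape + input state (assumed) */
bool machine_step(struct machine *m);    /* advance one step; false once halted (assumed) */

/* Decides "does M take at least n steps on its loaded input?"
   Always terminates: the loop runs at most n times, so a
   non-halting M is never simulated beyond n steps. */
bool takes_at_least(struct machine *m, unsigned long n)
{
    for (unsigned long i = 0; i < n; i++) {
        if (!machine_step(m))            /* M halted before n steps: answer no */
            return false;
    }
    return true;                         /* n steps reached: answer yes, stop simulating */
}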
Most likely the conditional structures were not debugged/developed enough, so multiple conditions often conflicted with each other, and the error reporting was not definitive, so semi-abstract notions such as "decidable" and "undecidable" were used.
As a rough example: years ago I wrote, in VBScript, a "64-bit ROM memory" simulator. As I tried to manage the memory cells, where I/O read/write locations were assigned, I used many formulas and conditions to handle the conversions from decimal to binary and all the operations, indexing, etc.
I also ran into bugs because the conditions were not perfect. Why? Because the program had some unresolved, somewhat arbitrary results that could have ended up in:
print.debug "decidable"
On Error Resume h
h:
print.debug "undecidable"
This was an example with a clear scope and a debatable result.
To return to your question: "so how do we conclude that it's decidable?"
Wikipedia:
The Turing machine was invented in 1936 by Alan Turing, who called it an "a-machine" (automatic machine). With this model, Turing was able to answer two questions in the negative:
Does a machine exist that can determine whether any arbitrary machine on its tape is "circular" (e.g., freezes, or fails to continue its computational task)?
Does a machine exist that can determine whether any arbitrary machine on its tape ever prints a given symbol?
Thus by providing a mathematical description of a very simple device capable of arbitrary computations, he was able to prove properties of computation in general—and in particular, the uncomputability of the Entscheidungsproblem ('decision problem').

Increasing time slices for a particular process via implementing a system call in xv6 [duplicate]

This question already has answers here:
How to modify process preemption policies (like RR time-slices) in XV6?
I am trying to implement a system call in xv6, Increase_time(int n), which when executed will increase the time slice of the calling process by a factor of n. The default xv6 scheduler uses simple FCFS and RR policies, with every process getting the same time slice. My implementation of Increase_time() will allow different processes to have different numbers of time slices.
Can you please suggest a way to approach this?
I know how to add a system call in xv6. I just need an idea of how to code the system call itself and which files to change in xv6.
It seems to me that you are asking this question.
In short: don't change the amount of time in a slice according to the process; rather, change the number of time slices it receives (refer to the linked post). A sketch of that idea follows.
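Here is a hedged sketch of that approach against stock x86 xv6. The field names timeslices and ticks_run are hypothetical additions to struct proc, not a tested patch:
// proc.h: add per-process slice bookkeeping to struct proc
// (initialize to 1 and 0 respectively in allocproc() in proc.c)
int timeslices;   // ticks this process may run before yielding
int ticks_run;    // ticks consumed in the current slice

// trap.c: instead of yielding on every timer tick, yield only
// once the process has used up its whole slice
if(myproc() && myproc()->state == RUNNING &&
   tf->trapno == T_IRQ0+IRQ_TIMER){
  if(++myproc()->ticks_run >= myproc()->timeslices){
    myproc()->ticks_run = 0;
    yield();
  }
}

// sysproc.c: the system call multiplies the caller's slice count by n
// (registration in syscall.h/syscall.c/usys.S/user.h is omitted,
// since you already know how to add a system call)
int
sys_increase_time(void)
{
  int n;
  if(argint(0, &n) < 0 || n < 1)
    return -1;
  myproc()->timeslices *= n;
  return 0;
}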

Running N Iterations of a Single-Processor Job in Parallel

There should be a simple solution, but I am too much of a novice with parallel processing.
I want to run N instances of a command f in different directories. There are no parameters or arguments for f; it just runs based on an input file in the directory where it is started. I would like to run one instance of the command in each of the N directories.
I have access to 7 nodes with a total of ~280 processors between them, but I'm not familiar enough with MPI to know how to code for the above.
I do know that I can use mpirun and mpiexec, if that helps at all...
Help?
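One common pattern, sketched here under the assumption that the directories are named dir0 through dir(N-1) and that f is an ordinary serial executable on the PATH, is a tiny MPI wrapper in which each rank changes into its own directory and runs f:
/* wrapper.c - a minimal sketch, not a tested solution.
   Build with mpicc, then launch with: mpiexec -np N ./wrapper */
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main(int argc, char **argv)
{
    int rank;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    char dir[64];
    snprintf(dir, sizeof dir, "dir%d", rank);  /* rank i works in dir<i> */
    if (chdir(dir) != 0)
        perror(dir);
    else
        system("f");                           /* run one serial instance of f */

    MPI_Finalize();
    return 0;
}
With 7 nodes and ~280 cores, mpiexec's hostfile/machinefile options would spread the N ranks across the nodes; GNU parallel or a batch scheduler's job arrays are non-MPI alternatives that accomplish the same thing.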

Can Ruby threads not collide on a write?

From past work in C# and Java, I am accustomed to a statement such as this not being thread-safe:
x += y;
However, I have not been able to observe any collision among threads when running the above code in parallel with Ruby.
I have read that Ruby automatically prevents multiple threads from writing to the same data concurrently. Is this true? Is the += operator therefore thread-safe in Ruby?
Well, it depends on your implementation and a lot of things. In MRI there is such a thing as the GVL (Giant VM Lock), which controls which thread is actually executing code at any given time. In MRI only one thread can execute Ruby code at a time. So while the C libraries underneath can let another thread run while they use the CPU in C code to, say, multiply giant numbers, the Ruby code itself can't execute at the same time. That means a statement such as the assignment might not run at the same time as another one of the assignments (though the additions may run in parallel). The other thing that could be happening is this: I think I heard that assignments to ints are atomic on Linux, so if you're on Linux, that might be a factor too.
x += 1
is equivalent in every way to
x = x + 1
(if you re-define +, you also automatically redefine the result of +=)
In this notation it's clearer that this is not an atomic operation (it reads x, adds, then writes x back), and it is therefore not guaranteed to be thread-safe.
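To see the hazard concretely, here is the same read-modify-write race written in C with pthreads (C rather than Ruby, because MRI's GVL makes the collision hard to observe in Ruby itself, which is likely why the original experiment showed none). Without a lock, the final value usually comes out below 2000000:
#include <pthread.h>
#include <stdio.h>

static long x = 0;

static void *worker(void *arg)
{
    (void)arg;
    for (int i = 0; i < 1000000; i++)
        x += 1;             /* three steps: load x, add 1, store x */
    return NULL;
}

int main(void)
{
    pthread_t t1, t2;
    pthread_create(&t1, NULL, worker, NULL);
    pthread_create(&t2, NULL, worker, NULL);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    /* lost updates: both threads read the same old x, then both store */
    printf("x = %ld (expected 2000000)\n", x);
    return 0;
}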

cost of == operator vs < or > operators [closed]

This is really just an academic question; I'm just curious to know which one is faster. I'm guessing the difference is negligible, but still, I'd like to know.
if( (x == 1) || (x == 2) )
vs
if (x < 3)
thanks!
In the form you provided there is an evident difference in complexity: the first snippet uses three operators, the second just one. But OK, let's put that aside and assume you want to compare > (or <) with == (or !=). If you have ever looked at the assembler generated for your programs (I bet you haven't), you would notice code like
if (x < 3) /* action */;
being translated to something like this (for an x86 CPU, Intel syntax):
cmp eax, 3 // <-- compare EAX and 3. This modifies the flags register(*1) (namely ZF, CF, SF and so on)
jge skip   // <-- this instruction tells the CPU to jump when certain flag conditions are met(*2).
           //     So the code below is executed when the jump is *not* taken (when x is *not*
           //     greater than or equal to 3).
/* here is the action body */
skip:
// ...
Now consider this code:
if (x == 3) /* action */;
It will give almost the same assembly (it may differ from mine, but not semantically):
cmp eax, 3
jne skip // <-- here is the only difference
/* the action is run when x is equal to 3 */
skip:
...
Both of these instructions (jge, jne and the other jumps) do their job at the same speed (because CPUs are built that way, though it obviously depends on the architecture). A bigger impact on performance comes from the distance of the jump (the difference between code positions), branch mispredictions (when the processor wrongly predicts whether the jump is taken), and so on. Some instructions are even more efficient than others (they use fewer bytes, for example), but remember one thing: do not pay this too much attention. It is always better to make algorithmic optimizations than to scrimp on matchsticks. Let the compiler handle this level -- it is really more competent in such questions. Focus on your algorithms, code readability, program architecture, fault tolerance... Make speed the last factor.
(*1): http://en.wikipedia.org/wiki/FLAGS_register_%28computing%29
(*2): http://www.unixwiz.net/techtips/x86-jumps.html
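As a small illustration of "let the compiler do it": modern compilers commonly fold the question's two equality tests into a single unsigned range check, so the hand-written choice rarely survives optimization anyway. A sketch of the transform (the second function shows what optimizers typically emit, not a guarantee for every compiler and flag set):
#include <stdbool.h>

/* the question's original condition: two compares and an OR */
bool two_compares(int x) { return x == 1 || x == 2; }

/* what an optimizer typically turns it into: one subtraction plus one
   unsigned compare -- the same cost class as a single x < 3 test.
   (unsigned)x - 1u maps 1..2 onto 0..1; everything else wraps to a
   large value and fails the <= test. */
bool range_check(int x) { return (unsigned)x - 1u <= 1u; }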
