Spin off one statement in a loop in parallel with OpenMP (Visual Studio)

I have a large C++ project (Windows, VC++ 13) in which I have a costly loop. The loop is costly because one statement (a call to another function) is very expensive. I want to make that one line parallel: I don't want each iteration to wait for the call to complete. On reaching that line, I want to spin the call off as a separate thread, and at the end of the iteration I want all the threads to join before my execution continues. Is that possible? A simplified version of the problem I am solving is:
for (int i = 0; i < limit; i++)
{
    // some processing, getting some value into a variable, say x
    <value to use> += <costly function that takes x as a parameter>;
    // some more processing
}
What I would like is for the loop to spin off a thread when it hits the line that calls the costly function, and at the end of the loop to join all the threads and continue execution. I obviously must not use an OpenMP parallel for on the loop, since that breaks the calculation of x and sends the same x to multiple calls of the function. I would also prefer to use OpenMP. Can somebody help with this, please?

Related

Caller/Backtrace beyond a thread

As far as I know, it is possible to get only the portion of the caller/backtrace information that is within the current thread; anything prior to that (in the thread that created the current thread) is cut off. The following exemplifies this; the fact that a called b, which called c, which created the thread that called d, is cut off:
def a; b end
def b; c end
def c; Thread.new{d}.join end
def d; e end
def e; puts caller end
a
# => this_file:4:in `d'
# this_file:3:in `block in c'
What is the reason for this feature?
Is there a way to get the caller/backtrace information beyond the current thread?
I think I came up with my answer.
Creating a thread is not the only thing that can be done to it from outside; among other things, you can also wake it up. So it is not clear which operation should be counted as part of the caller. For example, suppose there is a thread:
1: t = Thread.new{
2:   Thread.stop
3:   puts caller
4: }
5: t.wakeup
The thread t is created at line 1, but it puts itself to sleep at line 2 and is then woken up by line 5. So when we stand at the caller on line 3 and consider the caller part outside of the thread, it is not clear whether Thread.new on line 1 or t.wakeup on line 5 should be part of it. Therefore, there is no clear notion of callers beyond the current thread.
However, if we define a clear notion, then a caller beyond a thread can make sense. For example, always including the callers up to the creation of the thread may make sense; alternatively, including the callers leading to the most recent wakeup or creation may make sense. It is up to the definition.
The answer to both your questions is really the same. Consider a slightly more involved main thread. Instead of simply waiting for the spawned thread to end in c, the main thread goes on calling other functions, perhaps even returning from c and going about its business while the spawned thread goes about its own.
This means that the stack in the main thread has changed since the thread starting in d was spawned. In other words, by the time you call puts caller, the stack in the main thread is no longer in the state it was in when the secondary thread was created. There is no way to safely walk back up the stack beyond this point.
So in short:
The stack of the spawning thread will not remain in the state it was in when the thread was spawned, so walking back beyond the start of a thread's own stack is not safe.
No, since the entire idea behind threads is that they are (pseudo) parallel, their stacks are completely unrelated.
Update:
As suggested in the comments, the stack of the current thread could be copied to the new thread at creation time. This would preserve the information that led up to the thread being created, but the solution is not without its own set of problems.
Thread creation will be slower. That could be acceptable if there were anything to gain from it, but in this case, is there?
What would it mean to return from the thread entry function?
It could return to the function that created the thread and keep running as if it was just a function call - only that it now runs in the second thread, not the original one. Would we want that?
There could be some magic that ensures the thread terminates even if it's not at the top of the call stack. That would make the information in the call stack above the thread entry function incorrect anyway.
On systems that limit the stack size of each thread, you could run into problems where the thread runs out of stack even though it is not using very much on its own.
There are probably other scenarios and peculiarities that could be thought up too, but the way threads are created, each with its own empty stack to start with, makes the model simple and predictable without leaving any useful information out of the call stack.
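The copying described above can be approximated by hand in today's Ruby: capture `caller` in the spawning thread at creation time and pass it into the new thread. A sketch, where the helper name is made up:

```ruby
# Hypothetical helper: record the spawner's backtrace when the thread is created.
def spawn_with_parent_trace(&block)
  parent_trace = caller            # stack of the creating thread, captured now
  Thread.new { block.call(parent_trace) }
end

def outer
  spawn_with_parent_trace do |parent|
    caller + parent                # this thread's stack, then the spawner's
  end.value                        # join and return the combined trace
end

combined = outer
puts combined                      # includes frames from before Thread.new
```

This sidesteps the ambiguity discussed above by picking one definition (callers up to the creation of the thread) and paying the capture cost explicitly at each spawn.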

EM.next_tick with recursive usage

# Spawn workers to consume items from the iterator's enumerator
# based on the current concurrency level.
def spawn_workers
  EM.next_tick(start_worker = proc{
    if @workers < @concurrency and !@ended
      # p [:spawning_worker, :workers=, @workers, :concurrency=, @concurrency, :ended=, @ended]
      @workers += 1
      @process_next.call
      EM.next_tick(start_worker)
    end
  })
  nil
end
I read this piece of code in EM's iterator, which is used by em-synchrony's fibered iterator.
I have a basic idea of how EventMachine works, but I'm not clear about this kind of recursive use of next_tick; could anyone explain it to me?
In my opinion it acts like a loop, except that it is driven by EM rather than by "while" or "for". Am I right? And why do it this way?
It's not really a recursive call; think of it as scheduling a proc to run a moment later.
EventMachine is basically an endless loop that runs whatever has been scheduled for the next iteration of the loop (the next tick).
Think of the next_tick method as a command-queueing mechanism.
The spawn_workers method schedules the start_worker proc to run on the next iteration of the event loop.
In that next iteration start_worker is run, and @process_next.call (which I assume spawns the worker) instantiates the first worker; then the line
EM.next_tick(start_worker)
schedules the same proc to run on the following iteration of the EM loop, and so on until all workers are spawned.
This means that if, for example, 8 workers need to be instantiated, one worker is spawned on each of the next 8 ticks of the event loop.
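The pattern is easier to see with a toy stand-in for the reactor. This is not EventMachine, just a queue that drains scheduled procs, with one dequeue standing in for one tick:

```ruby
# A toy "reactor": next_tick queues a callable, run drains the queue.
class ToyLoop
  def initialize
    @queue = []
  end

  def next_tick(callable)
    @queue << callable             # schedule for a later tick
  end

  def run
    @queue.shift.call until @queue.empty?
  end
end

reactor = ToyLoop.new
workers = 0
concurrency = 3

start_worker = proc do
  if workers < concurrency
    workers += 1                   # "spawn" one worker this tick
    reactor.next_tick(start_worker)  # reschedule the same proc for the next tick
  end
end

reactor.next_tick(start_worker)
reactor.run
puts workers                       # one worker per tick, until concurrency is reached
```

The proc rescheduling itself is what replaces a while/for loop: each pass through the loop body is a separate tick, so the reactor stays free to run other scheduled work in between.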

Efficient daemon in Vala

I'd like to make a daemon in Vala which only executes a task every X seconds.
I was wondering which would be the best way:
Thread.usleep() or Posix.sleep()
GLib.MainLoop + GLib.Timeout
other?
I don't want it to eat too many resources when it's doing nothing..
If you spend your time sleeping in a system call, there won't be any appreciable difference from a performance perspective. That said, it probably makes sense to use the MainLoop approach, for two reasons:
You're going to need to set up signal handlers so that your daemon can die instantly when it receives SIGTERM. If you quit your main loop by binding SIGTERM via Posix.signal, that will probably be more readable than checking whether the sleep was successful.
If you ever decide to add complexity, the MainLoop will make that more straightforward.
You can use GLib.Timeout.add_seconds the following way:
Timeout.add_seconds (5, () => {
    /* Do what you want here */

    // Return Source.CONTINUE to run again after the interval,
    // or Source.REMOVE to stop
    return Source.CONTINUE;
}, Priority.LOW);
Note: the interval is in seconds (here, every 5 seconds); use GLib.Timeout.add if you need millisecond granularity. The Timeout is set to Priority.LOW as it runs in the background and should give priority to other tasks.

Can I use MPI_Barrier() to synchronize data in-between iteration steps

Is it a good idea to use MPI_Barrier() to synchronize data between iteration steps? Please see the pseudocode below.
while (numberIterations < MaxIterations)
{
    MPI_Iprobe()                  -- check for incoming data
    while (flagprobe != 0)
    {
        MPI_Recv()                -- receive data
        MPI_Iprobe()              -- loop if more data
    }

    updateData()                  -- update myData

    for (i = 0; i < N; i++) MPI_Bsend_init(request[i])  -- set up requests
    for (i = 0; i < N; i++) MPI_Start(request[i])       -- send data to all other N processors

    if (numberIterations == MaxIterations / 2)
        MPI_Barrier()             -- wait for all processors -- CAN I DO THIS?

    numberIterations++
}
Barriers should only be used if the correctness of the program depends on it. From your pseudocode, I can't tell if that's the case, but one barrier halfway through a loop looks very suspect.
Your code will deadlock, with or without a barrier. You receive in every rank before sending any data, so none of the ranks will ever get to a send call. Most applications will have a call such as MPI_Allreduce instead of a barrier after each iteration so all ranks can decide whether an error level is small enough, a task queue is empty, etc. and thus decide whether to terminate.
In this article, http://static.msi.umn.edu/rreports/2008/87.pdf, it says that you have to call MPI_Request_free() before calling MPI_Bsend_init() again.

What is the best way to periodically export a counter from a loop in Ruby

I have created a daemon in Ruby which has a counter incrementing inside of a loop. The loop does its business, then sleeps for 1 second, then continues. Simplified, it's something like:
loop do
  response = send_command
  if response == 1
    counter += 1
  end
  sleep(1)
end
Every 5 minutes I would like to call a method to write the counter value to the database. I figure there are a few ways to do this. The initial way I considered was calling Time.now in the loop and checking for a 5-minute, 0-second mark, and if that matched, calling the SQL function. That seems terribly inefficient, however, and it could also miss a record if send_command took some time.
Another possibility may be to make available the counter variable, which could be called (and reset) via a socket. I briefly took a look at the Socket class, and that seems possible.
Is there an obvious/best way to do this that I'm missing?
If you just want to save every 5 minutes, you could just use a Thread. Something like:
Thread.new do
  loop do
    save_value_in_the_db(counter)
    sleep 5 * 60
  end
end
Note that the thread has access to counter if it is defined in the same scope as the loop. You could also use an object and have @counter declared inside it.
If you prefer remote access, you can do it with a socket, or use a DRb approach, which is probably easier. This DRb tutorial seems to fit your requirements: http://ruby.about.com/od/advancedruby/a/drb.htm
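For the DRb route, the standard library is enough. A sketch, where the class and method names are made up; the server side exposes the counter, and a client (normally another process) reads and resets it:

```ruby
require 'drb/drb'

# Hypothetical counter holder exposed over DRb.
class CounterServer
  def initialize
    @counter = 0
  end

  def increment
    @counter += 1
  end

  # Return the current count and reset it in one remote call,
  # so no increments are lost between reading and resetting.
  def harvest
    value = @counter
    @counter = 0
    value
  end
end

# Port 0 lets the OS pick a free port; DRb.uri reports the actual address.
DRb.start_service('druby://localhost:0', CounterServer.new)

# Client side (normally another process) connects with the service URI:
remote = DRbObject.new_with_uri(DRb.uri)
3.times { remote.increment }
harvested = remote.harvest
```

The single harvest call is the important design choice: if the client read the counter and reset it in two separate calls, increments landing in between would be silently dropped.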
I'd have the counter be updated every time through the loop, then periodically have something read that and update the database.
That makes for a simpler main loop, because it doesn't have to keep track of how long it has to wait before exporting the value.
And, it's very common and normal to have a periodic task that samples a value and does something with it.
Creating a simple socket would work well. Ruby's Socket code RDoc has some samples for echo servers that might get you started.
