std::promise/std::future vs std::condition_variable in C++ - c++11

Signaling between threads can be achieved with std::promise/std::future or with good old condition variables. Can someone provide examples/use cases where one would be a better choice than the other?
I know that CVs can be used to signal multiple times between threads. Can you give an example with std::future/promise that signals multiple times?
Also, is std::future::wait_for equivalent in performance to std::condition_variable::wait?
Let's say I need to wait on multiple futures in a queue as a consumer; does it make sense to go through each of them and check if they are ready, like below?
for (auto it = activeFutures.begin(); it != activeFutures.end();) {
    if (it->valid() && it->wait_for(std::chrono::milliseconds(1)) == std::future_status::ready) {
        Printer::print(std::string("+++ Value " + std::to_string(it->get()->getBalance())));
        it = activeFutures.erase(it); // erase invalidates it; use the returned iterator
    } else {
        ++it;
    }
}

Can someone provide examples/use cases where one would be a better choice than the other?
These are two different tools of the standard library.
In order to give an example where one would be better than the other, you'd have to come up with a scenario where both tools are a good fit.
However, they sit at different levels of abstraction, both in what they do and in what they are good for.
From cppreference (emphasis mine):
Condition variables
A condition variable is a synchronization primitive that allows
multiple threads to communicate with each other. It allows some number
of threads to wait (possibly with a timeout) for notification from
another thread that they may proceed. A condition variable is always
associated with a mutex.
Futures
The standard library provides facilities to obtain values that are
returned and to catch exceptions that are thrown by asynchronous tasks
(i.e. functions launched in separate threads). These values are
communicated in a shared state, in which the asynchronous task may
write its return value or store an exception, and which may be
examined, waited for, and otherwise manipulated by other threads that
hold instances of std::future or std::shared_future that reference that shared state.
As you can see, a condition variable is a synchronization primitive, whereas a future is a facility used to communicate the results of asynchronous tasks.
The condition variable can be used in a wide variety of scenarios where you need to synchronize multiple threads; however, you would typically use a std::future when you have tasks/jobs/work to do and you need them done without interrupting your main flow, aka asynchronously.
So, in my opinion, a good example of where you would use a future + promise is when you need to run a long calculation and get/wait for the result at a later point in time. With a condition variable alone, you would basically have had to implement std::future + std::promise on your own, possibly using std::condition_variable somewhere internally.
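To make that concrete, here is a minimal sketch of the future + promise use case (expensiveSum and its argument are invented for the example): the worker thread publishes the result of a long calculation through the promise, and the main thread blocks on the matching future only when it actually needs the value.

#include <future>
#include <iostream>
#include <thread>

// Hypothetical long-running calculation.
long long expensiveSum(long long n) {
    long long total = 0;
    for (long long i = 1; i <= n; ++i) total += i;
    return total;
}

void worker(std::promise<long long> p) {
    p.set_value(expensiveSum(1000000)); // publish the result exactly once
}

int main() {
    std::promise<long long> resultPromise;
    std::future<long long> resultFuture = resultPromise.get_future();
    std::thread t(worker, std::move(resultPromise));
    // The main flow continues uninterrupted here...
    std::cout << "sum = " << resultFuture.get() << std::endl; // blocks until set_value
    t.join();
}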
Can you give an example with std::future/promise to signal multiple times?
Have a look at the toy example on the cppreference page for std::shared_future.
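In the same spirit, here is a rough sketch of what shared_future buys you (the worker count and output are made up): one std::promise<void> releases several waiting threads at once through copies of a std::shared_future. Note that this is still a one-shot signal per promise, not a reusable signal like a condition variable.

#include <future>
#include <iostream>
#include <thread>
#include <vector>

int main() {
    std::promise<void> startPromise;
    std::shared_future<void> startSignal = startPromise.get_future().share();

    std::vector<std::thread> workers;
    for (int i = 0; i < 3; ++i) {
        workers.emplace_back([startSignal, i] {
            startSignal.wait(); // every copy observes the same single signal
            std::cout << "worker " << i << " released\n";
        });
    }

    startPromise.set_value(); // broadcast once to all waiting threads
    for (std::thread& t : workers) t.join();
}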
Also, is std::future::wait_for equivalent in performance to std::condition_variable::wait?
Well, GCC's implementation of std::future::wait_for uses std::condition_variable::wait_for internally, which fits my explanation of the difference between the two. So, as you can understand, std::future::wait_for adds only a very small overhead on top of std::condition_variable::wait_for.

Related

How to terminate long running function after a timeout

So I am attempting to shut down a long-running function if something takes too long. Maybe this is just treating the symptoms rather than the cause, but in any case, for my situation it didn't really work out.
I did it like this:
func foo(abort <-chan struct{}) {
    for {
        select {
        case <-abort:
            return
        default:
            // long running code
        }
    }
}
And in a separate function I have code which, after some time, closes the passed channel — which it does; if I strip out the body, the function returns. However, if there is some long-running code, it does not affect the outcome: the work simply continues as if nothing had happened.
I am pretty new to Go, and it feels like this should work, but it does not. Is there anything I am missing? After all, router frameworks have timeout functions after which whatever is running is terminated. So maybe this is just out of curiosity, but I would really like to know how to do it.
Your code only checks whether the channel was closed once per iteration, before executing the long-running code. There's no opportunity to check the abort channel after the long-running code starts, so it will run to completion.
You need to occasionally check whether to exit early in the body of the long-running code, and this is more idiomatically accomplished using context.Context and WithTimeout; for an example, see https://pkg.go.dev/context#example-WithTimeout
In your "long running code" you have to periodically check that abort channel.
The usual approach to implement that "periodically" is to split the code into chunks each of which completes in a reasonably short time frame (given that the system the process runs on is not overloaded).
After executing each such chunk, you check whether the termination condition holds, and terminate execution if it does.
The idiomatic approach to perform such a check is "select with default":
select {
case <-channel:
    // terminate processing
default:
}
Here, the default no-op branch is immediately taken if channel is not ready to be received from (or closed).
Some algorithms make such chunking easier because they employ a loop where each iteration takes roughly the same time to execute.
If your algorithm is not like this, you'd have to chunk it manually; in this case, it's best to create a separate function (or a method) for each chunk.
Further points.
Consider using contexts: they provide a useful framework to solve the style of problems like the one you're solving.
Better still, the fact that contexts can "inherit" from one another allows you to easily implement two neat things:
You can combine various ways to cancel contexts: say, it's possible to create a context which is cancelled either when some timeout passes or explicitly by some other code.
They make it possible to create "cancellation trees" — when cancelling the root context propagates this signal to all the inheriting contexts — making them cancel what other goroutines are doing.
Sometimes, when people say "long-running code", they do not mean code actually crunching numbers on a CPU all that time, but rather code which performs requests to slow entities such as databases or HTTP servers, in which case the code is not actually running but sleeping on I/O, waiting for data to be delivered.
If this is your case, note that all well-written Go packages (this of course includes all the packages of the Go standard library which deal with networked services) accept contexts in those API functions which actually make calls to such slow entities. This means that if you make your function accept a context, you can (and actually should) pass that context down the call stack where applicable, so that all the code you call can be cancelled in the same way as yours.
Further reading:
https://go.dev/blog/pipelines
https://blog.golang.org/advanced-go-concurrency-patterns

Run async function in specific thread

I would like to run specific long-running functions (which execute database queries) on a separate thread. However, let's assume that the underlying database engine only allows one connection at a time and the connection struct isn't Sync (I think at least the latter is true for diesel).
My solution would be to have a single separate thread (as opposed to a thread pool) where all the database-work happens and which runs as long as the main thread is alive.
I think I know how I would have to do this with passing messages over channels, but that requires quite some boilerplate code (e.g. explicitly sending the function arguments over the channel etc.).
Is there a more direct way of achieving something like this with Rust (and possibly tokio and the new async/await notation that is in nightly)?
I'm hoping to do something along the lines of:
let handle = spawn_thread_with_runtime(...);
let future = run_on_thread!(handle, query_function, argument1, argument2);
where query_function would be a function that immediately returns a future and does the work on the other thread.
Rust nightly and external crates / macros would be ok.
If external crates are an option, I'd consider taking a look at actix, an Actor Framework for Rust.
This will let you spawn an Actor in a separate thread that effectively owns the connection to the DB. It can then listen for messages, execute work/queries based on those messages, and return either sync results or futures.
It takes care of most of the boilerplate for message passing, spawning, etc. at a higher level.
There's also a Diesel example in the actix documentation, which sounds quite close to the use case you had in mind.

MPI non-blocking communication and pthreads difference?

In MPI, there are non-blocking calls like MPI_Isend and MPI_Irecv.
If I am working on a p2p project, the Server would listen to many clients.
One way to do it:
for (int i = 1; i < highest_rank; i++) {
    MPI_Irecv(...., i, ...., statuses[i]); // listening to all slaves
}
while (true) {
    for (int i = 1; i < highest_rank; i++) {
        checkStatus(statuses[i]);
        // if true, do something
    }
}
Another, older way that I could do it is:
the server creates many POSIX threads, passing in a function,
and that function calls MPI_Recv and loops forever.
Theoretically, which one would perform faster on the server end? If there is another better way to write the server, please let me know as well.
The latter solution does not seem very efficient to me because of all the overhead of managing the pthreads inside an MPI process.
Anyway, I would rewrite your MPI code as:
for (int i = 1; i < highest_rank; i++) {
    MPI_Irecv(...., i, ...., &requests[i]); // listening to all slaves
}
while (true) {
    MPI_Waitany(highest_rank, requests, &index, &status);
    // do something useful with the completed request at requests[index]
}
Even better, you can use MPI_Recv with MPI_ANY_SOURCE as the rank of the source of a message. It seems like your server does not have anything to do except serve requests, therefore there is no need to use an asynchronous recv.
Code would be:
while (true) {
    MPI_Recv(...., MPI_ANY_SOURCE, REQUEST_TAG, MPI_COMM_WORLD, &status);
    // retrieve the client id from status and do something
}
When calling MPI_Irecv, it is NOT safe to test the recv buffer until AFTER MPI_Test* or MPI_Wait* have been called and successfully completed. The behavior of directly testing the buffer without making those calls is implementation dependent (and ranges from not so bad to a segfault).
Setting up a 1:1 mapping with one MPI_Irecv for each remote rank can be made to work. Depending on the amount of data that is being sent, and the lifespan of that data once received, this approach may consume an unacceptable amount of system resources. Using MPI_Testany or MPI_Testall will likely provide the best balance between message processing and CPU load. If there is no non-MPI processing that needs to be done while waiting on incoming messages, MPI_Waitany or MPI_Waitall may be preferable.
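As a rough sketch of that balance (num_clients, the int payloads, and the tag value are invented for the example), a loop around MPI_Testany can interleave message handling with other useful work:

#include <mpi.h>
#include <vector>

void serve(int num_clients) {
    std::vector<MPI_Request> requests(num_clients);
    std::vector<int> payloads(num_clients);

    for (int i = 0; i < num_clients; ++i) // one outstanding recv per client rank
        MPI_Irecv(&payloads[i], 1, MPI_INT, i + 1, 0, MPI_COMM_WORLD, &requests[i]);

    while (true) {
        int index, flag;
        MPI_Status status;
        MPI_Testany(num_clients, requests.data(), &index, &flag, &status);
        if (flag && index != MPI_UNDEFINED) {
            // payloads[index] is now safe to read; handle it, then re-post the recv
            MPI_Irecv(&payloads[index], 1, MPI_INT, index + 1, 0,
                      MPI_COMM_WORLD, &requests[index]);
        } else {
            // no message ready: do useful non-MPI work here instead of spinning
        }
    }
}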
If there are outstanding MPI_Irecv calls, but the application has reached the end of normal processing, it is "necessary" to MPI_Cancel those outstanding calls. Failing to do that may be caught in MPI_Finalize as an error.
A single MPI_Irecv (or just MPI_Recv, depending on how aggressive the message handling needs to be) on MPI_ANY_SOURCE also provides a reasonable solution. This approach can also be useful if the amount of data received is "large" and can be safely discarded after processing. Processing a single incoming buffer at a time can reduce the total system resources required, at the expense of serializing the processing.
Let me just comment on your idea to use POSIX threads (or whatever other threading mechanism there might be). Making MPI calls from multiple threads at the same time requires that the MPI implementation be initialised with the highest level of thread support, MPI_THREAD_MULTIPLE:
int provided;
MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided);
if (provided != MPI_THREAD_MULTIPLE)
{
    printf("Error: MPI does not provide full thread support!\n");
    MPI_Abort(MPI_COMM_WORLD, 1);
}
Although the option to support concurrent calls from different threads was introduced in the MPI standard quite some time ago, there are still MPI implementations that struggle to provide fully working multithreaded support. MPI is all about writing portable applications, at least in theory, but in this case real life differs badly from theory. For example, one of the most widely used open-source MPI implementations, Open MPI, still does not support native InfiniBand communication (InfiniBand being the very fast, low-latency fabric used in most HPC clusters nowadays) when initialised at the MPI_THREAD_MULTIPLE level, and therefore switches to different, often much slower and higher-latency transports like TCP/IP over regular Ethernet or IP-over-InfiniBand. There are also some supercomputer vendors whose MPI implementations do not support MPI_THREAD_MULTIPLE at all, often because of the way the hardware works.
Besides, MPI_Recv is a blocking call, which poses problems with proper thread cancellation (if necessary). You have to make sure that all threads escape the infinite loop somehow, e.g. by sending each worker a termination message with an appropriate tag, or by some other protocol.
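Here is a sketch of such a termination protocol (TERMINATE_TAG and the int payload are invented for the example): each worker thread leaves its receive loop when a message carrying the shutdown tag arrives.

#include <mpi.h>

const int TERMINATE_TAG = 99; // hypothetical tag reserved for shutdown

void workerLoop() {
    while (true) {
        int payload;
        MPI_Status status;
        MPI_Recv(&payload, 1, MPI_INT, MPI_ANY_SOURCE, MPI_ANY_TAG,
                 MPI_COMM_WORLD, &status);
        if (status.MPI_TAG == TERMINATE_TAG)
            break; // escape the otherwise infinite loop
        // ... handle the request carried in payload ...
    }
}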

Inter-thread communication (worker threads)

I've created two threads, A and B, using the CreateThread Windows API. I'm trying to send data from thread A to B.
I know I can use an Event object and wait on it in the other thread using WaitForSingleObject. But all that event does is signal the thread. That's it! How can I send data? Also, I don't want thread B to wait until thread A signals; it has its own job to do. I can't make it wait.
I can't find a Windows function that will allow me to send data to/from the worker thread and the main thread, referencing the worker thread either by thread ID or by the returned HANDLE. I do not want to introduce an MFC dependency in my project and would like to hear any suggestions as to how others would handle, or have handled, this situation. Thanks in advance for any help!
First of all, you should keep in mind that Windows provides a number of mechanisms to deal with threading for you: I/O Completion Ports, old thread pools and new thread pools. Depending on what you're doing any of them might be useful for your purposes.
As to "sending" data from one thread to another, you have a couple of choices. Windows message queues are thread-safe, and a a thread (even if it doesn't have a window) can have a message queue, which you can post messages to using PostThreadMessage.
I've also posted code for a thread-safe queue in another answer.
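A minimal sketch of the message-queue approach (the WM_APP_DATA id and the payload are made up; note that PostThreadMessage fails until the target thread has created its queue by calling GetMessage or PeekMessage at least once):

#include <windows.h>

static const UINT WM_APP_DATA = WM_APP + 1; // hypothetical custom message

DWORD WINAPI workerProc(LPVOID) {
    MSG msg;
    while (GetMessage(&msg, NULL, 0, 0) > 0) { // pump this thread's queue
        if (msg.message == WM_APP_DATA) {
            int payload = (int)msg.wParam; // data travels in wParam/lParam
            // ... process payload ...
        }
    }
    return 0; // GetMessage returned 0 (WM_QUIT) or -1 (error)
}

// On the sending thread, with the worker's thread id:
//   PostThreadMessage(workerThreadId, WM_APP_DATA, (WPARAM)42, 0);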
As far as having the thread continue executing but take note when a change has happened, the typical method is to have it call WaitForSingleObject with a timeout value of 0, then check the return value: if it's WAIT_OBJECT_0, the event (or whatever) has been set, so the thread needs to take note of the change; if it's WAIT_TIMEOUT, there's been no change, and it can continue executing. Either way, WaitForSingleObject returns immediately.
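For instance, a sketch (the event handle and the surrounding work are placeholders):

#include <windows.h>

void workerLoop(HANDLE dataReadyEvent) {
    for (;;) {
        // Poll without blocking: a timeout of 0 returns immediately.
        DWORD result = WaitForSingleObject(dataReadyEvent, 0);
        if (result == WAIT_OBJECT_0) {
            // The event was signaled: pick up the shared data here.
        } else if (result == WAIT_TIMEOUT) {
            // No signal yet: nothing to do on that front.
        }
        // ... carry on with this thread's own work ...
    }
}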
Since the two threads are in the same process (at least that's what it sounds like), it is not necessary to "send" data. They can share it (e.g., via a simple global variable). You do need to synchronize access to it via an event, semaphore, mutex, etc.
Depending on what you are doing, it can be very simple.
int g_sharedData;   // some global data shared by both threads
HANDLE g_dataReady; // semaphore, e.g. from CreateSemaphore(NULL, 0, 1, NULL)

DWORD WINAPI Thread1Func(LPVOID) {
    g_sharedData = 42;                      // set some global data
    ReleaseSemaphore(g_dataReady, 1, NULL); // signal that it is available
    return 0;
}

DWORD WINAPI Thread2Func(LPVOID) {
    WaitForSingleObject(g_dataReady, INFINITE); // check/wait until data is available
    // use g_sharedData
    return 0;
}
If you are concerned with minimizing Windows dependencies, and assuming you are coding in C++, then I recommend using Boost.Threads, which is a pretty nice, Posix-like C++ threading interface. This will give you easy portability between Windows and Linux.
If you go this route, then use a mutex to protect any data shared across threads, and a condition variable (combined with the mutex) to signal one thread from the other.
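A minimal sketch of that combination, assuming Boost.Threads is available (the int payload and the flag name are invented):

#include <boost/thread/condition_variable.hpp>
#include <boost/thread/locks.hpp>
#include <boost/thread/mutex.hpp>

boost::mutex              g_mutex;
boost::condition_variable g_cond;
int  g_data = 0;
bool g_dataReady = false;

void producer() {
    boost::unique_lock<boost::mutex> lock(g_mutex);
    g_data = 42;         // publish the shared data
    g_dataReady = true;
    g_cond.notify_one(); // wake the waiting thread
}

void consumer() {
    boost::unique_lock<boost::mutex> lock(g_mutex);
    while (!g_dataReady)
        g_cond.wait(lock); // releases the mutex while sleeping
    // use g_data here
}

The same shape works essentially verbatim with std::mutex and std::condition_variable if you later move to C++11.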
Don't use a mutex when you're only working within a single process, because it has more overhead (since it is a system-wide object)... Place a critical section around your data and try to enter it (as Jerry Coffin did in his code for the thread-safe queue).
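For example, a sketch (the shared integer is a stand-in for whatever data your threads exchange):

#include <windows.h>

CRITICAL_SECTION g_lock; // call InitializeCriticalSection(&g_lock) once at startup
int g_sharedData;

void writeShared(int value) {
    EnterCriticalSection(&g_lock);
    g_sharedData = value;          // only one thread can be in here at a time
    LeaveCriticalSection(&g_lock);
}

int readShared() {
    EnterCriticalSection(&g_lock);
    int value = g_sharedData;
    LeaveCriticalSection(&g_lock);
    return value;
}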

data structures for scheduling workflow?

I'm wondering what kind(s) of data structures/algorithms might help facilitate handling the following situation; I'm not sure if I need a single FIFO, a priority queue, or multiple FIFOs.
I have N objects that must proceed through a predefined workflow. Each object must complete step 1, then step 2, then step 3, then step 4, etc. Each step is either done quickly or involves a "wait" that depends on something external to finish (like the completion of a file operation or whatever). Each object maintains its own state. If I had to define an interface for these objects, it would be something like this (written below in pseudo-Java, but this question is language-agnostic):
public interface TaskObject
{
    public enum State { READY, WAITING, DONE };
    // READY = ready to execute next step
    // WAITING = awaiting some external condition
    // DONE = finished all steps

    public int getCurrentStep();
    // returns # of current step

    public int getEndStep();
    // returns # of the step which is the DONE case

    public State getState();
    // checks state and returns it;
    // multiple calls will always be identical,
    // except WAITING, which can transition to READY or DONE

    public State executeStep();
    // if READY, executes the next step and returns getState();
    // otherwise, returns getState()
}
I need to write a single-threaded scheduler that calls executeStep() on the "next" object. My problem is, I'm not sure exactly what technique I should use to determine what the "next" object is. I want it to be fair (first-come, first-serve for objects not in the WAITING state).
My gut call is to have three FIFOs: READY, WAITING, and DONE. In the beginning, all objects are placed in the READY queue, and the scheduler repeats a loop where it takes the first object off the READY queue, calls executeStep(), and places it onto the queue appropriate to the result of executeStep(). Except that items in the WAITING queue need to be put into the READY or DONE queue when their state changes... argh!
Any advice?
If this has to be single-threaded, you can use a single FIFO queue for the ready and waiting objects and use your thread to process each object as it comes out. If its state changes to WAITING, simply stick it back into the queue and it will be reprocessed.
Something like (pseudocode):
var item = queue.getNextItem();
var state = item.executeStep();
if (state == WAITING)
    queue.addItem(item);
else if (state == DONE)
    // add to collection of done objects
Depending on the time executeStep takes to run, you may need to introduce a delay (e.g. a short sleep) to prevent a tight polling loop. Ideally you would have the objects publish state-change events and do away with the polling altogether.
This is the kind of timeslicing approach that was commonplace in hardware and comms software before multithreading was widespread.
You don't have any way for the task object to notify you when it changes from WAITING to READY except polling it, so the WAITING and READY queues could really just be one. You can just loop around it, calling executeStep() on each one in turn. If you receive DONE as the return value from executeStep(), then you remove it from that queue, stick it on the DONE queue, and forget about it.
If you wanted to give more priority to READY objects and attempt to run through all possible READY objects before wasting any resources polling WAITING ones, you can maintain three queues like you said and only process the WAITING queue when you have nothing in the READY queue.
I personally would spend some effort to eliminate the polling of the state, and instead define an interface that the object could use to notify your scheduler when a state changes.
You might want to study the design of an operating system scheduler. Check out the Linux and *BSD for example.
Some pointers for the Linux scheduler: Inside the Linux scheduler and Understanding the Linux Kernel
NOTE - this does not address your question of how to schedule, but I would use a separate state class that defines the states and transitions. The objects should not know what states they should go through; they can be informed of what "step" they are at, etc.
There are some patterns for that as well.
You should read up a little on operating systems, specifically the scheduler. Your example is a scaled-down version of that problem, and if you copy the relevant parts it should work great for you.
You can then add priority, etc.
The simplest technique that satisfies the requirements in your question is to repeatedly iterate over all TaskObjects calling executeStep() on each one.
This requires only one construct to hold the TaskObjects, and it can be any iterable structure, e.g. an array.
Since a TaskObject can transition from WAITING to READY asynchronously, you have to poll every TaskObject that you don't know is DONE.
The performance gained from not polling the DONE TaskObjects may be negligible. It depends on the processing load of calling executeStep() on a DONE TaskObject, which should be small.
A simple round-robin polling assures that once a READY TaskObject has executed a step, it will not execute another step until all other TaskObjects have had a chance to execute.
One obvious additional requirement is detecting when all TaskObjects are in the DONE state so you can stop processing.
To avoid polling DONE TaskObjects you will need to either maintain a flag for each one, or chain the TaskObjects in two queues: READY/WAITING and DONE.
If you store the TaskObjects in an array, make it an array of records, with members DoneFlag and TaskObject.
If for some reason you are storing the TaskObjects in a queue, with available enqueue() and dequeue() methods, then the overhead of two queues instead of one may be small.
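As a concrete sketch of that round-robin loop (written in C++ for definiteness; this TaskObject is a made-up stand-in that simply finishes after a fixed number of steps):

#include <cstddef>
#include <vector>

enum class State { READY, WAITING, DONE };

// Hypothetical task that becomes DONE after a fixed number of steps.
struct TaskObject {
    int step = 0;
    int endStep = 3;
    State executeStep() {
        if (step < endStep) ++step;          // do one unit of work
        return step < endStep ? State::READY : State::DONE;
    }
};

// Round-robin: each task executes one step before any task gets a second one.
// DONE tasks are still polled, but the call is cheap.
void runAll(std::vector<TaskObject>& tasks) {
    std::size_t done = 0;
    while (done < tasks.size()) {
        done = 0;
        for (TaskObject& t : tasks)
            if (t.executeStep() == State::DONE)
                ++done;
    }
}

int main() {
    std::vector<TaskObject> tasks(4);
    runAll(tasks); // loops until every task reports DONE
}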
-Al.
Take a look at this link:
Boost state machines vs uml
Boost has state machines. Why reinvent?
