Correct usage of boost::asio for a multi-client process - boost

I'm trying to use boost::asio for the first time to write a process that connects to N servers reads data from them.
My question regards the way in which asynchronicity works. My design goal is to connect to all servers in parallel, and also read data from every server in parallel. This should be done with async_connect and async_read, and calling io_service::run() N times, then reading the results. And the question is: is it enough to call io_service::run() from a single thread, sequentially, N times, in order to achieve parallelism?
Note that this is a matter of the implementation of asio: specifically, when calling connect_async and write_async, does the call signal the OS to begin connecting/reading before returning, or does it simply delegate a synchronous connect/read task to the worker thread and returns immediately? - case in which calling io_service::run() from a single thread means serial execution of tasks.
My guess is the former, of course, but I need someone to please confirm. I find it off that the documentation for async stuff (http://think-async.com/Asio/boost_asio_1_3_1/doc/html/boost_asio/overview/core/basics.html) doesn't mention when the async_xxx calls return, which would clarify my question.

The heart of asio is an event loop, which begins with the call to io_service::run(), which is a blocking call. When you call async_connect, you queue up the connect operation in the io_services event queue. To achieve parallelism, you must create a thread pool and have each thread call run() on the same io_service instance.

Related

Is it possible to share the same message queue between two threads?

What I would like to do is to have one thread waiting for messages (WaitMessage) and another processing the logic of the application. The first thread would wake up on every message, signal somehow this event to the other thread, go to sleep again, etc. Is this possible?
UPDATE
Consider the following situation. We have a GUI thread, and this thread is busy in a long calculation. If there is no other thread, there is no option but to check for new messages from time to time. Otherwise, the GUI would become irresponsive during the long calculation. Right now my system uses this "polling" approach (it has a single thread that checks the message queue from time to time.) However, I would like to know whether this other solution is possible: Have another thread waiting on the OS message queue of the GUI so that when a Windows message arrives this thread will wake up and tell the other about the message. Note that I'm not asking how to communicate the news between threads but whether it is possible for the second thread to wait for OS messages that arrive in the queue of the first thread.
I should also add that I cannot have two different threads, one for the GUI and another for the calculations, because the system I'm working on is a Virtual Machine on top of which runs a Smalltalk image that is not thread safe. That's why having a thread that only signals new OS messages would be the ideal solution (if possible.)
This depends on what the second thread needs to do once the first thread has received a message.
If the second thread simply needs to know the first thread received a message, the first thread could signal an Event object using SetEvent() or PulseEvent(), and the second thread could wait on that event using WaitForSingleObject().
If the second thread needs data from the first thread, it could use an I/O Completion Port. The first thread could wrap the data inside a dynamically allocated struct and post it to the port using PostQueuedCompletionStatus(), and the second thread could wait for the data using GetQueuedCompletionStatus() and then free it when done using it.
Update: based on new information you have provided, it is not possible for one thread to wait on or service another thread's message queue. Only the thread that created and owns the queue can poll messages from its queue. Each thread has its own message queue.
You really need to move your long calculations to a different thread, they don't belong in the GUI thread to begin with. Let the GUI thread manage the GUI and service messages, do any long-running things in another thread.
If you can't do that because your chosen library is not thread safe, then you have 4 options:
find a different library that is thread safe.
have the calculations poll the message queue periodically when running in the GUI thread.
break up the calculations into small chunks that can be triggered by the GUI thread posting messages to itself. Post a message and return to the message loop. When the message is received, do a little bit of work, post the next message, and return to the message loop. Repeat as needed until the work is done. This allows the GUI thread to continue servicing the message queue in between each calculation step.
move the library to a separate process that communicates back with your main app as needed.

How to force GetQueuedCompletionStatus() to return immediately?

I have hand-made thread pool. Threads read from completion port and do some other stuff. One particular thread has to be ended. How to interrupt it's waiting if it hangs on GetQueuedCompletionStatus() or GetQueuedCompletionStatusEx()?
Finite timeout (100-1000 ms) and exiting variable are far from elegant, cause delays and left as last resort.
CancelIo(completionPortHandle) within APC in target thread causes ERROR_INVALID_HANDLE.
CancelSynchronousIo(completionPortHandle) causes ERROR_NOT_FOUND.
PostQueuedCompletionStatus() with termination packet doesn't allow to choose thread.
Rough TerminateThread() with mutex should work. (I haven't tested it.) But is it ideologically good?
I tried to wait on special event and completion port. WaitForMultipleObjects() returned immediately as if completion port was signalled. GetQueuedCompletionStatus() shows didn't return anything.
I read Overlapped I/O: How to wake a thread on a completion port event or a normal event? and googled a lot.
Probably, the problem itself – ending thread's work – is sign of bad design and all my threads should be equal and compounded into normal thread pool. In this case, PostQueuedCompletionStatus() approach should work. (Although I have doubts that this approach is beautiful and laconic especially if threads use GetQueuedCompletionStatusEx() to get multiple packets at once.)
If you just want to reduce the size of the thread pool it doesn't matter which thread exits.
However if for some reason you need to signal to an particular thread that it needs to exit, rather than allowing any thread to exit, you can use this method.
If you use GetQueuedCompletionStatusEx you can do an alertable wait, by passing TRUE for fAlertable. You can then use QueueUserAPC to queue an APC to the thread you want to quit.
https://msdn.microsoft.com/en-us/library/windows/desktop/ms684954(v=vs.85).aspx
If the thread is busy then you will still have to wait for the current work item to be completed.
Certainly don't call TerminateThread.
Unfortunately, I/O completion port handles are always in a signaled state and as such cannot really be used in WaitFor* functions.
GetQueuedCompletionStatus[Ex] is the only way to block on the completion port. With an empty queue, the function will return only if the thread becomes alerted. As mentioned by #Ben, the QueueUserAPC will make the the thread alerted and cause GetQueuedCompletionStatus to return.
However, QueueUserAPC allocates memory and thus can fail in low-memory conditions or when memory quotas are in effect. The same holds for PostQueuedCompletionStatus. As such, using any of these functions on an exit path is not a good idea.
Unfortunately, the only robust way seems to be calling the undocumented NtAlertThread exported by ntdll.dll.
extern "C" NTSTATUS __stdcall NtAlertThread(HANDLE hThread);
Link with ntdll.lib. This function will put the target thread into an alerted state without queuing anything.

function posted to boost::asio::io_service::strand not executed

I am using boost::asio in a quite complex scenario and I am experiencing a problem where a method posted to a boost::asio::io_service::strand object is not executed although several worker threads on the io_service are running and idle.
As I said, the scenario is very complex, I'm still trying to develop a reasonably small repro scenario. But the conditions are as follows:
one io_service is running and has a work-object assigned to it
4 worker threads are assigned to the io_service (called io_service::run on each)
several strand objects are used to post numerous different tasks
in some of the tasks that are executed via the strands, new tasks are posted to the strand
The whole system works well and stable, except for one situation:
When calling the destructor of one of the classes, it posts the abort handler to the strand (to initiate aborting in sync with the other taks) and then waits until abort is done.
Every once in a while it now happens, that the abort handler is never executed (destructor is called from an invocation of another strand object).
I assume the problem is that the strand waits for executing the handler on the same thread that it has been posted. And since this thread is waiting for the abort handler to be executed the program deadlocks.
My questions now:
- is my assumption correct?
- is there any way to avoid this situation?
- how would you approach that problem (having several async tasks running and need to abort them synchronously)
Thanx a lot for your help!
m.

Clarification on Threads and Run Loops In Cocoa

I'm trying to learn about threading and I'm thoroughly confused. I'm sure all the answers are there in the apple docs but I just found it really hard to breakdown and digest. Maybe somebody could clear a thing or 2 up for me.
1)performSelectorOnMainThread
Does the above simply register an event in the main run loop or is it somehow a new thread even though the method says "mainThread"? If the purpose of threads is to relieve processing on the main thread how does this help?
2) RunLoops
Is it true that if I want to create a completely seperate thread I use
"detachNewThreadSelector"? Does calling start on this initiate a default run loop for the thread that has been created? If so where do run loops come into it?
3) And Finally , I've seen examples using NSOperationQueue. Is it true to say that If you use performSelectorOnMainThread the threads are in a queue anyway so NSOperation is not needed?
4) Should I forget about all of this and just use the Grand Central Dispatch instead?
Run Loops
You can think of a Run Loop to be an event processing for-loop associated to a thread. This is provided by the system for every thread, but it's only run automatically for the main thread.
Note that running run loops and executing a thread are two distinct concepts. You can execute a thread without running a run loop, when you're just performing long calculations and you don't have to respond to various events.
If you want to respond to various events from a secondary thread, you retrieve the run loop associated to the thread by
[NSRunLoop currentRunLoop]
and run it. The events run loops can handle is called input sources. You can add input sources to a run-loop.
PerformSelector
performSelectorOnMainThread: adds the target and the selector to a special input source called performSelector input source. The run loop of the main thread dequeues that input source and handles the method call one by one, as part of its event processing loop.
NSOperation/NSOperationQueue
I think of NSOperation as a way to explicitly declare various tasks inside an app which takes some time but can be run mostly independently. It's easier to use than to detach the new thread yourself and maintain various things yourself, too. The main NSOperationQueue automatically maintains a set of background threads which it reuses, and run NSOperations in parallel.
So yes, if you just need to queue up operations in the main thread, you can do away with NSOperationQueue and just use performSelectorOnMainThread:, but that's not the main point of NSOperation.
GCD
GCD is a new infrastructure introduced in Snow Leopard. NSOperationQueue is now implemented on top of it.
It works at the level of functions / blocks. Feeding blocks to dispatch_async is extremely handy, but for a larger chunk of operations I prefer to use NSOperation, especially when that chunk is used from various places in an app.
Summary
You need to read Official Apple Doc! There are many informative blog posts on this point, too.
1)performSelectorOnMainThread
Does the above simply register an event in the main run loop …
You're asking about implementation details. Don't worry about how it works.
What it does is perform that selector on the main thread.
… or is it somehow a new thread even though the method says "mainThread"?
No.
If the purpose of threads is to relieve processing on the main thread how does this help?
It helps you when you need to do something on the main thread. A common example is updating your UI, which you should always do on the main thread.
There are other methods for doing things on new secondary threads, although NSOperationQueue and GCD are generally easier ways to do it.
2) RunLoops
Is it true that if I want to create a completely seperate thread I use "detachNewThreadSelector"?
That has nothing to do with run loops.
Yes, that is one way to start a new thread.
Does calling start on this initiate a default run loop for the thread that has been created?
No.
I don't know what you're “calling start on” here, anyway. detachNewThreadSelector: doesn't return anything, and it starts the thread immediately. I think you mixed this up with NSOperations (which you also don't start yourself—that's the queue's job).
If so where do run loops come into it?
Run loops just exist, one per thread. On the implementation side, they're probably lazily created upon demand.
3) And Finally , I've seen examples using NSOperationQueue. Is it true to say that If you use performSelectorOnMainThread the threads are in a queue anyway so NSOperation is not needed?
These two things are unrelated.
performSelectorOnMainThread: does exactly that: Performs the selector on the main thread.
NSOperations run on secondary threads, one per operation.
An operation queue determines the order in which the operations (and their threads) are started.
Threads themselves are not queued (except maybe by the scheduler, but that's part of the kernel, not your application). The operations are queued, and they are started in that order. Once started, their threads run in parallel.
4) Should I forget about all of this and just use the Grand Central Dispatch instead?
GCD is more or less the same set of concepts as operation queues. You won't understand one as long as you don't understand the other.
So what are all these things good for?
Run loops
Within a thread, a way to schedule things to happen. Some may be scheduled at a specific date (timers), others simply “whenever you get around to it” (sources). Most of these are zero-cost when idle, only consuming any CPU time when the thing happens (timer fires or source is signaled), which makes run loops a very efficient way to have several things going on at once without any threads.
You generally don't handle a run loop yourself when you create a scheduled timer; the timer adds itself to the run loop for you.
Threads
Threads enable multiple things to happen at the exact same time on different processors. Thing 1 can happen on thread A (on processor 1) while thing 2 happens on thread B (on processor 0).
This can be a problem. Multithreaded programming is a dance, and when two threads try to step in the same place, pain ensues. This is called contention, and most discussion of threaded programming is on the topic of how to avoid it.
NSOperationQueue and GCD
You have a thing you need done. That's an operation. You can't have it done on the main thread, or you'd simply send a message like normal; you need to run it in the background, on a secondary thread.
To achieve this, express it as either an NSOperation object (you create a subclass of NSOperation and instantiate it) or a block (or both), then add it to either an NSOperationQueue (NSOperations, including NSBlockOperation) or a dispatch queue (bare block).
GCD can be used to make things happen on the main thread, as well; you can create serial queues and add blocks to them. A serial queue, as its name suggests, will run exactly one block at a time, rather than running a bunch of them in parallel.
So what should I do?
I would not recommend creating threads directly. Use NSOperationQueue or GCD instead; they force you into better thinking habits that will reduce the risk of your threaded code inducing headaches.
For things that run periodically, not fitting into the “thing I need done” model of NSOperations and GCD blocks, consider just using the run loop on the main thread. Chances are, you don't need to put it on a thread after all. A rendering loop in a 3D game, for example, can be a simple timer.

Why use ReadDirectoryChangesW asynchronously?

I've read the documentation for ReadDirectoryChangesW() and also seen the CDirectoryChangeWatcher project, but neither say why one would want to call it asynchronously. I understand that the current thread will not block, but, at least for the CDirectoryChangeWatcher code that uses a completion port, when it calls GetQueuedCompletionStatus(), that thread blocks anyway (if there are no changes).
So if I call ReadDirectoryChangesW() synchronously in a separate thread in the first place that I don't care if it blocks, why would I ever want to call ReadDirectoryChangesW() asynchronously?
When you call it asynchronously, you have more control over which thread does the waiting. It also allows you to have a single thread wait for multiple things, such as a directory change, an event, and a message. Finally, even if you're doing the waiting in the same thread that set up the watch in the first place, it gives you control over how long you're willing to wait. GetQueuedCompletionStatus has a timeout parameter that ReadDirectoryChangesW doesn't offer by itself.
You would call ReadDirectoryChangesW such that it returns its results asynchronously if you ever needed the calling thread to not block. A tautology, but the truth.
Candidates for such threads: the UI thread & any thread that is solely responsible for servicing a number of resources (Sockets, any sort of IPC, independent files, etc.).
Not being familiar with the project, I'd guess the CDirectoryChangeWatcher doesn't care if its worker thread blocks. Generally, that's the nature of worker threads.
I tried using ReadDirectoryChanges in a worker thread synchronously, and guess what, it blocked so that the thread wouldn't exit by itself at the program exit.
So if you don't want to use evil things like TerminateThread, you should use asynchronous calls.

Resources