Boost Asio, asynchronous server and video tracking

I need to transmit, via a (Boost) TCP server, information collected in real time by the ARToolKit video tracking library.
What is the right way to do it?
I'm currently doing it with Boost threads and Asio, but I think my approach is poor (even though it works).
Here is what I do to run the server (the source of the Server class is from the Boost tutorial):
boost::asio::io_service io_service;
Server s(io_service, 2345);
boost::thread bt(boost::bind(&boost::asio::io_service::run, &io_service)); //server in background in a second thread
Then I start the video tracking:
startTracking(); //blocking call in the main thread
which is defined this way:
void startTracking(){
glutInit(&argc, argv); //global and reachable
if ((gArglSettings = arglSetupForCurrentContext()) == NULL) {
fprintf(stderr, "main(): arglSetupForCurrentContext() returned error.\n");
exit(-1);
}
... //init a lot of artoolkit parameters
arVideoCapStart();
argMainLoop( NULL, keyEvent, mainLoop );
}
Done in this (horrible) way, everything works. But I would like to avoid spawning a second thread for the Asio server (as far as I understand from the Boost documentation, it is not supposed to be run that way).
However, moving the video tracking out of the main thread instead crashes the ARToolKit library, i.e.:
boost::thread workerThread(startTracking);
workerThread.join();
When join() runs, the program segfaults at the glutInit call.

What do you think the workerThread.join() method does? Take a look at the answer to this question. Calling join blocks the thread it is called from (here, the main thread) until the worker thread has completed. Is that what you want? If you have set up Asio to run on the main thread, none of the Asio socket handlers can execute while that thread is blocked in join, so the server will appear to hang. Likewise for the ARToolKit library: if the calls to it were initiated on the main thread, it too will appear to freeze while that thread is blocked in join.
If this is not your problem, then please provide more code.
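To illustrate the point about join, here is a minimal sketch that uses std::thread as a stand-in for boost::thread (join blocks the caller in the same way for both). The function name run_with_background_worker and the sleeps that simulate work are assumptions made for illustration; they are not part of the question's code. The background thread plays the role of io_service::run(), and the sleep on the calling thread plays the role of the blocking startTracking() call.

```cpp
#include <atomic>
#include <chrono>
#include <thread>

// Hypothetical demo: returns how many iterations the background thread
// completed while the calling thread was busy with its own blocking work.
int run_with_background_worker() {
    std::atomic<bool> stop{false};
    std::atomic<int> ticks{0};

    // Stand-in for the thread that runs io_service::run() in the question.
    std::thread worker([&] {
        do {
            ++ticks;
            std::this_thread::sleep_for(std::chrono::milliseconds(1));
        } while (!stop.load());
    });

    // Stand-in for the blocking call that stays on the calling thread
    // (startTracking() in the question).
    std::this_thread::sleep_for(std::chrono::milliseconds(50));

    stop.store(true);
    worker.join();  // blocks the caller until the worker has returned
    return ticks.load();
}
```

The join is placed after the caller's own blocking work. Calling it right after constructing the thread would make the caller wait for the worker first, so the two pieces of work would run one after the other rather than concurrently.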


Synchronization primitives in DriverKit

In a DriverKit extension, I would like to block a call from a user client until a specific hardware interrupt fires. Since there are no semaphores available (Does the DriverKit SDK support semaphores?), I've reached for a very basic spinlock using an _Atomic(bool) member and busy waiting:
struct IVars
{
volatile _Atomic(bool) InterruptOccurred = false;
};
// In the user client method handler
{
// Clear the flag
atomic_store(&ivars->InterruptOccurred, false);
// Set up the interrupt on the device
...
// Wait for the interrupt
while (!atomic_load(&ivars->InterruptOccurred))
{
IOSleep(10);
}
}
// In the interrupt handler
{
bool expected = false;
if (atomic_compare_exchange_strong(&ivars->InterruptOccurred, &expected, true))
{
return;
}
// Proceed with normal handling if the user client method is not waiting
}
The user client method is called infrequently and the interrupt is guaranteed to fire within 100ms, so in principle busy waiting should be acceptable, but I am not very happy with the solution. I haven't worked with spinlocks before and they make me feel rather uneasy.
I would like to avoid taking an IOLock in the interrupt handler. Is there any other synchronization primitive in DriverKit I could reach for? I guess a cleaner way to handle this would be for the user client method to accept a callback that fires on the interrupt, but that would still require synchronization with the interrupt handler and would complicate the client application code.
Preliminaries
I would like to avoid taking an IOLock in the interrupt handler.
I assume you're aware that, this being DriverKit, this isn't running in the context of a primary interrupt controller, but you're already behind a layer of Mach messaging, kernel/user context switch, and IODispatchQueue serialisation?
Possible solutions:
Since there are no semaphores available[…]
OSAction
The OSAction class contains a set of methods for sleeping in a thread until the action is invoked (WillWait/Wait/EndWait). This might be a feasible way of implementing what you're trying to do. As usual, the documentation is in the header/iig file but hasn't made it into the web-based API docs.
IODispatchQueue
As of DriverKit 21 (macOS 12), you also get Apple's simpler Sleep/Wakeup event system baked into IODispatchQueue, which you might be familiar with from the kernel. (It is also similar to pthreads condition variables.) Note you need to create the queue with the kIODispatchQueueReentrant option in this case.
From DriverKit 22 (macOS 13/iPadOS) on, there's also a version of the sleep that takes a deadline, SleepWithDeadline.
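For readers who know condition variables better than the kernel's Sleep/Wakeup events, the pattern can be sketched in portable standard C++. This is only an analogy: it does not use any DriverKit API, and InterruptEvent, signal, wait_for and demo_wait_for_interrupt are hypothetical names chosen for the example. The timed wait corresponds roughly to a sleep with a deadline.

```cpp
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <thread>

// Portable analogy only: standard C++, not DriverKit.
struct InterruptEvent {
    std::mutex m;
    std::condition_variable cv;
    bool occurred = false;

    // Would be called from the (hypothetical) interrupt handler.
    void signal() {
        {
            std::lock_guard<std::mutex> lock(m);
            occurred = true;
        }
        cv.notify_all();
    }

    // Would be called from the waiting method; returns false on timeout.
    bool wait_for(std::chrono::milliseconds timeout) {
        std::unique_lock<std::mutex> lock(m);
        bool ok = cv.wait_for(lock, timeout, [this] { return occurred; });
        occurred = false;  // re-arm for the next wait
        return ok;
    }
};

// Demo: a simulated interrupt fires after 10 ms; the waiter allows up to 1 s.
bool demo_wait_for_interrupt() {
    InterruptEvent ev;
    std::thread irq([&ev] {
        std::this_thread::sleep_for(std::chrono::milliseconds(10));
        ev.signal();
    });
    bool ok = ev.wait_for(std::chrono::milliseconds(1000));
    irq.join();
    return ok;
}
```

Unlike the busy-wait loop in the question, the waiter here consumes no CPU time while blocked and wakes as soon as the event is signalled.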
Async callbacks
I guess a cleaner way to handle this would be for the user client method to accept a callback that fires on the interrupt, but that would still require synchronization with the interrupt handler and would complicate the client application code.
If you're happy calling the async callback in the app on every interrupt, there's not really any synchronisation needed; you can just invoke the same OSAction repeatedly. Even if you want to only invoke the async call on the "next" interrupt, atomic compare-and-swap should be sufficient for the interrupt handler to claim the OSAction* pointer.
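The "claim the pointer with compare-and-swap" idea can be sketched in standard C++ as follows. Callback is a hypothetical stand-in for the stored OSAction, and OneShotSlot, arm and fire are names invented for this example; none of them are DriverKit API.

```cpp
#include <atomic>

// Hypothetical stand-in for the OSAction* a driver would store.
struct Callback {
    int invocations = 0;
    void invoke() { ++invocations; }
};

struct OneShotSlot {
    std::atomic<Callback*> pending{nullptr};

    // Client side: arm the slot so the next interrupt invokes cb once.
    void arm(Callback* cb) { pending.store(cb); }

    // Interrupt side: claim the pointer with compare-and-swap so that at
    // most one caller gets to invoke it. Returns true if a callback ran.
    bool fire() {
        Callback* cur = pending.load();
        while (cur != nullptr &&
               !pending.compare_exchange_weak(cur, nullptr)) {
            // A failed exchange reloads cur with the current value; retry.
        }
        if (cur == nullptr) {
            return false;  // nothing armed, or someone else claimed it
        }
        cur->invoke();
        return true;
    }
};
```

Arming the slot and firing twice invokes the callback exactly once: the first fire() claims the pointer and leaves nullptr behind, so the second returns false.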
Important note:
With all of these potential solutions except IODispatchQueue::Sleep and the async callback: bear in mind that sleeping in the context of a user client external method will block the dispatch queue and thus any other calls to external methods in that user client will fail to make progress. (As well as any other methods scheduled to that queue.)

Thread wait reasons

I've been using code that I found in the following post:
How to get thread state (e.g. suspended), memory + CPU usage, start time, priority, etc
I'm examining thread state, and there's the following enum that describes the reasons for thread 'waiting' status -
enum KWAIT_REASON
{
Executive,
FreePage,
PageIn,
PoolAllocation,
DelayExecution,
Suspended,
UserRequest,
WrExecutive,
WrFreePage,
WrPageIn,
WrPoolAllocation,
WrDelayExecution,
WrSuspended,
WrUserRequest,
WrEventPair,
WrQueue,
WrLpcReceive,
WrLpcReply,
WrVirtualMemory,
WrPageOut,
WrRendezvous,
Spare2,
Spare3,
Spare4,
Spare5,
Spare6,
WrKernel,
MaximumWaitReason
};
Can anyone explain what WrQueue is, and perhaps what the difference between WrUserRequest and UserRequest is?
The information is obtained using NtQuerySystemInformation() with SystemProcessInformation.
WrQueue means the thread is waiting on a KQUEUE object (see its definition in wdm.h) in the kernel. This can result from a call to ZwRemoveIoCompletion or its Win32 wrapper GetQueuedCompletionStatus (an IOCP is exactly a KQUEUE object). Alternatively, beginning with Vista, the thread may have called ZwWaitForWorkViaWorkerFactory (a worker factory uses a KQUEUE internally). It is also possible that a thread in the kernel called KeRemoveQueue; system worker threads usually do this.
WrUserRequest is used by the win32k.sys subsystem. Usually this is when a thread calls GetMessage. So if we see WrUserRequest, we can be fairly sure that the thread is waiting for window messages.
UserRequest means that the thread is waiting on one or more objects via WaitForSingleObject[Ex], WaitForMultipleObjects[Ex] or MsgWaitForMultipleObjects[Ex] (or their equivalents).

Guaranteed way to cancel a hanging Task?

I often have to execute code on a separate thread that is long running, blocking, unstable and/or has the potential to hang forever. Since the TPL was introduced, the internet has been full of examples that nicely cancel a task with the cancellation token, but I have never found an example that kills a task that hangs. Code that hangs forever is likely to be expected as soon as you communicate with hardware or call some third party code. A task that hangs cannot check the cancellation token and is doomed to stay alive forever. In critical applications I equip those tasks with alive signals that are sent at regular time intervals. As soon as a hanging task is detected, it is killed and a new instance is started.
The code below shows an example task that calls a long running placeholder method SomeThirdPartyLongOperation() which has the potential to hang forever. StopTask() first checks whether the task is still running and tries to cancel it with the cancellation token. If that doesn't work, the task hangs and the underlying thread is interrupted/aborted old school style.
private Task _task;
private Thread _thread;
private CancellationTokenSource _cancellationTokenSource;
public void StartTask()
{
_cancellationTokenSource = new CancellationTokenSource();
_task = Task.Factory.StartNew(() => DoWork(_cancellationTokenSource.Token), _cancellationTokenSource.Token, TaskCreationOptions.LongRunning, TaskScheduler.Default);
}
public void StopTask()
{
if (_task.Status == TaskStatus.RanToCompletion)
return;
_cancellationTokenSource.Cancel();
try
{
_task.Wait(2000); // Wait for task to end and prevent hanging by timeout.
}
catch (AggregateException aggEx)
{
List<Exception> exceptions = aggEx.InnerExceptions.Where(e => !(e is TaskCanceledException)).ToList(); // Ignore TaskCanceledException
foreach (Exception ex in exceptions)
{
// Process exception thrown by task
}
}
if (!_task.IsCompleted) // Task hangs and didn't respond to cancellation token => old school thread abort
{
_thread.Interrupt();
if (!_thread.Join(2000))
{
_thread.Abort();
}
}
_cancellationTokenSource.Dispose();
if (_task.IsCompleted)
{
_task.Dispose();
}
}
private void DoWork(CancellationToken cancellationToken)
{
if (string.IsNullOrEmpty(Thread.CurrentThread.Name)) // Set thread name for debugging
Thread.CurrentThread.Name = "DemoThread";
_thread = Thread.CurrentThread; // Save for interrupting/aborting if thread hangs
for (int i = 0; i < 10; i++)
{
cancellationToken.ThrowIfCancellationRequested();
SomeThirdPartyLongOperation(i);
}
}
Although I've been using this construct for some years now, I want to know if there are potential mistakes in it. I've never seen an example of a task that saves the underlying thread or gives it a name to simplify debugging, so I'm a bit unsure if this is the right way to go. Comments on any detail are welcome!
Code that hangs forever is likely to be expected as soon as you communicate with hardware or call some third party code.
Communication: absolutely not. There's always a way to timeout with communication APIs, so even with misbehaving hardware, there's no need to force-kill an I/O operation.
Third-party code: only if you're paranoid (or have high demands such as 24x7 automation).
Here's the bottom line:
There's no way to force-kill a task.
You can force-kill a thread, but this can easily cause serious problems with application state, the possibility of introducing deadlocks in other parts of the code, and resource leaks.
You can force-kill an appdomain, which solves a large portion of app state / deadlock issues with killing threads. However, it doesn't solve them all, and there's still the problem of resource leaks.
You can force-kill a process. This is the only truly clean and reliable solution.
So, if you choose to trust the third-party code, I recommend that you just call it like any other API. If you require 100% reliability regardless of third-party libraries, you'll need to wrap the third-party dll into a separate process and use cross-process communication to call it.
Your current code force-kills a thread pool thread, which is certainly not recommended; those threads belong to the thread pool, not to you, and this is still true even if you specify LongRunning. If you go the kill-thread route (which is not fully reliable), then I recommend using an explicit thread.
The question is: why is this task hanging at all? I think there's no universal solution to this problem, but you should focus on making the task always responsive rather than on forcibly interrupting it.
In this code, it looks like you want a plain thread rather than a task. You shouldn't tie tasks to threads: it's very likely that the task will switch to another thread after some async operation, and you will end up killing an innocent thread that is no longer connected to your task. If you really need to kill the whole thread, then create a dedicated one just for this job.
You also shouldn't name, or otherwise modify, any thread that belongs to the tasks' default pool. Consider this code:
static void Main(string[] args)
{
Task.Run(sth);
Console.Read();
}
static async Task sth()
{
Thread.CurrentThread.Name = "My name";
Console.WriteLine(Thread.CurrentThread.ManagedThreadId);
await Task.Delay(1);
Console.WriteLine(Thread.CurrentThread.ManagedThreadId);
Console.WriteLine(Thread.CurrentThread.Name ?? "No name");
}
A possible output (the thread IDs will vary):
3
4
No name

WTL multithreading, multiple interfaces & libraries

I have a main thread that displays an interface. Within another thread, created from the main thread before the main interface is shown, I create two other windows sequentially.
I create the first window:
CWarningDlg warnDlg;
warnDlg.Create(NULL);
warnDlg.ShowWindow(SW_SHOW);
warnDlg.BringWindowToTop();
CMessageLoop _Loop ;
if(_MyAppModule.AddMessageLoop(&_Loop))
{
nRet = _Loop.Run();
_MyAppModule.RemoveMessageLoop();
}
warnDlg.DestroyWindow();
if (nRet == SOME_VALUE)
{
doSomethingElse();
}
doSomethingElse() contains:
CActionDlg actDlg;
actDlg.Create(NULL);
actDlg.ShowWindow(SW_SHOW);
actDlg.BringWindowToTop();
CMessageLoop _Loop ;
if(_MyAppModule.AddMessageLoop(&_Loop))
{
CreateAnObject(); //this also launches an object Specific Worker Thread
nRet = _Loop.Run();
_MyAppModule.RemoveMessageLoop();
}
The function CreateAnObject calls some functions from a 'ComplexObject.DLL' that create a complex object, which stores the thread ID of the thread that called the creation function. It obtains the ID with ::GetCurrentThreadId(). While this complex object is being created, GetCurrentThreadId() returns the ID of the SECOND THREAD, which is GOOD.
Now, in my CActionDlg I receive notifications from this object via ::SendMessage(). SendMessage is called from within a worker thread that is specific to the complex object just created.
When I receive those notifications I need to access some of the complex object's values. For that I call some other functions from 'ComplexObject.DLL', which verify, using ::GetCurrentThreadId(), that the ID of the calling thread is the same as the ID of the thread that created the complex object. That verification fails for me, because the functions get called with the thread ID of the MAIN THREAD, the one that hosts the main interface GUI.
Why is that? I cannot understand it! (I hope I have explained myself clearly.)
The problem you seem to have, from your description at least, is that whatever external API you are using via CreateAnObject restricts its further use to the creation thread. Taking that as given, you are limited to making calls from the creation thread only. Whenever code running on other threads, including the thread hosting CWarningDlg, needs to talk to this API, you need to transfer the call to the CActionDlg thread and proceed from there.
Synchronization can be done with SendMessage, as you already do, or with something safer such as PostMessage combined with an event/message completion notification.

Problem with Boost Asio asynchronous connection using C++ in Windows

Using MS Visual Studio 2008 C++ on 32-bit Windows (XP), I am trying to build a POP3 client managed from a modeless dialog box.
The first step is to create a persistent object, say pop3, with all the Boost.Asio machinery for asynchronous connections, in the WM_INITDIALOG handler of the dialog box procedure. Something like:
case WM_INITDIALOG:
return (iniPop3Dlg (hDlg, lParam));
Here we assume that iniPop3Dlg() creates the pop3 heap object, pointed to by pop3p. It then connects to the remote server and initiates a session with the client's ID and password (USER and PASS commands). At this point we assume that the server is in the TRANSACTION state.
Then, in response to some user input, the dialog box procedure calls the appropriate function. Say:
case IDS_TOTAL: // get how many emails in the server
total (pop3p);
return FALSE;
case IDS_DETAIL: // get date, sender and subject for each email in the server
detail (pop3p);
return FALSE;
Note that total() uses POP3's STAT command to get the number of emails on the server, while detail() uses two commands consecutively: first STAT to get the total, and then a loop with the RETR command to retrieve the content of each message.
As an aside: detail() and total() share the same subroutines (the STAT handling routine), and when finished, both leave the session as is. That is, the connection is not closed; the socket remains open and the server stays in the TRANSACTION state.
When either option is selected for the first time, things run as expected and produce the desired results. But on the second attempt, the connection hangs.
A closer inspection shows that the first time the statement
socket_.get_io_service().run();
is used, it never returns.
Note that all asynchronous write and read routines use the same io_service, and each routine calls socket_.get_io_service().reset() prior to any run().
Note also that all read/write operations use the same timer, which is reset to a zero wait after each operation completes:
dTimer_.expires_from_now (boost::posix_time::seconds(0));
I suspect that the problem is in the io_service or in the timer, and in the fact that subsequent executions occur in a different invocation of the routine.
As a first approach to my problem, I hope that someone can shed some light on it, before I give a more detailed exposition of the (very few and simple) routines involved.
Have you looked at the Asio examples and studied them? There are several asynchronous examples that should help you understand the basic control flow. Pay particular attention to the main event loop started by invoking io_service::run; it's important to understand that control is not expected to return to the caller until the io_service has no remaining work to do.
