The Boost docs say that cancelled async connect, send, and receive operations finish immediately, and that the handlers for cancelled operations will be passed the boost::asio::error::operation_aborted error.
I would like to find out if the cancelled handler gets to run (and see the operation_aborted error) before other (non-cancelled, and newly scheduled) completion handlers run.
Here is the timeline that concerns me:
acceptHandler and readHandler are running on the same event loop and the same thread.
time t0 - readHandler is running on oldConnectionSocket
time t1 - acceptHandler runs
time t2 - acceptHandler calls oldConnectionSocket.cancel
time t3 - acceptHandler closes oldConnectionSocket
time t4 - acceptHandler calls newConnectionSocket.async_read(...readHandler...)
time t5 - readHandler is called (from which context?)
Is it possible at t5 for readHandler to be called in the newConnectionSocket context before it is called with the operation_aborted error in the oldConnectionSocket context?
Cancelled operations will immediately post their handlers for deferred invocation. However, the io_service makes no guarantees on the invocation order of handlers. Thus, the io_service could choose to invoke the ReadHandlers in either order. Currently, only a strand specifies guaranteed ordering under certain conditions.
Within a completion handler, if the goal is to know which I/O object was associated with the operation, then consider constructing the completion handler so that it has an explicit handle to the I/O object. This is often accomplished using any of the following:
a custom functor
std::bind() or boost::bind()
a C++11 lambda
One common idiom is to have the I/O object be managed by a single class that inherits from boost::enable_shared_from_this<>. When a class inherits from boost::enable_shared_from_this, it provides a shared_from_this() member function that returns a valid shared_ptr instance to this. A copy of the shared_ptr is passed to completion handlers, for example via the capture list of a lambda or as the instance handle passed to boost::bind(). This allows the handlers to know the I/O object on which the operation was performed, and extends the lifetime of the I/O object to be at least as long as that of the handler. See the Boost.Asio asynchronous TCP daytime server tutorial for an example using this approach.
class tcp_connection
  : public boost::enable_shared_from_this<tcp_connection>
{
public:
  // ...

  void start()
  {
    boost::asio::async_write(socket_, ...,
        boost::bind(&tcp_connection::handle_write, shared_from_this(),
            boost::asio::placeholders::error,
            boost::asio::placeholders::bytes_transferred));
  }

  void handle_write(
      const boost::system::error_code& error,
      std::size_t bytes_transferred)
  {
    // The I/O object is this->socket_.
  }

  tcp::socket socket_;
};
On the other hand, if the goal is to determine if one handler has executed before the other, then:
the application will need to explicitly manage the state (a sketch follows this list)
trying to manage multiple dependent call chains may be introducing unnecessary complexity and often indicates a need to reexamine the design
custom handlers could be used to prioritize the order in which handlers are executed. The Boost.Asio Invocation example uses custom handlers that are added to a priority queue, which are then executed at a later point in time.
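As a minimal sketch of such explicit state management (my own illustration, not from the original answer), a per-connection generation counter lets a completion handler detect that it belongs to a superseded connection. It assumes all handlers run on a single thread or strand, so the counter itself needs no synchronization:

#include <boost/asio.hpp>
#include <array>
#include <cstddef>
#include <utility>

class connection_owner
{
public:
    explicit connection_owner(boost::asio::io_service& ios) : socket_(ios) {}

    void start_read()
    {
        std::size_t expected = generation_;
        socket_.async_read_some(boost::asio::buffer(buffer_),
            [this, expected](boost::system::error_code ec, std::size_t /*n*/)
            {
                if (expected != generation_)
                    return; // stale handler from a superseded connection
                // ... normal handling, including ec == operation_aborted ...
            });
    }

    void replace_connection(boost::asio::ip::tcp::socket new_socket)
    {
        ++generation_;              // invalidate handlers still in flight
        boost::system::error_code ignored;
        socket_.close(ignored);     // also cancels outstanding operations
        socket_ = std::move(new_socket);
        start_read();
    }

private:
    boost::asio::ip::tcp::socket socket_;
    std::array<char, 1024> buffer_;
    std::size_t generation_ = 0;    // bumped whenever the socket is replaced
};

With this scheme it no longer matters in which order the io_service chooses to invoke the stale and the fresh handlers.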
In a DriverKit extension, I would like to block a call from a user client until a specific hardware interrupt fires. Since there are no semaphores available (Does the DriverKit SDK support semaphores?), I've reached for a very basic spinlock using an _Atomic(bool) member and busy waiting:
struct IVars
{
    volatile _Atomic(bool) InterruptOccurred = false;
};

// In the user client method handler
{
    // Clear the flag
    atomic_store(&ivars->InterruptOccurred, false);

    // Set up the interrupt on the device
    ...

    // Wait for the interrupt
    while (!atomic_load(&ivars->InterruptOccurred))
    {
        IOSleep(10);
    }
}

// In the interrupt handler
{
    bool expected = false;
    if (atomic_compare_exchange_strong(&ivars->InterruptOccurred, &expected, true))
    {
        return;
    }

    // Proceed with normal handling if the user client method is not waiting
}
The user client method is called infrequently and the interrupt is guaranteed to fire within 100ms, so in principle busy waiting should be acceptable, but I am not very happy with the solution. I haven't worked with spinlocks before and they make me feel rather uneasy.
I would like to avoid taking an IOLock in the interrupt handler. Is there any other synchronization primitive in DriverKit I could reach for? I guess a cleaner way to handle this would be for the user client method to accept a callback that fires on the interrupt, but that would still require synchronization with the interrupt handler and would complicate the client application code.
Preliminaries
I would like to avoid taking an IOLock in the interrupt handler.
I assume you're aware that, this being DriverKit, this isn't running in the context of a primary interrupt controller, and that you're already behind a layer of Mach messaging, a kernel/user context switch, and IODispatchQueue serialisation?
Possible solutions:
Since there are no semaphores available[…]
OSAction
The OSAction class contains a set of methods for sleeping in a thread until the action is invoked. (WillWait/Wait/EndWait) This might be a feasible way of implementing what you're trying to do. As usual, the documentation is in the header/iig file but hasn't made it into the web-based API docs.
IODispatchQueue
As of DriverKit 21 (macOS 12), you also get Apple's simpler Sleep/Wakeup event system baked into IODispatchQueue, which you might be familiar with from the kernel. (It is also similar to pthreads condition variables.) Note you need to create the queue with the kIODispatchQueueReentrant option in this case.
From DriverKit 22 (macOS 13/iPadOS 16) on, there's also a version of the sleep that takes a deadline, SleepWithDeadline.
Async callbacks
I guess a cleaner way to handle this would be for the user client method to accept a callback that fires on the interrupt, but that would still require synchronization with the interrupt handler and would complicate the client application code.
If you're happy calling the async callback in the app on every interrupt, there's not really any synchronisation needed; you can just invoke the same OSAction repeatedly. Even if you want to invoke the async call only on the "next" interrupt, an atomic compare-and-swap should be sufficient for the interrupt handler to claim the OSAction* pointer.
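A minimal sketch of that claim-by-CAS idea (my own illustration; FireAction() is a hypothetical helper, not DriverKit API, and it assumes std::atomic is usable in your target environment; otherwise the C11 _Atomic operations from the question work the same way):

#include <atomic>

class OSAction; // provided by the DriverKit SDK (OSAction.iig)

// One pending action, armed by the user client and claimed by the interrupt
// handler. Assumes the action stays retained while parked here.
std::atomic<OSAction*> pendingAction { nullptr };

// User client external method: arm the one-shot callback.
void ArmInterruptCallback(OSAction* action)
{
    pendingAction.store(action, std::memory_order_release);
}

// Interrupt handler: atomically claim the pointer; only one claimant wins.
void HandleInterrupt()
{
    if (OSAction* action = pendingAction.exchange(nullptr, std::memory_order_acq_rel))
    {
        FireAction(action); // hypothetical: complete the async call on this action
        return;
    }
    // Proceed with normal handling if no callback was armed
}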
Important note:
With all of these potential solutions except IODispatchQueue::Sleep and the async callback: bear in mind that sleeping in the context of a user client external method will block the dispatch queue and thus any other calls to external methods in that user client will fail to make progress. (As well as any other methods scheduled to that queue.)
Reference:
websocket_client_async_ssl.cpp
strands
Question 1> Here is my understanding:
Given a few async operations bound with the same strand, the strand
will guarantee that all associated async operations will be executed
as a strictly sequential invocation.
Does this mean that all the above async operations will be executed by the same thread?
Or does it just say that at any time, only one async operation will be executed by any available thread?
Question 2> The boost::asio::make_strand function creates a strand object for an executor or execution context.
session(net::io_context& ioc, ssl::context& ctx)
: resolver_(net::make_strand(ioc))
, ws_(net::make_strand(ioc), ctx)
Here, resolver_ and ws_ each have their own strand, but I have trouble understanding how each strand applies to which async operations.
For example, in the following async operations and handlers, which functions (i.e., initiating functions or handlers) are bound to the same strand and will not run simultaneously?
async ================================= handler

run
  => resolver_.async_resolve                    --> session::on_resolve
  => beast::get_lowest_layer(ws_).async_connect --> session::on_connect
  => ws_.next_layer().async_handshake           --> session::on_ssl_handshake
  => ws_.async_handshake                        --> session::on_handshake
Question 3> How can we retrieve the strand from an executor?
Is there any difference between these two?
get_associated_executor
get_executor
io_context::get_executor: Obtains the executor associated with the io_context.
get_associated_executor: Helper function to obtain an object's associated executor.
Question 4> Is it correct that I use the following method to bind the deadline_timer to the io_context to prevent a race condition?
All other parts of the code are the same as in the websocket_client_async_ssl.cpp example.
session(net::io_context& ioc, ssl::context& ctx)
    : resolver_(net::make_strand(ioc))
    , ws_(net::make_strand(ioc), ctx)
    , d_timer_(ws_.get_executor())
{ }

void on_heartbeat_write(beast::error_code ec, std::size_t bytes_transferred)
{
    d_timer_.expires_from_now(boost::posix_time::seconds(5));
    d_timer_.async_wait(beast::bind_front_handler(&session::on_heartbeat, shared_from_this()));
}

void on_heartbeat(const boost::system::error_code& ec)
{
    ws_.async_write(net::buffer(text_ping_), beast::bind_front_handler(&session::on_heartbeat_write, shared_from_this()));
}

void on_handshake(beast::error_code ec)
{
    d_timer_.expires_from_now(boost::posix_time::seconds(5));
    d_timer_.async_wait(beast::bind_front_handler(&session::on_heartbeat, shared_from_this()));

    ws_.async_write(net::buffer(text_), beast::bind_front_handler(&session::on_write, shared_from_this()));
}
Note:
I used d_timer_(ws_.get_executor()) to initialize the deadline_timer, hoping it would make sure they don't write or read the websocket at the same time.
Is this the right way to do it?
Question 1
Does this mean that all the above async operations will be executed by the same thread? Or does it just say that at any time, only one async operation will be executed by any available thread?
The latter.
Question 2
Here, resolver_ and ws_ each have their own strand,
Let me interject that I think that's unnecessarily confusing in the example. They could (should, conceptually) have used the same strand, but I guess they didn't want to go through the trouble of storing a strand. I'd probably have written:
explicit session(net::io_context& ioc, ssl::context& ctx)
: resolver_(net::make_strand(ioc))
, ws_(resolver_.get_executor(), ctx) {}
The initiation functions are called wherever you decide. The completion handlers are dispatched on the executor that belongs to the IO object you call the operation on, unless the completion handler is bound to a different executor (e.g. using bind_executor; see get_associated_executor). In the vast majority of circumstances in modern Asio, you will not bind handlers, instead "binding IO objects" to the proper executors. This means less typing, and is much harder to forget.
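For completeness, a hedged sketch of the handler-binding alternative mentioned above (buffer_ as a beast::flat_buffer and strand_ are illustrative members, not taken from the example):

// Illustrative only: explicitly binding a completion handler to a strand.
// In the example's style you would instead construct ws_ on the strand.
ws_.async_read(buffer_,
    net::bind_executor(strand_,
        beast::bind_front_handler(&session::on_read, shared_from_this())));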
So, in effect, all the async initiations in the chain except for the one in run() are on a strand, because the IO objects are tied to strand executors.
You have to keep in mind to dispatch on a strand when some outside user calls into your classes (e.g. often to stop). It is a good idea therefore to develop a convention. I'd personally make all the "unsafe" methods and members private:, so I will often have pairs like:
public:
    void stop() {
        dispatch(strand_, [self = shared_from_this()] { self->do_stop(); });
    }

private:
    void do_stop() {
        beast::get_lowest_layer(ws_).cancel();
    }
Side Note:
In this particular example, there is only one (main) thread running/polling the io service, so the whole point is moot. But as I explained recently (Does mulithreaded http processing with boost asio require strands?), the examples are there to show some common patterns that allow one to do "real life" work as well.
Bonus: Handler Tracking
Let's use BOOST_ASIO_ENABLE_HANDLER_TRACKING to get some insight.¹ Running a sample session shows something like
If you squint a little, you can see that all the strand executors are the same:
0*1|resolver#0x559785a03b68.async_resolve
1*2|strand_executor#0x559785a02c50.execute
2*3|socket#0x559785a05770.async_connect
3*4|strand_executor#0x559785a02c50.execute
4*5|socket#0x559785a05770.async_send
5*6|strand_executor#0x559785a02c50.execute
6*7|socket#0x559785a05770.async_receive
7*8|strand_executor#0x559785a02c50.execute
8*9|socket#0x559785a05770.async_send
9*10|strand_executor#0x559785a02c50.execute
10*11|socket#0x559785a05770.async_receive
11*12|strand_executor#0x559785a02c50.execute
12*13|deadline_timer#0x559785a05958.async_wait
12*14|socket#0x559785a05770.async_send
14*15|strand_executor#0x559785a02c50.execute
15*16|socket#0x559785a05770.async_receive
16*17|strand_executor#0x559785a02c50.execute
17*18|socket#0x559785a05770.async_send
13*19|strand_executor#0x559785a02c50.execute
18*20|strand_executor#0x559785a02c50.execute
20*21|socket#0x559785a05770.async_receive
21*22|strand_executor#0x559785a02c50.execute
22*23|deadline_timer#0x559785a05958.async_wait
22*24|socket#0x559785a05770.async_send
24*25|strand_executor#0x559785a02c50.execute
25*26|socket#0x559785a05770.async_receive
26*27|strand_executor#0x559785a02c50.execute
23*28|strand_executor#0x559785a02c50.execute
Question 3
How can we retrieve the strand from an executor?
You don't[*]. However make_strand(s) returns an equivalent strand if s is already a strand.
[*] By default, Asio's IO objects use the type-erased executor (asio::executor or asio::any_io_executor, depending on version). So technically you could ask it about its target_type() and, after comparing the type id to some expected types, use something like target<net::strand<net::io_context::executor_type>>() to access the original, but there's really no use. You don't want to be inspecting the implementation details. Just honour the handlers (by dispatching them on their associated executors, like Asio does).
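Purely as a hedged illustration of the inspection just described (and again, not recommended in real code):

// Illustrative only: peeking inside the type-erased executor.
net::any_io_executor ex = ws_.get_executor();
if (auto* strand = ex.target<net::strand<net::io_context::executor_type>>())
{
    // strand now points at the underlying strand object
}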
Is there any difference between these two? get_associated_executor get_executor
get_executor gets an owned executor from an IO object. It is a member function.
asio::get_associated_executor gets associated executors from handler objects. You will observe that get_associated_executor(ws_) doesn't compile (although some IO objects may satisfy the criteria to allow it to work).
Question 4
Is it correct that I use the following method to bind deadline_timer to io_context
You will notice that you did the same as I already mentioned above to tie the timer IO object to the same strand executor. So, kudos.
to prevent a race condition?
You don't prevent race conditions here. You prevent data races. That is because in on_heartbeat you access the ws_ object which is an instance of a class that is NOT threadsafe. In effect, you're sharing access to non-threadsafe resources, and you need to serialize access, hence you want to be on the strand that all other accesses are also on.
Note: [...] hoping it would make sure they don't write or read the websocket at the same time. Is this the right way to do it?
Yes this is a good start, but it is not enough.
Firstly, you can write or read at the same time, as long as
write operations don't overlap
read operations don't overlap
accesses to the IO object are safely serialized.
In particular, your on_heartbeat might be safely serialized so you don't have a data race on calling the async_write initiation function. However, you need more checks to know whether a write operation is already (still) in progress. One way to achieve that is to have a queue with outgoing messages. If you have strict heartbeat requirements and high load, you might need a priority-queue here.
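A hedged sketch of such an outgoing-message queue (my own illustration; outbox_ is an assumed std::deque<std::string> member of the example's session class, and everything below is expected to run on the session's strand):

// Serialize writes through an outbox so only one async_write is in flight.
void send(std::string msg)
{
    net::dispatch(ws_.get_executor(),
        [self = shared_from_this(), msg = std::move(msg)]() mutable
        {
            self->outbox_.push_back(std::move(msg));
            if (self->outbox_.size() == 1) // no write currently in flight
                self->do_write();
        });
}

void do_write()
{
    ws_.async_write(net::buffer(outbox_.front()),
        beast::bind_front_handler(&session::on_queued_write, shared_from_this()));
}

void on_queued_write(beast::error_code ec, std::size_t)
{
    if (ec)
        return;
    outbox_.pop_front(); // drop the message that just completed
    if (!outbox_.empty())
        do_write();      // start the next queued write
}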
¹ I simplified the example by replacing the stream type with the Asio native ssl::stream<tcp::socket>. This means we don't get all the internal timers that deal with tcp_stream expirations. See https://pastebin.ubuntu.com/p/sPRYh6Xbwz/
Simple Question... is a global BOOL thread safe for me to use for thread synchronization?
What other data types are actually safe, e.g. long longs..?
E.g.:
I have a task that runs, and I only want one instance of it running at a time.
BOOL isRunning;
unsigned long long progress;

if (!isRunning) {
    dispatch_async(secondaryTask, ^{
        [self doWork];
    });
}

- (void)doWork
{
    isRunning = TRUE;
    // ... do a long op ...
    isRunning = FALSE;
}
For the primitive types, exactly the same rules as ordinary C apply. So there's no guarantee of thread safety on any of them.
Use OSAtomic, NSConditionLock, the NSLocking protocol, serial dispatch queues, individual runloops, memory fences, spin locks, etc, to achieve thread safety.
For the trivial code given, which I accept is probably just for exposition, you'd most likely provide a completion handler block, which the asynchronous block would dispatch upon completion. If it's a serial queue, just push the task to it. Consider a dispatch group if you want synchronisation points within concurrent task groups.
Does the post() method of boost::asio::io_service use Boost.Coroutine to queue the short tasks performed in the handlers? This could save the resources spent on synchronization when using threads, but would make it impossible to move tasks to another thread. Or does that make no sense?
As best as I could tell, Boost.Asio does not use coroutines.
From an implementation point of view, I would imagine using a coroutine, such as those provided by Boost.Coroutine, would introduce overhead when invoking posted handlers. At the point at which the event loop knows which handlers can be invoked, it could simply invoke the handler rather than having to hoist the handler into a trampoline function so that it can be transparently invoked within the context of a coroutine.
Boost.Asio does not know the actual or expected runtime duration of handlers, so it must perform the same internal synchronization regardless of the handlers. When the io_service is only being processed by a single thread, then synchronization overhead can be mitigated by providing a concurrency_hint during construction. Other areas, such as the reactor, may still need to perform synchronization.
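For instance, a hedged sketch of the concurrency hint mentioned above (the constructor argument is real; the surrounding code is just illustrative):

#include <boost/asio.hpp>

int main()
{
    // Hint that only one thread will run this io_service, allowing it to
    // elide some of its internal locking.
    boost::asio::io_service io_service(1);

    io_service.post([] { /* handler */ });
    io_service.run();
}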
In the end, rather than imposing context of execution, Boost.Asio provides a robust toolkit and empowers the users to choose the best option for themselves. The current Boost.Asio candidate for Boost 1.54 enhances this experience through its first-class support for:
Stackful Coroutines based on Boost.Coroutine. Here is an example where do_echo executes within the context of my_strand as a coroutine. Each async operation yields control back to the calling thread after initiating the asynchronous operation, and when the completion handler is invoked, control returns to the point immediately following the previous yield.
boost::asio::spawn(my_strand, do_echo);

// ...

void do_echo(boost::asio::yield_context yield)
{
  try
  {
    char data[128];
    for (;;)
    {
      std::size_t length =
        my_socket.async_read_some(
          boost::asio::buffer(data), yield);

      boost::asio::async_write(my_socket,
        boost::asio::buffer(data, length), yield);
    }
  }
  catch (std::exception& e)
  {
    // ...
  }
}
Boost.Asio provides a complete echo_service example that uses Stackful Coroutines.
Stackless Coroutines have been promoted to the documented public API from the HTTP Server 4 example. These are implemented as a variant of Duff's Device, but the details are cleanly hidden through the use of the pseudo-keywords reenter, yield, and fork. The following is roughly the equivalent of the above Stackful Coroutine example:
struct session : boost::asio::coroutine
{
  tcp::socket my_socket_;
  char data_[128];

  // ...

  void operator()(boost::system::error_code ec = boost::system::error_code(),
                  std::size_t length = 0)
  {
    if (!ec) reenter (this)
    {
      for (;;)
      {
        yield my_socket_.async_read_some(
            boost::asio::buffer(data_), *this);

        yield boost::asio::async_write(my_socket_,
            boost::asio::buffer(data_, length), *this);
      }
    }
  }
};
See the boost::asio::coroutine documentation for more details.
While I do not know if there are performance benefits to constructing asynchronous call chains with coroutines, I feel as though their greatest contribution is maintainability and readability. I have found that being able to read and write asynchronous programs in a synchronous manner helps reduce the complexities introduced by inverted flow of control, as it is now possible to remove the spatial separation between operation initiation and completion.
I post objects into an asio::io_service, and asio::io_service::run() runs in several threads.
I need the ability to wait for the completion of any object in the queue.
For example:
template <typename T>
struct handler {
    void operator()() {
        // ...
    }

    T get() const { /* ... */ }
};

asio::io_service ios;
ios.post(handler<int>()); // handler is a class template, so a type argument is needed
How can I refer to the object in the queue?
How can I pause the main program loop until handler::operator() has executed?
Thanks.
Here's what I know so far:
1. Several handlers are executing on several threads.
2. Handlers run independently of each other. No synchronization between threads is needed for data/race conditions.
3. get() can be called on any one handler. When called, the handler should stop calculating and let another thread call handler::get().
This really seems more like a multi-threading/concurrency question than a boost::asio question. At the moment I do not see a need to use an io_service. It seems like several threads could just be started without an io_service, with some synchronization used between the threads.
The thread calling Handler::get() needs to wait until the io_service thread running operator() completes its calculation before it can return.
Consider using a condition variable. The handler::get() method can wait until the condition is met (i.e. operator() finishes its calculation). The io_service thread that runs operator() would notify the main thread through the condition variable.
An example of one thread notifying another thread via a condition variable is here.
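Under those assumptions, here is a minimal sketch (my own illustration, not the linked example; compute() is a placeholder for the real calculation). Note that io_service::post() copies its argument, and the mutex makes the handler non-copyable anyway, so the handler is kept in a shared_ptr and invoked through a small lambda:

#include <boost/asio.hpp>
#include <condition_variable>
#include <memory>
#include <mutex>
#include <thread>

template <typename T>
struct handler {
    void operator()() {
        T value = compute();                 // the long-running calculation
        {
            std::lock_guard<std::mutex> lock(mutex_);
            result_ = value;
            done_ = true;
        }
        cond_.notify_all();                  // wake any thread blocked in get()
    }

    T get() {
        std::unique_lock<std::mutex> lock(mutex_);
        cond_.wait(lock, [this] { return done_; });
        return result_;
    }

private:
    T compute() { return T{}; }              // placeholder for the real work

    std::mutex mutex_;
    std::condition_variable cond_;
    bool done_ = false;
    T result_{};
};

int main() {
    boost::asio::io_service ios;
    auto h = std::make_shared<handler<int>>();

    ios.post([h] { (*h)(); });               // post by shared_ptr

    std::thread worker([&ios] { ios.run(); });
    int value = h->get();                    // pauses until operator() is done
    worker.join();
    (void)value;
}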