When to use loop.add_signal_handler? - python-asyncio

I noticed the asyncio library has a loop.add_signal_handler(signum, callback, *args) method.
So far I have just been catching Unix signals in the main file using the signal module alongside my asynchronous code, like this:
signal.signal(signal.SIGHUP, callback)
async def main():
...
Is that an oversight on my part?

The add_signal_handler documentation is sparse [1], but looking at the source, it appears that the main added value compared to signal.signal is that add_signal_handler will ensure that the signal wakes up the event loop and allow the loop to invoke the signal handler along with other queued callbacks and runnable coroutines.
So far I have just been catching unix signals in the main file using the signals module [...] Is that an oversight on my part?
That depends on what the signal handler is doing. Printing a message or updating a global is fine, but if it is invoking anything in any way related to asyncio, it's most likely an oversight. A signal can be delivered at (almost) any time, including during execution of an asyncio callback, a coroutine, or even during asyncio's own bookkeeping.
For example, the implementation of asyncio.Queue freely assumes that access to the queue is single-threaded and non-reentrant. A signal handler adding something to a queue using q.put_nowait() would be disastrous if it interrupted an ongoing invocation of q.put_nowait() on the same queue. As with typical race conditions in multi-threaded code, an interruption in the middle of the assignment to _unfinished_tasks might well cause it to be incremented only once instead of twice (once for each put_nowait).
Asyncio code is designed for cooperative multi-tasking, where the points at which a function may suspend are clearly denoted by the await and related keywords. The add_signal_handler function ensures that your signal handler gets invoked at such a point, so you are free to implement it as you would implement any other asyncio callback.
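A minimal sketch of the safe pattern, tying the two points above together (the "reload" payload and the self-sent SIGHUP are just for illustration; in practice the signal comes from outside):

```python
import asyncio
import os
import signal

results = []

def handle_hup(queue):
    # Runs as an ordinary asyncio callback at a safe point in the event
    # loop, so touching asyncio objects such as a Queue is fine here.
    queue.put_nowait("reload")

async def main():
    loop = asyncio.get_running_loop()
    queue = asyncio.Queue()
    # Unlike signal.signal(), this defers the callback into the event
    # loop instead of running it at an arbitrary interrupt point.
    loop.add_signal_handler(signal.SIGHUP, handle_hup, queue)
    os.kill(os.getpid(), signal.SIGHUP)  # stand-in for an external HUP
    results.append(await queue.get())
    loop.remove_signal_handler(signal.SIGHUP)

asyncio.run(main())
print(results)  # ['reload']
```

Because the handler runs inside the loop, q.put_nowait() can never interleave with another put_nowait() on the same queue.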
[1] When this answer was originally written, the add_signal_handler documentation was briefer than it is today and didn't cover the difference from signal.signal at all. This question prompted it being expanded in the meantime.

Related

pthread_kill to a GCD-managed thread

I am attempting to send a signal to a specific thread with pthread_kill. I use pthread_from_mach_thread_np() to get a handle and then use pthread_kill to send the signal.
This worked well in my other testing, but now I see that when attempting to signal a thread internally created by GCD, I get a return code of 45 from pthread_kill.
GCD API that spawned that thread:
dispatch_async(dispatch_get_global_queue(QOS_CLASS_USER_INITIATED, 0), ^{ ... });
Any reason this is happening?
---
To add some further information: I am not attempting to kill threads. pthread_kill() is the standard POSIX API for sending signals to threads. If a signal handler is installed, the thread's context is switched to the handler via a trampoline.
While what I attempt to achieve using my signal handler can be achieved in better ways, this is not in question here. Even if for purely academic reasons, I would like to understand what is going on here internally.
The pthread_kill() API is specifically disallowed on workqueue threads (the worker threads underlying GCD) and returns ENOTSUP for such threads.
This is primarily intended to prevent execution of arbitrary signal handlers in the context of code that may not expect it (since these threads are a shared resource used by many independent subsystems in a process), as well as to abstract away that execution context so that the system has the freedom to change it in the future.
You can see the details of how this is achieved in the implementation.
That is a very bad idea. You don't own GCD's thread pool, and you absolutely must not kill its threads out from under it.
The answer to your question is DO NOT DO THAT UNDER ANY CIRCUMSTANCES.

How to call a function in context of another thread?

I remember there was a way to do this, something similar to Unix signals, but not as widely used. I can't remember the term, though. No events/mutexes are used: the thread is just interrupted at a random place, the function is called, and when it returns, the thread continues.
Windows has Asynchronous Procedure Calls, which can call a function in the context of a specific thread. APCs do not just interrupt a thread at a random place (that would be dangerous: the thread could be in the middle of writing to a file, obtaining a lock, or in kernel mode). Instead, an APC is dispatched when the target thread enters an alertable wait by calling a specific function (see the APC documentation).
If the reason that you need to call code in a specific thread is because you are interacting with the user interface, it would be more direct to send or post a window message to the window handle that you want to update. Window messages are always processed in the thread that created the window.
You can look at RtlRemoteCall, though it is an undocumented routine. Windows APCs are semantically similar to Unix signals; however, an APC requires the target thread to be in an alertable state in order to be delivered, and it is not guaranteed that this condition will ever be met.

What happens when an async_write() operation never ends and there is a strand involved?

I know that the next async_write() should only be issued once the previous one has finished (with or without errors, but finished).
I would like to know what happens if, while making async_write() calls, one of them takes a long time for some reason, or even never ends (I assume there are no timeouts here, unlike with synchronous operations). When will such an operation be considered failed? When is an operation that never ends finally removed internally by the OS?
Or maybe there are timeouts involved and my assumptions are wrong?
I mean, the write operation is handed to the OS; could it possibly block indefinitely?
In that case the handler is never called and the next async_write() calls are never made.
NOTE: I am assuming that we are calling run() from several threads, but that the write operations must be sent in order, so I am also assuming that the write handlers are wrapped with a strand.
Thank you for your time.
There are no explicit timeouts for asynchronous operations, but they can be cancelled through the IO object's cancel() member function. An operation is considered to have failed only when the underlying OS call itself fails in a manner where a retry cannot reasonably occur. For example, if the write fails with:
EINTR, then the write will immediately be reattempted.
EWOULDBLOCK, EAGAIN, or ERROR_RETRY, then Boost.Asio will push the operation back into the job queue. This could occur if the write buffer was full, so pushing the operation back into the queue defers its reattempt, allowing other operations to be attempted.
Other errors will cause the operation to fail.
There should not be an indefinite block in the system call. Boost.Asio sets the underlying IO objects to non-blocking, and emulates blocking behavior for synchronous writes by waiting on the associated file descriptor if a write fails with EWOULDBLOCK, EAGAIN, or ERROR_RETRY.
A strand is not affected by long-running asynchronous operations. Strands provide strict sequential invocation of handlers, not of the operations themselves. In the case of composed operations, such as boost::asio::async_write, the intermediate handlers will also be invoked through the same strand as the final handler. Overall, this behavior helps provide thread safety, as:
All async_write_some operations initiated from intermediate handlers are within the strand.
The operation itself is not within the strand. This allows other handlers to run while the actual write is occurring.
The user handler will be invoked within the strand.
This answer may provide some more insight into composed operations and strands.

What's the proper way to use coroutines for event handling?

I'm trying to figure out how to handle events using coroutines (in Lua). I see that a common way of doing it seems to be creating wrapper functions that yield the current coroutine and then resume it when the thing you're waiting for has occurred. That seems like a nice solution, but what about these problems:
How do you wait for multiple events at the same time, and branch depending on which one comes first? Or should the program be redesigned to avoid such situations?
How to cancel the waiting after a certain period? The event loop can have timeout parameters in its socket send/receive wrappers, but what about custom events?
How do you trigger the coroutine to change its state from outside? For example, I would want a function that when called, would cause the coroutine to jump to a different step, or start waiting for a different event.
EDIT:
Currently I have a system where I register a coroutine with an event, and the coroutine gets resumed with the event name and info as parameters every time the event occurs. With this system, 1 and 2 are not issues, and 3 can be solved by having the coroutine expect a special event name that makes it jump to the different step, and resuming it with that name as an argument. Also, custom objects can have methods to register event handlers the same way.
I just wonder if this is considered the right way to use coroutines for event handling. For example, if I have a read event and a timer event (as a timeout for the read), and the read event happens first, I have to manually cancel the timer. It just doesn't seem to fit the sequential nature of handling events with coroutines.
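For what it's worth, the read-versus-timeout race the questioner describes looks the same in other coroutine-based event systems: wait on both, then cancel the loser by hand. A sketch in Python's asyncio (the fake_read coroutine and its delays are made up for illustration):

```python
import asyncio

async def fake_read():
    # Stand-in for a socket read that completes quickly.
    await asyncio.sleep(0.01)
    return "read"

async def main():
    read_task = asyncio.create_task(fake_read())
    # A timer acting as the read's timeout.
    timer_task = asyncio.create_task(asyncio.sleep(1.0, result="timeout"))
    done, pending = await asyncio.wait(
        {read_task, timer_task}, return_when=asyncio.FIRST_COMPLETED)
    for task in pending:
        task.cancel()  # the losing waiter must be cancelled manually
    return done.pop().result()

winner = asyncio.run(main())
print(winner)  # read
```

So the manual cancellation isn't a sign the design is wrong; it is simply what "wait for whichever comes first" costs in any cooperative scheduler.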
How do you wait for multiple events at the same time, and branch depending on which one comes first?
If you need to use coroutines for this, rather than just a Lua function that you register (for example, if you have a function that does stuff, waits for an event, then does more stuff), then this is pretty simple. coroutine.yield will return all of the values passed to coroutine.resume when the coroutine is resumed.
So just pass the event, and let the script decide for itself if that's the one it's waiting for or not. Indeed, you could build a simple function to do this:
function WaitForEvents(...)
    local events = {...}
    assert(#events ~= 0, "You must pass at least one parameter")
    repeat
        -- Registers the coroutine with the system, so that it will be
        -- resumed when an event is fired.
        RegisterForAnyEvent(coroutine.running())
        local event = coroutine.yield()
        for _, testEvt in ipairs(events) do
            if event == testEvt then
                return event
            end
        end
    until false
end
This function will continue to yield until one of the events it is given has been fired. The loop assumes that RegisterForAnyEvent is temporary, registering the coroutine for just one event, so you need to re-register every time an event is fired.
How to cancel the waiting after a certain period?
Put a counter in the above loop, and leave after a certain period of time. I'll leave that as an exercise for the reader; it all depends on how your application measures time.
How do you trigger the coroutine to change its state from outside?
You cannot magic a Lua function into a different "state". You can only call functions and have them return results. So if you want to skip around within some process, you must write your Lua function system to be able to be skippable.
How you do that is up to you. You could have each set of non-waiting commands be a separate Lua function. Or you could just design your wait states to be able to skip ahead. Or whatever.

What is the impact of calling syscall(SYS_gettid) from a signal handler?

Can someone tell me what the adverse effects of calling syscall(SYS_gettid) from a signal handler could be?
I know it is not in the list of functions that are safe to call from a signal handler, but I want to know the reason behind that.
I'm pretty sure this has to do with signal handlers needing to be reentrant. Suppose a signal is sent, and your handler grabs the signal and starts processing. While it is processing, another signal may be sent by a concurrent program, and your handler grabs that signal too and starts processing it.
Depending on how the scheduling works out, it's possible that the same chunk of code, the signal handler, executes during its own execution. The problem is that it uses the same pointers and variables, so it can corrupt its own state, especially because gettid() returns the ID of the current thread. Which is the current thread in this case?
