Asyncio consumer from producer thread not working - python-asyncio

I have 2 main threads (consumer/producer) communicating via a SimpleQueue. I want the consumers to execute as soon as something is added to the queue. I also want to avoid asyncio.Queue since I want to keep consumer and producer decoupled and flexible for future changes.
I started looking into gevent, asyncio, etc. but it all feels very confusing to me.
from queue import SimpleQueue
from time import sleep
import threading
q = SimpleQueue()
q.put(1)
q.put(2)
q.put(3)
def serve_forever():
    import asyncio
    loop = asyncio.new_event_loop()
    asyncio.set_event_loop(loop)
    while True:
        task = q.get()
        print(f"Dequeued task {task}")
        async def run_task():
            print(f"Working on task {task}")
        loop.create_task(run_task()) # run task
thread = threading.Thread(target=serve_forever)
thread.daemon = True
thread.start()
sleep(1)
Output:
Dequeued task 1
Dequeued task 2
Dequeued task 3
Why doesn't run_task execute in my case?

Simply calling create_task doesn't actually run anything; you need to have a running asyncio event loop, which you get by calling something like asyncio.run or loop.run_until_complete.
You don't need to create an explicit loop as you're doing, either; asyncio provides a default event loop.
Lastly, asyncio tasks won't run if you're never calling await, because this is how the current task yields execution time to other tasks. So even if we fix the earlier problems, your tasks will never execute because execution will be stuck inside your while loop. We need to be able to await on the q.get() calls, which isn't directly possible when using queue.SimpleQueue.
We can solve the above -- while still using queue.SimpleQueue -- by using the run_in_executor method to run the non-async q.get calls (this runs the calls in a separate thread and allows us to asynchronously wait for the result). The following code works as I think you intended:
import asyncio
import threading
import queue
q = queue.SimpleQueue()
q.put(1)
q.put(2)
q.put(3)
async def run_task(task):
    print(f"Working on task {task}")
async def serve_forever():
    # We are inside a coroutine, so a loop is guaranteed to be running
    loop = asyncio.get_running_loop()
    while True:
        # q.get blocks, so run it in the default thread pool and await the result
        task = await loop.run_in_executor(None, q.get)
        print(f"Dequeued task {task}")
        asyncio.create_task(run_task(task)) # run task
def thread_main():
    asyncio.run(serve_forever())
thread = threading.Thread(target=thread_main)
thread.daemon = True
thread.start()
# just hang around waiting for thread to exit (which it never will)
thread.join()
Output:
Dequeued task 1
Working on task 1
Dequeued task 2
Working on task 2
Dequeued task 3
Working on task 3
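The answer's run_in_executor pattern runs forever, which makes it awkward to try out. Below is a minimal sketch of a variant that terminates, assuming the producer is allowed to enqueue a sentinel when it is done. The names STOP, consume and demo are hypothetical and not part of the original question or answer.

```python
import asyncio
import queue
import threading

STOP = object()  # hypothetical sentinel: the producer puts this last


def consume(q):
    """Drain q on a private event loop until STOP arrives; return the handled items."""
    handled = []

    async def run_task(item):
        handled.append(item)

    async def serve():
        loop = asyncio.get_running_loop()
        pending = []
        while True:
            # The blocking q.get runs in the default thread pool executor
            item = await loop.run_in_executor(None, q.get)
            if item is STOP:
                break
            pending.append(asyncio.create_task(run_task(item)))
        # Make sure every spawned task has finished before the loop exits
        await asyncio.gather(*pending)

    asyncio.run(serve())
    return handled


def demo():
    q = queue.SimpleQueue()
    results = []
    consumer = threading.Thread(target=lambda: results.extend(consume(q)))
    consumer.start()
    for i in (1, 2, 3):
        q.put(i)
    q.put(STOP)
    consumer.join()
    return results
```

Calling demo() from the main thread plays the producer role; the consumer thread exits once it has seen STOP, so no daemon thread is needed.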

Related

Asyncio event loop within a thread issue

I am trying to create an event loop inside a thread, where the thread is started within the constructor of a class. I want to run multiple tasks within the event loop. However, whenever I try to run it with the thread, I get the error "NoneType object has no attribute create_task".
Is there something I am doing wrong in calling it?
import asyncio
import threading
Class Test():
    def __init__(self):
        self.loop = None
        self.th = threading.Thread(target=self.create)
        self.th.start()
    def __del__(self):
        self.loop.close()
    def self.create(self):
        self.loop = new_event_loop()
        asyncio.set_event_loop(self.loop)
    def fun(self):
        task = self.loop.create_task(coroutine)
        loop.run_until_complete(task)
    def fun2(self):
        task = self.loop.create_task(coroutine)
        loop.run_until_complete(task)
t = Test()
t.fun()
t.fun2()
It is tricky to combine threading and asyncio, although it can be useful if done properly.
The code you gave has several syntax errors, so obviously it isn't the code you are actually running. Please, in the future, check your post carefully out of respect for the time of those who answer questions here. You'll get better and quicker answers if you spot these avoidable errors yourself.
The keyword "class" should not be capitalized.
The class definition does not need empty parentheses.
The function definition for create should not have self. in front of it.
There is no variable named coroutine defined in the script.
The next problem is the launching of the secondary thread. The method threading.Thread.start() does not wait for the thread to actually start. The new thread is "pending" and will start sometime soon, but you don't have control over when that happens. So start() returns immediately; your __init__ method returns; and your call to t.fun() happens before the thread starts. At that point self.loop is in fact None, as the error message indicates.
A nice way to overcome this is with a threading.Barrier object, which can be used to ensure that the thread has started before the __init__ method returns.
Your __del__ method is probably not necessary, and will normally only get executed during program shutdown. If it runs under any other circumstances, you will get an error if you call loop.close on a loop that is still running. I think it's better to ensure that the thread shuts down cleanly, so I've provided a Test.close method for that purpose.
Your functions fun and fun2 are written in a way that makes them not very useful. You start a task and then you immediately wait for it to finish. In that case, there's no good reason to use asyncio at all. The whole idea of asyncio is to run more than one task concurrently. Creating tasks one at a time and always waiting for each one to finish doesn't make a lot of sense.
Most asyncio functions are not threadsafe. You have to use the two important methods loop.call_soon_threadsafe and asyncio.run_coroutine_threadsafe if you want to run asyncio code across threads. The methods fun and fun2 execute in the main thread, so you should use run_coroutine_threadsafe to launch tasks in the secondary thread.
Finally, with such programs it's usually a good idea to provide a thread shutdown method. In the following listing, close obtains a list of all the running tasks, sends a cancel message to each, and then sends the stop command to the loop itself. Then it waits for the thread to really exit. The main thread will be blocked until the secondary thread is finished, so the program will shut down cleanly.
Here is a simple working program, with all the functionality that you seem to want:
import asyncio
import threading
async def coro(s):
    print(s)
    await asyncio.sleep(3.0)
class Test:
    def __init__(self):
        self.loop = None
        self.barrier = threading.Barrier(2) # Added
        self.th = threading.Thread(target=self.create)
        self.th.start()
        self.barrier.wait() # Blocks until the new thread is running
    def create(self):
        self.loop = asyncio.new_event_loop()
        asyncio.set_event_loop(self.loop)
        self.barrier.wait()
        print("Thread started")
        self.loop.run_forever()
        print("Loop stopped")
        self.loop.close() # Clean up loop resources
    def close(self): # call this from main thread
        self.loop.call_soon_threadsafe(self._close)
        self.th.join() # Wait for the thread to exit (ensures loop is closed)
    def _close(self): # Executes in thread self.th
        tasks = asyncio.all_tasks(self.loop)
        for task in tasks:
            task.cancel()
        self.loop.call_soon(self.loop.stop)
    def fun(self):
        return asyncio.run_coroutine_threadsafe(coro("Hello 1"), self.loop)
    def fun2(self):
        return asyncio.run_coroutine_threadsafe(coro("Hello 2"), self.loop)
t = Test()
print("Test constructor complete")
t.fun()
fut = t.fun2()
# Comment out the next line if you don't want to wait here
# fut.result() # Wait for fun2 to finish
print("Closing")
t.close()
print("Finished")
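To isolate the cross-thread part of the listing above, here is a smaller sketch of asyncio.run_coroutine_threadsafe returning a result to the calling thread. It takes a slightly different route from the answer: the loop is created in the main thread before the worker thread starts, so no barrier is needed. The names add and run_in_background_loop are made up for illustration.

```python
import asyncio
import threading


async def add(a, b):
    await asyncio.sleep(0.01)
    return a + b


def run_in_background_loop():
    # Create the loop up front so it exists before any other thread touches it
    loop = asyncio.new_event_loop()
    thread = threading.Thread(target=loop.run_forever, daemon=True)
    thread.start()
    try:
        # Returns a concurrent.futures.Future that is safe to use from this thread
        future = asyncio.run_coroutine_threadsafe(add(2, 3), loop)
        return future.result(timeout=5)
    finally:
        # loop.stop is not thread-safe, so schedule it on the loop's own thread
        loop.call_soon_threadsafe(loop.stop)
        thread.join()
        loop.close()
```

The timeout on future.result is a precaution so the main thread cannot hang indefinitely if the background loop never runs the coroutine.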

Can I use multiple event loops in a program where I also use multiprocessing module

Thanks for any reply in advance.
I have the entrance program main.py:
import asyncio
from loguru import logger
from multiprocessing import Process
from app.events import type_a_tasks, type_b_tasks, type_c_tasks
def run_task(task):
    loop = asyncio.get_event_loop()
    loop.run_until_complete(task())
    loop.run_forever()
def main():
    processes = list()
    processes.append(Process(target=run_task, args=(type_a_tasks,)))
    processes.append(Process(target=run_task, args=(type_b_tasks,)))
    processes.append(Process(target=run_task, args=(type_c_tasks,)))
    for process in processes:
        process.start()
        logger.info(f"Started process id={process.pid}, name={process.name}")
    for process in processes:
        process.join()
if __name__ == '__main__':
    main()
where the different types of tasks are similarly defined, for example type_a_tasks are:
import asyncio
from . import business_1, business_2, business_3, business_4, business_5, business_6
async def type_a_tasks():
    tasks = list()
    tasks.append(asyncio.create_task(business_1.main()))
    tasks.append(asyncio.create_task(business_2.main()))
    tasks.append(asyncio.create_task(business_3.main()))
    tasks.append(asyncio.create_task(business_4.main()))
    tasks.append(asyncio.create_task(business_5.main()))
    tasks.append(asyncio.create_task(business_6.main()))
    await asyncio.wait(tasks)
    return tasks
where the main() function of businesses(1-6) are Future objects provided by asyncio, in which I implemented my business code.
Is my usage of multiprocessing and asyncio event loops above the correct way of doing it?
I am doing so because I have a lot of asynchronous tasks to perform, but it doesn't seem appropriate to put them all in one event loop, so I divided them into three parts(a, b and c) accordingly, and I hope they can be run in three different processes to exert the capability of multiple CPU cores, in the meantime taking advantage of asyncio features.
I tried running my code, where the log records show there actually are different processes but all are using the same thread/event loop(knowing this by adding process_id and thread_id to loguru format)
This seems OK. Just use asyncio.run(task()) inside run_task: it is simpler, and there is no need to call run_forever. (Also, with the run_forever call, your child processes never exit, so the join() calls in the parent never return.)
IDs of threads and other objects may repeat across processes. If you want to confirm that the processes are distinct, add the result of calling os.getpid() to your logging in the body of run_task.
(If the process IDs turn out to be the same, that would mean multiprocessing is somehow using a "dummy" backend due to some configuration in your project, which should not happen anyway.)
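A minimal sketch of the suggested run_task, assuming a stand-in type_a_tasks coroutine in place of the real business code (the stand-in and its return value are invented for illustration). It is shown without the Process wrapper so it can be run directly; the function would be passed as the Process target exactly as in the question.

```python
import asyncio
import os


async def type_a_tasks():
    # Stand-in for the real task group; yields to the loop once, then finishes
    await asyncio.sleep(0)
    return "a done"


def run_task(task):
    # os.getpid() identifies the process regardless of how thread IDs look
    print(f"pid={os.getpid()} running {task.__name__}")
    # asyncio.run creates a fresh loop, runs the coroutine, and closes the loop,
    # so the function returns and the process can exit
    return asyncio.run(task())
```

Because asyncio.run returns once the coroutine completes, each child process terminates on its own and process.join() in the parent can return.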

asyncio - creating a task but never using await

I wanted to know what happens when I call asyncio.create_task and never actually call await on the created task.
I have this simple program:
import asyncio
async def simple_task() -> None:
    while True:
        print("before sleeping")
        await asyncio.sleep(2)
        print("after sleeping")
async def test():
    task = asyncio.create_task(simple_task())
if __name__ == "__main__":
    asyncio.run(test())
I'm creating a task, which should run forever, and I'm never using await.
When running this, the output I get is:
before sleeping
Process finished with exit code 0
So my question is:
Why did the task actually run when I never called await?
If it did start running, why did it stop after the sleep?
This code is written to say “run until test completes”. test creates a task to run simple_task and then ends. The event loop gives anything already scheduled against it one last chance to run before stopping.
At this point simple_task gets to start executing. It prints, and then yields control back to the loop via asyncio.sleep.
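Nothing else was scheduled against the loop, so it stops. asyncio.run then cancels any tasks that are still pending (here simple_task, which is suspended in asyncio.sleep) and closes the loop.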
If you schedule more things against the event loop you may see more iterations of simple_task, but there’s really no way to have asyncio run something forever without waiting for it to do so (loop.run_forever is probably the closest you’ll get).
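The difference between creating a task and awaiting it can be seen side by side in the following sketch. It uses a bounded variant of simple_task that records to a list instead of printing; fire_and_forget, fire_and_await and compare are hypothetical names.

```python
import asyncio


async def simple_task(log, rounds=3):
    for i in range(rounds):
        log.append(f"before {i}")
        await asyncio.sleep(0.01)
        log.append(f"after {i}")


async def fire_and_forget(log):
    # The task is scheduled, but this coroutine returns immediately
    asyncio.create_task(simple_task(log))


async def fire_and_await(log):
    task = asyncio.create_task(simple_task(log))
    await task  # keep the main coroutine alive until the task is done


def compare():
    forgotten, awaited = [], []
    asyncio.run(fire_and_forget(forgotten))
    asyncio.run(fire_and_await(awaited))
    return forgotten, awaited
```

In the first case the task gets one step (up to its first await) before asyncio.run cancels it; in the second it runs all three rounds.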

Using wait_for with timeouts with list of tasks

So, I have a list of tasks which I want to schedule concurrently in a non-blocking fashion.
Basically, gather should do the trick.
Like
tasks = [asyncio.create_task(some_task()) for _ in bleh]
results = await asyncio.gather(*tasks)
But then, I also need a timeout. What I want is that any task which takes > timeout time cancels and I proceed with what I have.
I found the asyncio.wait primitive.
https://docs.python.org/3/library/asyncio-task.html#waiting-primitives
But then the doc says:
Run awaitable objects in the aws set concurrently and block until the condition specified by return_when.
Which seems to suggest that it blocks...
It seems that asyncio.wait_for will do the trick
https://docs.python.org/3/library/asyncio-task.html#timeouts
But how do I send in the list of awaitables rather than just an awaitable?
What I want is that any task which takes > timeout time cancels and I proceed with what I have.
This is straightforward to achieve with asyncio.wait():
# Wait for tasks to finish, but no more than a second.
done, pending = await asyncio.wait(tasks, timeout=1)
# Cancel the ones not done by now.
for fut in pending:
    fut.cancel()
# Results are available as x.result() on futures in `done`
Which seems to suggest that [asyncio.wait] blocks...
It only blocks the current coroutine, the same as gather or wait_for.
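For reference, a self-contained sketch of the asyncio.wait approach. The work coroutine, its delays, and gather_with_deadline are invented for illustration; only asyncio.wait, Task.cancel and asyncio.gather are real asyncio calls.

```python
import asyncio


async def work(delay, value):
    await asyncio.sleep(delay)
    return value


async def gather_with_deadline(timeout):
    # Two fast tasks and one that will not finish before the deadline
    tasks = [asyncio.create_task(work(d, d)) for d in (0.01, 0.02, 5)]
    done, pending = await asyncio.wait(tasks, timeout=timeout)
    for task in pending:
        task.cancel()
    # Give the cancelled tasks a chance to unwind before returning
    await asyncio.gather(*pending, return_exceptions=True)
    # `done` is an unordered set, so sort the results for a stable order
    return sorted(t.result() for t in done), len(pending)
```

Note that asyncio.wait does not raise on timeout and does not cancel anything itself; cancelling the pending set is the caller's job, as in the answer.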

Why loop.run_forever() is locking my main thread?

While learning asyncio I was trying this code:
import asyncio
from asyncio.coroutines import coroutine
#coroutine
def coro():
    counter: int = 0
    while True:
        print("Executed" + str(counter))
        counter += 1
        yield
loop = asyncio.get_event_loop()
loop.run_until_complete(coro())
loop.run_forever()
print("Finished!")
I was expecting the coroutine to be executed only once because it contains a yield and should have returned control to the caller. The output I was expecting was:
Executed 0
Finished!
I was expecting this behaviour because I thought the loop was going to run the coroutine forever once every "frame", returning to the caller after each execution (something like a background thread but in a cooperative way). But instead, it runs the coroutine forever without returning. The output is the following:
Executed 0
Executed 1
Executed 2
Executed 3
...
Could anyone explain why this happens instead of my expectations?
Cheers.
You have a couple of problems. When you call run_until_complete, it waits for coro to finish before moving on to your run_forever call. As you've defined it, coro never finishes. It contains an infinite loop that does nothing to break out of the loop. You need a break or a return somewhere inside the loop if you want to move on to the next step in your application.
Once you've done that, though, your next call is to run_forever, which, just as its name suggests, will run forever. And in this case it won't have anything to do because you've scheduled nothing else with the event loop.
I was expecting the coroutine to be executed only once because it contains a yield and should have returned control to the caller.
Looking past the fact that your coroutine has no yield, awaiting (or yielding from depending on which syntax you choose to use) does not return control to the caller of run_until_complete or run_forever. It returns control to the event loop so that it can check for anything else that has been awaited and is ready to resume.
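A short sketch of both points, using async/await syntax rather than the generator style in the question: each await hands control to the event loop (which then resumes the other coroutine), and run_until_complete returns to its caller only once the awaited coroutine has finished. The names counter, main and run are hypothetical.

```python
import asyncio


async def counter(name, limit, log):
    for i in range(limit):
        log.append(f"{name} {i}")
        # Hand control back to the event loop, not to the caller of run()
        await asyncio.sleep(0)


async def main(log):
    # Both counters run concurrently on the same loop
    await asyncio.gather(counter("a", 2, log), counter("b", 2, log))


def run():
    log = []
    loop = asyncio.new_event_loop()
    try:
        # Returns only after main() has finished, because the loops are bounded
        loop.run_until_complete(main(log))
    finally:
        loop.close()
    log.append("Finished!")
    return log
```

The interleaved entries in the returned log show the loop switching between the two counters at each await, while "Finished!" is appended only after both have completed.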
