Why yield() and cond_resched() do not yield to another thread

Why yield() and cond_resched() do not yield to another thread - linux-kernel

Why riffa_driver.c yield() function does not yield to chnl_send() thread ?

Related

Pytest a function which invokes asyncio coroutines

I am trying to write a unit testcase using mock and pytest-asyncio. I have a normal function launching an asyncio coroutine using asyncio.run. [Using python3.7]
import asyncio
async def sample_async(arg2):
# do something
proc = await asyncio.create_subprocess_shell(arg2)
# some more asyncio calls
rc = proc.returncode
return rc
def launcher(arg1, arg2):
if arg1 == "no_asyncio":
# do something
print("No asyncio")
return 0
else:
return_code = asyncio.run(sample_async(arg2))
# do something
return return_code
I was able to write unittest for the asyncio function sample_async but not for launcher. Here is what I have tried:
class AsyncMock(MagicMock):
async def __call__(self, *args, **kwargs):
return super(AsyncMock, self).__call__(*args, **kwargs)
#patch("asyncio.create_subprocess_shell", new_callable=AsyncMock)
def test_launcher(async_shell):
arg1 = "async"
arg2 = "/bin/bash ls"
sample.launcher(arg1, arg2)
async_shell.assert_called_once()
When I try to run the pytest, I keep getting RuntimeError: There is no current event loop in thread 'MainThread'. I can't use #pytest.mark.asyncio for this test as the function launcher is not an asyncio coroutine. What am I doing wrong here?

See here about explicitly creating an event loop:
RuntimeError: There is no current event loop in thread in async + apscheduler
See the documentation for get_event_loop() in particular here:
https://docs.python.org/3/library/asyncio-eventloop.html
If there is no current event loop set in the current OS thread, the OS thread is main, and set_event_loop() has not yet been called, asyncio will create a new event loop and set it as the current one.
Likely, because your OS thread is not main, you need to do some manual init'ing of the loop. Make sure you have a clear mental picture of the event loop and what these functions do in its lifecycle, think of the loop as nothing more than an object with methods and other objects (tasks) it works on. As the docs say, the main OS thread does some "magic" for you that isn't really magic just "behind the scenes".

Python asyncio - Increase the value of Semaphore

I am making use of aiohttp in one of my projects and would like to limit the number of requests made per second. I am using asyncio.Semaphore to do that. My challenge is I may want to increase/decrease the number of requests allowed per second.
For example:
limit = asyncio.Semaphore(10)
async with limit:
async with aiohttp.request(...)
...
await asyncio.sleep(1)
This works great. That is, it limits that aiohttp.request to 10 concurrent requests in a second. However, I may want to increase and decrease the Semaphore._value. I can do limit._value = 20 but I am not sure if this is the right approach or there is another way to do that.

Accessing the private _value attribute is not the right approach for at least two reasons: one that the attribute is private and can be removed, renamed, or change meaning in a future version without notice, and the other that increasing the limit won't be noticed by a semaphore that already has waiters.
Since asyncio.Semaphore doesn't support modifying the limit dynamically, you have two options: implementing your own Semaphore class that does support it, or not using a Semaphore at all. The latter is probably easier as you can always replace a semaphore-enforced limit with a fixed number of worker tasks that receive jobs through a queue. Assuming you currently have code that looks like this:
async def fetch(limit, arg):
async with limit:
# your actual code here
return result
async def tweak_limit(limit):
# here you'd like to be able to increase the limit
async def main():
limit = asyncio.Semaphore(10)
asyncio.create_task(tweak_limit(limit))
results = await asyncio.gather(*[fetch(limit, x) for x in range(1000)])
You could express it without a semaphore by creating workers in advance and giving them work to do:
async def fetch_task(queue, results):
while True:
arg = await queue.get()
# your actual code here
results.append(result)
queue.task_done()
async def main():
# fill the queue with jobs for the workers
queue = asyncio.Queue()
for x in range(1000):
await queue.put(x)
# create the initial pool of workers
results = []
workers = [asyncio.create_task(fetch_task(queue, results))
for _ in range(10)]
asyncio.create_task(tweak_limit(workers, queue, results))
# wait for workers to process the entire queue
await queue.join()
# finally, cancel the now-idle worker tasks
for w in workers:
w.cancel()
# results are now available
The tweak_limit() function can now increase the limit simply by spawning new workers:
async def tweak_limit(workers, queue, results):
while True:
await asyncio.sleep(1)
if need_more_workers:
workers.append(asyncio.create_task(fetch_task(queue, results)))

Using workers and queues is a more complex solution, you have to think about issues like setup, teardown, exception handling and backpressure, etc.
Semaphore can be implemented with Lock, if you don't mind abit of inefficiency (you will see why), here's a simple implemention for a dynamic-value semaphore:
class DynamicSemaphore:
def __init__(self, value=1):
self._lock = asyncio.Lock()
if value < 0:
raise ValueError("Semaphore initial value must be >= 0")
self.value = value
async def __aenter__(self):
await self.acquire()
return None
async def __aexit__(self, exc_type, exc, tb):
self.release()
def locked(self):
return self.value == 0
async def acquire(self):
async with self._lock:
while self.value <= 0:
await asyncio.sleep(0.1)
self.value -= 1
return True
def release(self):
self.value += 1

How to convert event-based communication into async/await using asyncio.Future

I'm trying to expose an event-based communication as a coroutine. Here is an example:
class Terminal:
async def start(self):
loop = asyncio.get_running_loop()
future = loop.create_future()
t = threading.Thread(target=self.run_cmd, args=future)
t.start()
return await future
def run_cmd(self, future):
time.sleep(3) # imitating doing something
future.set_result(1)
But when I run it like this:
async def main():
t = Terminal()
result = await t.start()
print(result)
asyncio.run(main())
I get the following error: RuntimeError: await wasn't used with future
Is it possible to achieve the desired behavior?

There are two issues with your code. One is that the args argument to the Thread constructor requires a sequence or iterable, so you need to write wrap the argument in a container, e.g. args=(future,). Since future is iterable (for technical reasons unrelated to this use case), args=future is not immediately rejected, but leads to the misleading error later down the line.
The other issue is that asyncio objects aren't thread-safe, so you cannot just call future.set_result from another thread. This causes the test program to hang even after fixing the first issue. The correct way to resolve the future from another thread is through the call_soon_threadsafe method on the event loop:
class Terminal:
async def start(self):
loop = asyncio.get_running_loop()
future = loop.create_future()
t = threading.Thread(target=self.run_cmd, args=(loop, future,))
t.start()
return await future
def run_cmd(self, loop, future):
time.sleep(3)
loop.call_soon_threadsafe(future.set_result, 1)
If your thread is really just calling a blocking function whose result you're interested in, consider using run_in_executor instead of manually spawning threads:
class Terminal:
async def start(self):
loop = asyncio.get_running_loop()
return await loop.run_in_executor(None, self.run_cmd)
# Executed in a different thread; `run_in_executor` submits the
# callable to a thread pool, suspends the awaiting coroutine until
# it's done, and transfers the result/exception back to asyncio.
def run_cmd(self):
time.sleep(3)
return 1

Asyncio task vs coroutine

Reading the asyncio documentation, I realize that I don't understand a very basic and fundamental aspect: the difference between awaiting a coroutine directly, and awaiting the same coroutine when it's wrapped inside a task.
In the documentation examples the two calls to the say_after coroutine are running sequentially when awaited without create_task, and concurrently when wrapped in create_task. So I understand that this is basically the difference, and that it is quite an important one.
However what confuses me is that in the example code I read everywhere (for instance showing how to use aiohttp), there are many places where a (user-defined) coroutine is awaited (usually in the middle of some other user-defined coroutine) without being wrapped in a task, and I'm wondering why that is the case. What are the criteria to determine when a coroutine should be wrapped in a task or not?

What are the criteria to determine when a coroutine should be wrapped in a task or not?
You should use a task when you want your coroutine to effectively run in the background. The code you've seen just awaits the coroutines directly because it needs them running in sequence. For example, consider an HTTP client sending a request and waiting for a response:
# these two don't make too much sense in parallel
await session.send_request(req)
resp = await session.read_response()
There are situations when you want operations to run in parallel. In that case asyncio.create_task is the appropriate tool, because it turns over the responsibility to execute the coroutine to the event loop. This allows you to start several coroutines and sit idly while they execute, typically waiting for some or all of them to finish:
dl1 = asyncio.create_task(session.get(url1))
dl2 = asyncio.create_task(session.get(url2))
# run them in parallel and wait for both to finish
resp1 = await dl1
resp2 = await dl2
# or, shorter:
resp1, resp2 = asyncio.gather(session.get(url1), session.get(url2))
As shown above, a task can be awaited as well. Just like awaiting a coroutine, that will block the current coroutine until the coroutine driven by the task has completed. In analogy to threads, awaiting a task is roughly equivalent to join()-ing a thread (except you get back the return value). Another example:
queue = asyncio.Queue()
# read output from process in an infinite loop and
# put it in a queue
async def process_output(cmd, queue, identifier):
proc = await asyncio.create_subprocess_shell(cmd)
while True:
line = await proc.readline()
await queue.put((identifier, line))
# create multiple workers that run in parallel and pour
# data from multiple sources into the same queue
asyncio.create_task(process_output("top -b", queue, "top")
asyncio.create_task(process_output("vmstat 1", queue, "vmstat")
while True:
identifier, output = await queue.get()
if identifier == 'top':
# ...
In summary, if you need the result of a coroutine in order to proceed, you should just await it without creating a task, i.e.:
# this is ok
resp = await session.read_response()
# unnecessary - it has the same effect, but it's
# less efficient
resp = await asyncio.create_task(session.read_reponse())
To continue with the threading analogy, creating a task just to await it immediately is like running t = Thread(target=foo); t.start(); t.join() instead of just foo() - inefficient and redundant.

Does await guarantee execution order?

Consider a single-threaded Python program. A coroutine named "first" is blocked on I/O. The subsequent instruction is "await second." Is the coroutine "second" guaranteed to execute immediately until it blocks on I/O? Or, can "first" resume executing (due to the I/O operation completing) before "second" is invoked?

Asyncio implemented a way that second would start executing until it would return control to event loop (it usually happens when it reaches some I/O operation) and only after it first can be resumed. I don't think it somehow guaranteed to you, but hardly believe this implementation will be changed either.
If for some reason you don't want first to resume executing until some part of second reached, it's probably better explicitly to use Lock to block first from executing before moment you want.
Example to show when control returns to event loop and execution flow can be changed:
import asyncio
async def async_print(text):
print(text)
async def first():
await async_print('first 1')
await async_print('first 2')
await asyncio.sleep(0) # returning control to event loop
await async_print('first 3')
async def second():
await async_print('second 1')
await async_print('second 2')
await asyncio.sleep(0) # returning control to event loop
await async_print('second 3')
async def main():
asyncio.ensure_future(first())
asyncio.ensure_future(second())
await asyncio.sleep(1)
loop = asyncio.new_event_loop()
asyncio.set_event_loop(loop)
try:
loop.run_until_complete(main())
finally:
loop.run_until_complete(loop.shutdown_asyncgens())
loop.close()
Output:
first 1
first 2
second 1
second 2
first 3
second 3

Develop Reference

ruby bash windows laravel spring algorithm oracle macos go visual-studio

Why yield() and cond_resched() do not yield to another thread - linux-kernel

Why riffa_driver.c yield() function does not yield to chnl_send() thread ?

Related

Pytest a function which invokes asyncio coroutines

Python asyncio - Increase the value of Semaphore

How to convert event-based communication into async/await using asyncio.Future

Asyncio task vs coroutine

Does await guarantee execution order?

Categories

Resources