Using wait_for with timeouts with list of tasks - python-asyncio

So, I have a list of tasks which I want to schedule concurrently in a non-blocking fashion.
Basically, gather should do the trick.
Like
tasks = [asyncio.create_task(some_task()) for _ in bleh]
results = await asyncio.gather(*tasks)
But then, I also need a timeout. What I want is that any task which takes > timeout time cancels and I proceed with what I have.
I found the asyncio.wait primitive.
https://docs.python.org/3/library/asyncio-task.html#waiting-primitives
But then the doc says:
Run awaitable objects in the aws set concurrently and block until the condition specified by return_when.
Which seems to suggest that it blocks...
It seems that asyncio.wait_for will do the trick
https://docs.python.org/3/library/asyncio-task.html#timeouts
But how do I send in a list of awaitables rather than just a single awaitable?

What I want is that any task which takes > timeout time cancels and I proceed with what I have.
This is straightforward to achieve with asyncio.wait():
# Wait for tasks to finish, but no more than a second.
done, pending = await asyncio.wait(tasks, timeout=1)
# Cancel the ones not done by now.
for fut in pending:
    fut.cancel()
# Results are available as x.result() on futures in `done`
Which seems to suggest that [asyncio.wait] blocks...
It only blocks the current coroutine, the same as gather or wait_for.
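To make that concrete, here is a minimal self-contained sketch of the asyncio.wait() approach. The coroutine some_task, its delays, and the helper name gather_with_timeout are made up for illustration; only asyncio.wait(), Task.cancel() and asyncio.gather() are real asyncio calls.

```python
import asyncio


async def some_task(delay: float, label: str) -> str:
    # Hypothetical stand-in for real work: sleeps, then returns its label.
    await asyncio.sleep(delay)
    return label


async def gather_with_timeout(timeout: float) -> list:
    # Hypothetical workload: two fast tasks and one that exceeds the timeout.
    tasks = [
        asyncio.create_task(some_task(0.01, "fast-1")),
        asyncio.create_task(some_task(0.02, "fast-2")),
        asyncio.create_task(some_task(5.0, "slow")),
    ]
    # Suspends only this coroutine; the event loop keeps running the tasks.
    done, pending = await asyncio.wait(tasks, timeout=timeout)
    for task in pending:
        task.cancel()
    # Give the cancelled tasks a chance to unwind; return_exceptions=True
    # keeps their CancelledError from propagating into this coroutine.
    await asyncio.gather(*pending, return_exceptions=True)
    # Walk the original list so results stay in submission order
    # (the `done` set itself is unordered).
    return [task.result() for task in tasks if task in done]


if __name__ == "__main__":
    print(asyncio.run(gather_with_timeout(0.5)))
```

asyncio.wait_for(asyncio.gather(*tasks), timeout) also accepts a list this way, but on timeout it cancels the gather and every unfinished task along with it, and raises TimeoutError rather than returning the partial results. That is why asyncio.wait() fits the "proceed with what I have" requirement better.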

Related

what is the best way to get notified when a task finishes, in F#?

I have a pool of tasks and I am trying to figure out the best way to be notified, through an event, when one is finished.
Since the tasks are quite varied, I don't want to add a piece of code inside the task itself since that would mean putting it in several places. These are long running tasks, I'm not waiting for them to complete anywhere, they're just getting started, do their work (minutes to days) and then they finish.
The ugly-but-could-work solution is to wrap each work task into another task that awaits for the work task to be complete and then sends an event, but I'm hoping there would be something more elegant.
In a comment you explained that you're starting your tasks like this:
Async.StartAsTask (runner.Start(), TaskCreationOptions.LongRunning, cancellationSource.Token)
Instead of doing that, start them like this:
startMyTask runner cancellationSource (fun() -> printfn "Task completed!")
Where:
let startMyTask (runner: RunnerType) (s: CancellationTokenSource) onDone =
    let wrapper = async {
        do! runner.Start()
        onDone()
    }
    Async.StartAsTask (wrapper, TaskCreationOptions.LongRunning, s.Token)
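For readers coming from the asyncio question above, the same wrapper idea can be sketched in Python (this is an analogue, not F#). The names notify_when_done and runner_start are hypothetical; runner_start merely stands in for runner.Start().

```python
import asyncio


async def notify_when_done(work, on_done):
    # Await the real work, then fire the callback. As in the F# wrapper,
    # the callback is skipped if the work raises.
    result = await work
    on_done()
    return result


async def runner_start() -> str:
    # Hypothetical stand-in for runner.Start().
    await asyncio.sleep(0.01)
    return "work finished"


async def main() -> list:
    events = []
    task = asyncio.create_task(
        notify_when_done(runner_start(), lambda: events.append("Task completed!"))
    )
    # Awaited here only so the demo can exit cleanly; real callers could
    # start the task and carry on.
    await task
    return events


if __name__ == "__main__":
    print(asyncio.run(main()))
```

If the notification should also fire on failure, wrapping the await in try/finally would be one way to do it. In asyncio, Task.add_done_callback() offers a similar hook without a wrapper coroutine.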

Why am I not allowed to break a Promise?

The following simple Promise is vowed and I am not allowed to break it.
my $my_promise = start {
    loop {} # or sleep x;
    'promise response'
}
say 'status : ', $my_promise.status; # status : Planned
$my_promise.break('promise broke'); # Access denied to keep/break this Promise; already vowed
# in block <unit> at xxx line xxx
Why is that?
Because the Promise is vowed, you cannot change it: only something that actually has the vow, can break the Promise. That is the intent of the vow functionality.
What are you trying to achieve by breaking the promise as you showed? Is it to stop the work being done inside of the start block? Breaking the Promise would not do that. And the vow mechanism was explicitly added to prevent you from thinking it can somehow stop the work inside a start block.
If you want work inside a start block to be interruptible, you will need to add some kind of semaphore that is regularly checked, for instance:
my int $running = 1;
my $my_promise = start {
    while $running {
        # do stuff
    }
    $running
}
# do other stuff
$running = 0;
await $my_promise;
Hope this made sense.
The reasons why you cannot directly keep/break a Promise from outside, or stop it on the thread pool, are explained here in Jonathan's comment.
A common misuse of Promises comes from the timeout pattern.
await Promise.anyof(
    start { sleep 4; say "finished"; },
    Promise.in( 1 )
);
say "moving on...";
sleep;
This will print "finished" even though the code has already moved on. Once users realize this, the next logical step for them is to try to kill the obsolete Promise. The only correct way to solve it, however, is to make the Promise aware that its work is no longer needed, for example by periodically checking some shared variable.
Things get complicated if you have blocking code in a Promise (for example a database query) that runs for too long and you want to terminate it from the main thread. That is not doable with Promises. All you can do is ensure the Promise will run in finite time (for example, on MySQL, by setting MAX_EXECUTION_TIME before running the query). Then you have a choice:
You can grit your teeth and patiently wait for the Promise to finish, for example if you really must disconnect from the database in the main thread.
Or you can move on immediately and allow the "abandoned" Promise to finish on its own, without ever receiving its result. In this case you should control how many of those Promises can stack up in the background by using a Semaphore or by running them on a dedicated ThreadPoolScheduler.
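The same behaviour exists in other runtimes. As a rough Python analogue (not Raku), the sketch below shows that a timeout only ends the wait, and that the worker stops only because it checks a shared flag. The names worker and run_with_timeout are made up for illustration.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor, TimeoutError as FutureTimeout


def worker(stop: threading.Event, log: list) -> str:
    # Cooperative loop: do a small slice of work, then check the flag.
    while not stop.is_set():
        time.sleep(0.01)  # placeholder for one slice of real work
    log.append("worker noticed stop flag")
    return "stopped early"


def run_with_timeout(timeout: float) -> tuple:
    stop = threading.Event()
    log = []
    with ThreadPoolExecutor(max_workers=1) as pool:
        future = pool.submit(worker, stop, log)
        try:
            future.result(timeout=timeout)
        except FutureTimeout:
            # The timeout only ended our wait; the worker is still running.
            log.append("timed out, worker still running: %s" % future.running())
            stop.set()  # tell the worker its result is no longer needed
        result = future.result()  # returns promptly once the flag is seen
    return result, log


if __name__ == "__main__":
    print(run_with_timeout(0.1))
```

Without the stop.set() call the worker would keep running after the timeout, which is the "abandoned" case described above.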

stop specific process in python ProcessPoolExecutor or shared state between them

This is my code
import logging
from concurrent.futures import ProcessPoolExecutor

def long_stage_task(node, deployment_folder_name, stage_s3_bucket):
    global workers
    logging.info("starting....")
    work = StageOS(node, deployment_folder_name, stage_s3_bucket)  # StageOS is a class
    work.stagestart()  # method of that class

executor = ProcessPoolExecutor(5)
executor.submit(long_stage_task, i, deployment_folder_name, stage_s3_bucket)
Now how can I stop a particular process/pid?
Is there any way to pass globals or shared state between them? I don't see anything in the docs.
https://docs.python.org/3/library/concurrent.futures.html
You could pass to the workers a list of Events and set them when you want the worker to stop. This implies your long_stage_task function periodically checks its own Event.
If what you are after is stopping a task which is taking too long, you can take a look at pebble. It allows setting timeouts on function calls as well as cancelling ongoing tasks.
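A minimal sketch of the Event approach, under some assumptions: the events come from a multiprocessing.Manager (a plain multiprocessing.Event cannot be passed through executor.submit), and the simplified long_stage_task below is a hypothetical stand-in that does its work in small steps so it can check its event between them. The node names are made up.

```python
import time
from concurrent.futures import ProcessPoolExecutor
from multiprocessing import Manager


def long_stage_task(node, stop_event, max_steps=10, step_seconds=0.05):
    # Hypothetical stand-in for the staging work: runs in small steps and
    # checks its own stop event before each one.
    for step in range(max_steps):
        if stop_event.is_set():
            return f"{node}: stopped at step {step}"
        time.sleep(step_seconds)  # placeholder for one slice of real work
    return f"{node}: finished"


def main():
    nodes = ["node-a", "node-b", "node-c"]  # hypothetical node names
    with Manager() as manager, ProcessPoolExecutor(max_workers=3) as executor:
        # One event per task, so each task can be stopped individually.
        stop_events = {node: manager.Event() for node in nodes}
        futures = {
            node: executor.submit(long_stage_task, node, stop_events[node])
            for node in nodes
        }
        time.sleep(0.15)
        stop_events["node-b"].set()  # ask only this task to stop
        for future in futures.values():
            print(future.result())


if __name__ == "__main__":
    main()
```

This only works if the real work can be split into steps; a single long blocking call inside stagestart() would never reach the check.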

How to perform asynchronous tasks in fixed interval of time

The goal is to perform an async task (file read, network operation) without blocking the code. We have multiple such async tasks that need to be started at fixed intervals. Here is some pseudocode to demonstrate this.
# the async tasks should be performed in parallel
# provide me with a return value after the task is complete, or they can have a callback or any other mechanism of communication
async_task_1 = perform_async(1)
# now I need to wait a fixed amount of time before async task 2
sleep(5)
# this one is similar in nature to task 1
async_task_2 = perform_async(2)
# finally do something with the result
I'm reading that in Ruby I have two options: forking and threading. There is also something called Fiber. I also read that due to the GIL in base Ruby, I won't be able to make much use of threading. I still want to stick to base Ruby.
I've written some parallel code previously in OMP and CUDA, but I've never had a chance to do that in Ruby.
Can you suggest how to achieve this?
I would recommend the concurrent-ruby gem with its async feature. This will work great as long as your tasks are IO-bound (as you said they are).
Its async feature lets you run your tasks in the background. To wait between your two async calls, you can literally use the sleep function:
require 'concurrent'

class AsyncCalls
  include Concurrent::Async

  def perform_task(params)
    # IO bound task
  end
end

AsyncCalls.new.async.perform_task("param")
sleep 5
AsyncCalls.new.async.perform_task("other param")
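For comparison with the asyncio question at the top, the same "start, wait, start again" shape can be sketched in Python (an analogue, not Ruby). The coroutine perform_task and its delay are hypothetical placeholders for the IO-bound work.

```python
import asyncio


async def perform_task(param: str) -> str:
    # Hypothetical IO-bound task.
    await asyncio.sleep(0.05)
    return f"done: {param}"


async def main(interval: float) -> list:
    first = asyncio.create_task(perform_task("param"))
    # The first task keeps running while we wait out the interval.
    await asyncio.sleep(interval)
    second = asyncio.create_task(perform_task("other param"))
    # Finally, do something with the results.
    return await asyncio.gather(first, second)


if __name__ == "__main__":
    print(asyncio.run(main(5)))
```

The important detail is that the wait between the two starts is a non-blocking sleep, so the first task is not held up by it.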

Call EventMachine defer within callback?

I'm using EventMachine.defer to handle some long-running processes (an indefinite wait for a response from an outside application). I want to do this in a loop: each time the application responds, I process the response and then immediately want to start waiting for the next response.
My code currently looks like this:
def watch_for_songs_change
  EM.defer(
    ->( ){ `mpc idle playlist` }, # wait for the song list to change
    ->(_){ update_songs; watch_for_songs_change }
  )
end
I realized that this is calling defer from within a callback from defer. Is this valid? Am I spawning one thread from inside another, and will eventually run out of threads? Or does EventMachine invoke the callback after it has returned the thread to the pool?
I've tried to chain calls like this before in EM, and found that using periodic timers is usually a better design.
#timer = EventMachine.add_periodic_timer( 1 ) { `mpc idle playlist` and update_songs }
