Async tasks returning no results - async-await

I have around 10-15 Ecto queries that I want to run asynchronously in my API code.
I am using Task.async and Task.yield_many.
Following is the code for the async tasks:
def get_tasks() do
  task_1 =
    Task.async(SomeModule, :some_function, [
      param_1,
      param_2
    ])

  task_2 =
    Task.async(SomeModule, :some_function, [
      param_1,
      param_2
    ])

  task_3 =
    Task.async(SomeModule, :some_function, [
      param_1,
      param_2
    ])

  [task_1, task_2, task_3]
end
I collect the task results in my main function as follows:
[
  {_, task_1},
  {_, task_2},
  {_, task_3}
] =
  [
    task_1,
    task_2,
    task_3
  ]
  |> MyCode.TaskHelper.yield_multiple_tasks()
And my task helper code is given below:
defmodule MyCode.TaskHelper do
  def get_results_or_shutdown(tasks_with_results) do
    Enum.map(tasks_with_results, fn {task, res} ->
      res || Task.shutdown(task, :brutal_kill)
    end)
  end

  @doc """
  Returns the results of multiple tasks run in parallel.

  ## Parameters

    - task_list: list, a list of all tasks
  """
  def yield_multiple_tasks(task_list) do
    task_list
    |> Task.yield_many()
    |> get_results_or_shutdown()
  end
end
Each task runs an Ecto query.
The issue is that the tasks behave unpredictably. Sometimes they return results and sometimes they don't, and at no point have all of the tasks returned results (I have shown 3 tasks here for illustration,
but I actually have around 10-15 async tasks).
When I ran the queries synchronously, they returned the correct results (obviously). I also tried raising the Repo pool_size in config to 50, but to no avail.
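In case it is relevant: as I understand it, Task.yield_many/2 only waits for a default timeout of 5 seconds before handing back whatever has finished by then. Here is a minimal sketch of how an explicit timeout could be threaded through my helper (the 30_000 is just an illustrative value, not something from my actual config):

  # hypothetical variant of the helper with a configurable timeout
  def yield_multiple_tasks(task_list, timeout \\ 30_000) do
    task_list
    |> Task.yield_many(timeout)
    |> get_results_or_shutdown()
  end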
Can someone please help me out with this? I am quite stuck here.

Is there a way to check another autocomplete that was already completed, i.e. checking if an option was x in another autocomplete for y

Specifically, I'm trying to make a client-side Discord music player that plays specific albums (for organization, and also high-quality audio), and I cannot figure out for the life of me how to get the value from one autocomplete that was already completed before the second autocomplete runs.
I tried reading the first autocomplete via bot.tree.varName, which obviously didn't work, and I can't see a way to look up a specific slash command's autocomplete through the namespace either.
In other words:
# First autocomplete
@playalbum.autocomplete("album")
async def playalbum_autocomplete(
    interaction: discord.Interaction,
    current: str,
) -> list[app_commands.Choice[str]]:
    # Would define from index, but not for now.
    AlbumList = ["test"]
    return [
        app_commands.Choice(name="album", value="album")
        for album in AlbumList if current.lower() in album.lower()
    ]
# Starting on the next autocomplete
@playalbum.autocomplete("song")
async def playalbum_autocomplete(
    interaction: discord.Interaction,
    current: str,
) -> list[app_commands.Choice[str]]:
    # Would define from index, but not for now.
    if app_commands.Namespace.album == "test":  # Point that didn't work.
        songList = ["testSong"]
        return [
            app_commands.Choice(name="song", value="song")
            for song in songList if current.lower() in song.lower()
        ]
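What I am imagining is roughly the following (an untested sketch; I am assuming here that interaction.namespace carries the options the user has already filled in, which is exactly the part I am unsure about):

@playalbum.autocomplete("song")
async def playalbum_song_autocomplete(
    interaction: discord.Interaction,
    current: str,
) -> list[app_commands.Choice[str]]:
    # Assumption: the already-chosen "album" option can be read from the
    # namespace of this interaction rather than from the Namespace class itself.
    chosen_album = getattr(interaction.namespace, "album", None)
    songList = ["testSong"] if chosen_album == "test" else []
    return [
        app_commands.Choice(name=song, value=song)
        for song in songList if current.lower() in song.lower()
    ]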

"working" terminal prompt running in parallel Python 3.10

I am trying to show an animated "working" prompt while running some Python code. I've been searching for a way to do this, but the solutions I've found are not quite what I want (if I recall correctly, tqdm and alive-progress require a for loop with a defined number of iterations), and I'd like to find a way to code it myself.
The closest I've gotten to making it is using asyncio as follows:
async def main():
    dummy_task = asyncio.create_task(dummy_search())
    bar_task = asyncio.create_task(progress())
    test = await dummy_task
    bar_task.cancel()
where dummy_task can be any async task and bar_task is:
FLUSH_LINE = "\033[K"

async def progress(mode=""):
    def integers():
        n = 0
        while True:
            yield n
            n += 1

    progress_indicator = ["-", "\\", "|", "/"]
    message = "working"
    message_len = len(message)
    message += "-" * message_len
    try:
        if not mode:
            for i in (n for n in integers()):
                await asyncio.sleep(0.05)
                message = message[1:] + message[0]
                print(f"{FLUSH_LINE}{progress_indicator[i % 4]} [{message[:message_len]}]", end="\r")
    finally:
        print(f"{FLUSH_LINE}")
The only problem with this approach is that asyncio does not actually run tasks in parallel, so if the dummy_task does not use await at any point, the bar_task will not run until the dummy task is complete, and it won't show the working prompt in the terminal.
How should I go about trying to run both tasks in parallel? Do I need to use multiprocessing? If so, would both tasks write to the same terminal by default?
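For what it's worth, here is a rough sketch of the kind of thing I have in mind (untested; assuming the blocking work is a plain function like blocking_work below), pushing the blocking call onto a worker thread with asyncio.to_thread so the event loop stays free to animate the spinner:

import asyncio

def blocking_work():
    # stand-in for any non-async, blocking piece of code
    total = 0
    for i in range(50_000_000):
        total += i
    return total

async def main():
    # progress() is the spinner coroutine defined above
    bar_task = asyncio.create_task(progress())
    # run the blocking function in a worker thread (Python 3.9+),
    # leaving the event loop free to keep scheduling the spinner
    result = await asyncio.to_thread(blocking_work)
    bar_task.cancel()
    return result

asyncio.run(main())

(I am also wondering whether loop.run_in_executor with a ProcessPoolExecutor would be the multiprocessing-flavoured equivalent.)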
Thanks in advance.

How to run a REST call simultaneously or with a lower priority

I am loading data via a REST call and rendering it. After that, I call another REST API, which takes about 10 seconds. During this time, I can't make any other REST calls until that one has finished. My question is: how can I avoid blocking the other calls?
I tried using a thread, but it is not working; maybe I am doing something wrong, or maybe threads are not the right choice?
This is the called route:
get '/api/dashboard/:dbnum/block/:blnum/inbackground/:inbackground' do
  user = get_current_userobject
  return assemble_error('LOGIN', 'NOTLOGGEDIN', {}, []).rest_fail if !user

  dbnum, blnum = params[:dbnum].to_i, params[:blnum].to_i
  return { rows: [] }.rest_success if !user.dashboardinfo || !user.dashboardinfo[dbnum] || !user.dashboardinfo[dbnum]['blocks'] || !(block = user.dashboardinfo[dbnum]['blocks'][blnum]) || !respond_to?("dashboard_type_#{block['type']}", true)

  if params[:inbackground] == 'true'
    t = Thread.new do
      t.priority = -1
      ret = method("dashboard_type_#{block['type']}").call(block['filters'], false, true)
      ret.rest_success
    end
    t.join
    t.exit
  else
    ret = method("dashboard_type_#{block['type']}").call(block['filters'], false, false)
    ret.rest_success
  end
end
How can I run the code on lines 8 to 22 in the 'background', so that other calls effectively have priority?
The command t.join waits for a thread to finish. If you want your thread to run in the background, just fire and forget:
get '/api/dashboard/:dbnum/block/:blnum/inbackground/:inbackground' do
  user = get_current_userobject
  return assemble_error('LOGIN', 'NOTLOGGEDIN', {}, []).rest_fail if !user

  dbnum, blnum = params[:dbnum].to_i, params[:blnum].to_i
  return { rows: [] }.rest_success if !user.dashboardinfo || !user.dashboardinfo[dbnum] || !user.dashboardinfo[dbnum]['blocks'] || !(block = user.dashboardinfo[dbnum]['blocks'][blnum]) || !respond_to?("dashboard_type_#{block['type']}", true)

  if params[:inbackground] == 'true'
    t = Thread.new do
      t.priority = -1
      ret = method("dashboard_type_#{block['type']}").call(block['filters'], false, true)
      ret.rest_success
    end
  else
    ret = method("dashboard_type_#{block['type']}").call(block['filters'], false, false)
    ret.rest_success
  end
end
Of course the problem with this is that you get a bunch of dead threads building up as your server runs. And if you're working in a REST API (designed to be stateless), it might not be as simple as throwing your threads into an array and periodically cleaning them up.
Ultimately, I think, you should look into asynchronous job handlers. I've worked with Sidekiq and had a decent time with it, but I don't have enough experience to give you a wholehearted recommendation.
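For illustration only, a bare-bones Sidekiq job for this route might look something like the sketch below; the class name, the DashboardRenderer helper, and the User lookup are all made up, so adapt them to however your blocks and filters are actually stored:

# hypothetical worker, e.g. app/workers/dashboard_block_worker.rb
class DashboardBlockWorker
  include Sidekiq::Worker

  def perform(user_id, dbnum, blnum)
    user  = User.find(user_id)                          # assumed lookup
    block = user.dashboardinfo[dbnum]['blocks'][blnum]
    # do the slow work outside the request/response cycle
    DashboardRenderer.new(block['type'], block['filters']).call
  end
end

# enqueued from the route instead of Thread.new:
# DashboardBlockWorker.perform_async(user.id, dbnum, blnum)

The route can then return immediately (e.g. an "accepted" response), and the client polls or is notified when the block's data is ready.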

Why does using asyncio.ensure_future for long jobs instead of await run so much quicker?

I am downloading JSON files from an API and am using the asyncio module. The crux of my question is this: with the following event loop, implemented like so:
loop = asyncio.get_event_loop()
main_task = asyncio.ensure_future( klass.download_all() )
loop.run_until_complete( main_task )
and download_all() implemented as this instance method of a class, which already has downloader objects created and available to it and thus calls each one's respective download method:
async def download_all(self):
    """ Builds the coroutines, uses asyncio.wait, then sifts for those still pending, loops """
    ret = []
    async with aiohttp.ClientSession() as session:
        pending = []
        for downloader in self._downloaders:
            pending.append(asyncio.ensure_future(downloader.download(session)))
        while pending:
            dne, pnding = await asyncio.wait(pending)
            ret.extend([d.result() for d in dne])
            # Get all the tasks, cannot use "pnding"
            tasks = asyncio.Task.all_tasks()
            pending = [tks for tks in tasks if not tks.done()]
            # Exclude the one that we know hasn't ended yet (UGLY)
            pending = [t for t in pending if not t._coro.__name__ == self.download_all.__name__]
    return ret
Why is it that, when I use asyncio.ensure_future in the downloaders' download methods instead of the await syntax, everything runs much faster, i.e. seemingly more "asynchronously", judging from the logs?
This only works because of the way I detect all the tasks that are still pending, which keeps the download_all method from completing and keeps calling asyncio.wait.
I thought the await keyword allowed the event loop mechanism to do its thing and share resources efficiently. How come doing it this way is faster? Is there something wrong with it? For example:
async def download(self, session):
    async with session.request(self.method, self.url, params=self.params) as response:
        response_json = await response.json()
        # Not using await here, as I am "supposed" to
        asyncio.ensure_future(self.write(response_json, self.path))
        return response_json

async def write(self, res_json, path):
    # using aiofiles to write, but it doesn't (seem to?) support direct json
    # so converting to raw text first
    txt_contents = json.dumps(res_json, **self.json_dumps_kwargs)
    async with aiofiles.open(path, 'w') as f:
        await f.write(txt_contents)
With full code implemented and a real API, I was able to download 44 resources in 34 seconds, but when using await it took more than three minutes (I actually gave up as it was taking so long).
When you await inside each iteration of the for loop, each download is awaited before the next iteration starts, so the downloads run one after another.
When you use ensure_future instead, it doesn't wait there: it creates tasks that download all the files concurrently and only awaits them later, in the second loop.
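A minimal, self-contained sketch of the difference (illustrative only, with asyncio.sleep standing in for the network call):

import asyncio

async def fetch(i):
    await asyncio.sleep(1)   # stand-in for one HTTP request
    return i

async def sequential():
    # each await must finish before the next request starts: ~3 seconds total
    return [await fetch(i) for i in range(3)]

async def concurrent():
    # all three tasks are scheduled up front, then awaited together: ~1 second total
    tasks = [asyncio.ensure_future(fetch(i)) for i in range(3)]
    return await asyncio.gather(*tasks)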

Is it reasonable to use Resque (Ruby) to manage external long-running commands (and log tasks)?

I frequently have to run bash heavy-job.sh <data-num> (which takes 0.5~2 days) on my computer to process data located at ~/a/data/num. The script calls a few sub-processes sequentially and writes a log to ~/a/result/num.log. Until now I have done this manually.
I wanted to visualize the processed tasks, their status (success or fail), etc. as an HTML table. I wrote a simple Sinatra app that renders a table showing
the list of ~/a/data/num to be processed
whether ~/a/result/num.log exists (process not launched / processing / done)
its status (whether the log file contains the word "error")
I realized it would be convenient if I could launch bash heavy-job.sh <data-num> from the Sinatra app, log the tasks (plus info like time, date, etc.) and their args (heavy-job.sh takes some optional args), and show them in the HTML table.
So I need something that manages jobs and logs them to files (or a DB).
First I wrote the code below as a test (just a test, not yet integrated with my system), but later I found that Resque is what I wanted. I am a beginner and not sure whether this decision is reasonable.
My questions are:
Is it reasonable to use Resque to manage external long-running commands (and log tasks),
or should I use another tool (not necessarily a Ruby tool)?
(Extra:) Should the task manager and the Sinatra app work separately (and communicate over REST or something), or not?
The jobs are not critical, since I can retry tasks manually later if they fail.
I am not good at English and my question may be misleading. I appreciate any help :). My test code follows:
class TaskSpawn
  def initialize()
    @pids = []
  end

  def spawn(command, options = {})
    @opt = {:pgroup => true}
    @pids << Kernel.spawn(command, options)
  end

  def pids()
    return @pids.clone
  end

  def waitany_nohang()
    delete_idx = nil
    ret = nil
    @pids.each_with_index do |p, idx|
      pid, status = Process.waitpid2(p, Process::WNOHANG)
      unless pid.nil?
        delete_idx = idx
        ret = [pid, status]
        break
      end
    end
    if delete_idx
      @pids.delete_at(delete_idx)
      return ret
    else
      # no task finished
      return nil
    end
  end

  def waitall()
    # wait for all spawned children to finish
    ret = Process.waitall
    raise "internal error" if ret.size != pids.size
    return ret
  end
end
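For reference, the Resque version I have in mind would be roughly the sketch below (not integrated or tested; the queue name and the way the log is captured are just guesses on my part):

# hypothetical Resque job wrapping the external script
class HeavyJob
  @queue = :heavy_jobs

  def self.perform(data_num)
    log_path = File.expand_path("~/a/result/#{data_num}.log")
    # run the external script, sending stdout and stderr to the log file
    ok = system("bash heavy-job.sh #{data_num}",
                out: [log_path, "w"], err: [:child, :out])
    raise "heavy-job.sh failed for #{data_num}" unless ok
  end
end

# enqueued from the Sinatra app:
# Resque.enqueue(HeavyJob, num)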
