Is it possible to specify the process priority for an Ansible task?
The use case is setting a low priority for an expensive and long-running backup task. In a bash script I'd use nice for this. I did not find anything by searching using keywords "process priority" and "nice" combined with "Ansible".
Async tasks let you run a task in the background, so a long-running task does not block the tasks that follow it. This works as long as the remaining tasks are independent of the task marked async, and it can reduce the total wait time.
For example, suppose one task downloads a huge file and the next task is a completely independent command that also takes some time. Because the async task runs in the background, the independent commands can finish while the download is still in progress.
Link to the documentation:
https://docs.ansible.com/ansible/latest/user_guide/playbooks_async.html
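On the priority part of the question, one option is to prefix the command itself with nice, exactly as in a bash script; the same command line could then be handed to a command or shell task. A minimal sketch, where nice with no arguments stands in for the backup command (an assumption for illustration only) so that the effective niceness is visible:

```shell
#!/bin/sh
# Niceness of the current shell, for comparison.
base_niceness=$(nice)

# Stand-in for the expensive backup command: "nice" with no arguments
# just prints the niceness it is running at. Replace the inner "nice"
# with the real command (hypothetical example: ./backup.sh).
low_priority_niceness=$(nice -n 19 nice)

echo "shell niceness: $base_niceness"
echo "job niceness:   $low_priority_niceness"
```

A higher niceness means a lower scheduling priority, so the job yields CPU time to everything else on the host.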
I'm having trouble understanding the difference between async, forks, and serial in Ansible.
They seem to do almost the same job. I found the following on Google:
serial: Decides the number of nodes processed for each task in a single run.
forks: Maximum number of simultaneous connections Ansible makes for each task.
async: Runs multiple tasks in a playbook concurrently.
Serial sets a number, a percentage, or a list of numbers of hosts you want to manage at a time.
Async tells Ansible to run the task in the background so that it can be checked or followed up later. Its value is the maximum time Ansible will wait for that particular job or task to complete before it times out.
Ansible works by spinning off forks of itself and talking to many remote systems independently. The forks parameter controls how many hosts are configured by Ansible in parallel. By default, the forks parameter in Ansible is a very conservative 5. This means that only 5 hosts will be configured at the same time, and it's expected that every user will change this parameter to something more suitable for their environment. A good value might be 25 or even 100.
SERIAL: Decides the number of nodes processed for each task in a single run.
Use: When you need to apply changes in batches (rolling changes).
FORKS: Maximum number of simultaneous connections Ansible makes for each task.
Use: When you need to manage how many nodes are affected simultaneously.
For example, I have 5 control tasks and 100 regular tasks, with mixing settings of 4 tasks and 1 control task.
What happens when a worker (aka Toloker) has just seen the last control task while there are still more potential task suites to work on?
Will the worker be kicked out of the "pool"? So only 5 tasks for every worker!
If the worker has seen all of the control tasks in the pool, they will not be able to complete any more of the task suites and will be notified that the tasks are finished.
However, if you create a separate pool with the exact same control tasks, the system will treat them as new control tasks and can show them to the same Tolokers (though this could affect quality). I suggest creating more control tasks from verified answers from previous runs.
The update by query docs say that with wait_for_completion=false a task is created to track progress, and that the task API should be used to clean up the tasks afterwards.
What is the consequence of never cleaning up these old tasks, or doing so very infrequently? Is the cost only the disk space these task files take up?
Yes, it's not a big deal if you don't clean up those tasks immediately. The .tasks index usually has one primary shard, which allows you to spawn up to 2B tasks (= 2^31, i.e. the maximum number of docs per shard) before getting into trouble.
If you use them to keep track of your tasks, it's better to clean them up once they are done, otherwise you might end up with a mess of finished task documents that are not easy to sort out.
That can also be taken care of by a simple cron job that periodically runs
POST .tasks/_delete_by_query?q=*
Which is the better way to write a "daemon" based on Oracle schedules:
One that is run once and then stays in an infinite loop, sleeping for 5 seconds when there is nothing to do (so as not to waste CPU cycles).
One that is started, checks whether there is something to do, ends execution if not, and is run again after 5 seconds by the schedule.
Which one do you prefer, and why? Or is there perhaps another implementation?
I personally prefer an infinite loop to a scheduled task. With an infinite loop you can see a broader cross-activation overview. For example, you can very easily count the number of failures in a row (or similar) and add error recovery.
A scheduled task is effectively stateless unless you manually give it state (file, DB, etc.).
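To illustrate the point about state, here is a rough shell sketch of such a loop (not Oracle-specific). The do_work function, the failure threshold, and the iteration cap are all hypothetical; the cap exists only so the sketch terminates, whereas a real daemon would loop forever.

```shell
#!/bin/sh
# Hypothetical unit of work: fails on the first two calls, then succeeds.
attempt=0
do_work() {
    attempt=$((attempt + 1))
    [ "$attempt" -gt 2 ]
}

failures_in_a_row=0
max_failures=5       # assumed threshold for giving up
max_iterations=4     # cap added only so this sketch terminates
iteration=0

while [ "$iteration" -lt "$max_iterations" ]; do
    iteration=$((iteration + 1))
    if do_work; then
        failures_in_a_row=0    # the counter survives between iterations
    else
        failures_in_a_row=$((failures_in_a_row + 1))
        if [ "$failures_in_a_row" -ge "$max_failures" ]; then
            echo "too many consecutive failures, stopping" >&2
            break
        fi
        sleep 1    # back off before retrying (the question uses 5 seconds)
    fi
done
echo "attempts: $attempt, failures in a row at exit: $failures_in_a_row"
```

A scheduled job would have to write failures_in_a_row to a file or table after every run to get the same behaviour.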
It sounds like you might want to look at using a queue to do the processing rather than a scheduled job. The process can block on the queue waiting for new work.
I would like to run a script when all of the jobs that I have sent to a server are done.
for example, I send
ssh server 'for i in config*; do qsub ./run 1 "$i"; done'
And I get back a list of the jobs that were started. I would like to automatically start another script on the server to process the output from these jobs once all are completed.
I would appreciate any advice that would help me avoid the following inelegant solution:
If I save each of the 1000 job IDs from the above call in a separate file, I could check the contents of each file against the current list of running jobs, i.e. the output from a call to:
ssh server qstat
I would only need to check every half hour, but I would imagine that there is a better way.
It depends a bit on what job scheduler you are using and what version, but there's another approach that can be taken too if your results-processing can also be done on the same queue as the job.
One very handy way of managing lots of related jobs in more recent versions of torque (and with grid engine, and others) is to launch the individual jobs as a job array (cf. http://docs.adaptivecomputing.com/torque/4-1-4/Content/topics/commands/qsub.htm#-t). This requires mapping the individual runs to numbers somehow, which may or may not be convenient; but if you can do it for your jobs, it greatly simplifies managing them: you can qsub them all in one line, and you can qdel or qhold them all at once (while still being able to deal with jobs individually).
If you do this, then you could submit an analysis job which had a dependency on the array of jobs which would only run once all of the jobs in the array were complete: (cf. http://docs.adaptivecomputing.com/torque/4-1-4/Content/topics/commands/qsub.htm#dependencyExamples). Submitting the job would look like:
qsub analyze.sh -W depend=afterokarray:427[]
where analyze.sh had the script to do the analysis, and 427 would be the job id of the array of jobs you launched. (The [] means only run after all are completed). The syntax differs for other schedulers (eg, SGE/OGE) but the ideas are the same.
Getting this right can take some doing, and Tristan's approach certainly has the advantage of being simple and working with any scheduler; but if you'll be doing a lot of this, learning to use job arrays may be worth your time.
Something you might consider is having each job script just touch a filename in a dedicated folder like $i.jobdone, and in your master script, you could simply use ls *.jobdone | wc -l to test for the right number of jobs done.
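A small sketch of that idea, with the jobs simulated by background subshells. The directory, the job names, and the count_done helper are made up for illustration; real jobs would run under the scheduler and touch their marker as the last step of the job script.

```shell
#!/bin/sh
# Dedicated folder for the marker files (a temp dir in this sketch).
donedir=$(mktemp -d)
expected=3

# Each simulated job finishes by touching <name>.jobdone.
for i in config1 config2 config3; do
    ( sleep 1; touch "$donedir/$i.jobdone" ) &
done

# Count marker files; yields 0 while none exist yet.
count_done() {
    ls "$1"/*.jobdone 2>/dev/null | wc -l | tr -d ' '
}

# Poll until every job has left its marker, then carry on.
while [ "$(count_done "$donedir")" -lt "$expected" ]; do
    sleep 1
done

final_count=$(count_done "$donedir")
echo "jobs done: $final_count of $expected"
rm -rf "$donedir"
```

In practice the polling interval would be much longer (the question mentions every half hour), and the post-processing script would be started where the final echo is.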
You can use wait to stop execution until all your jobs are done. You can even collect all the exit statuses and other running statistics (time it took, count of jobs done at the time, whatever) if you cycle around waiting for specific ids.
I'd write a small C program to do the waiting and collecting (if you have permissions to upload and run executables), but you can easily use the bash wait built-in for roughly the same purpose, albeit with less flexibility.
Edit: small example.
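#!/bin/bash
...
waitfor=''
# "tasks" and "task" are placeholders for your job list and job command
for i in tasks; do
    task &                   # start the job in the background
    waitfor="$waitfor $!"    # remember its process ID
done
wait $waitfor                # block until every recorded job has finished
...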
If you run this script in the background, it won't bother you, and whatever comes after the wait line will run when your jobs are over.
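To collect the exit statuses as mentioned above, you can wait on each process ID in turn instead of all at once. A minimal sketch, assuming hypothetical jobs that just sleep and exit with a chosen status:

```shell
#!/bin/sh
pids=''
# Start three stand-in jobs; the third one exits with a non-zero status.
for code in 0 0 3; do
    ( sleep 1; exit "$code" ) &
    pids="$pids $!"
done

failed=0
for pid in $pids; do
    # "wait PID" returns the exit status of that particular job.
    if ! wait "$pid"; then
        failed=$((failed + 1))
    fi
done
echo "jobs failed: $failed"
```

This only works for processes started by the same shell, so it fits local background jobs rather than jobs handed off to a scheduler with qsub.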