Having trouble to understand the difference between the async vs forks vs serial in Ansible.
I feel they almost do the same job. Found the below from google,
serial: Decides the number of nodes process in each tasks in a single run.
forks: Maximum number of simultaneous connections Ansible made on each Task.
async: to run multiple tasks in a playbook concurrently
Serial sets a number, a percentage, or a list of numbers of hosts you want to manage at a time.
Async triggers Ansible to run the task in the background which can be checked (or) followed up later, and its value will be the maximum time that Ansible will wait for that particular Job (or) task to complete before it eventually times out or complete.
Ansible works by spinning off forks of itself and talking to many remote systems independently. The forks parameter controls how many hosts are configured by Ansible in parallel. By default, the forks parameter in Ansible is a very conservative 5.This means that only 5 hosts will be configured at the same time, and it's expected that every user will change this parameter to something more suitable for their environment. A good value might be 25 or even 100.
SERIAL : Decides the number of nodes process in each tasks in a single run.
Use: When you need to provide changes as batches/ rolling changes.
FORKS : Maximum number of simultaneous connections Ansible made on each Task.
Use: When you need to manage how many nodes should get affected simultaneously.
Related
I want to maintain a state file to keep track of whether a role ran or not. Ansible's retry file does not make sense as multiple roles that I have in the playbook are calling bunch of different APIs.
In a multi DC setup, a given playbook is iterated. If something fails and playbook exits out, using Ansible's default retry file didnt resumed where it should have resumed from.
I wanted to know if the meta/main.yml can somehow read a dynamic state file that keeps track of DC and role...maybe we can read and determine if the given role can execute or not. We can definetely put a bunch of when conditionals for every tasks in the main.yml of the role. Is there a better way?
... Ansible's default retry file didn't resumed where it should have resumed from.
The retry files indicate the hosts that failed an Ansible execution, those would have a similar effect as the state files you mentioned. As each execution is treated independent and in atomic way, you would need to execute the playbook (or role) to ensure that variables and task state are in memory to ensure a consistent result.
Ansible playbooks are expected to be idempotent, you should be able to execute them several times, and the resulting state should be the same.
Have you identified why the execution failed in those hosts?
For example I have 5 control tasks and 100 tasks and have mixing settings 4 task and 1 control.
What happens when the worker (aka toloker) just seen the last one, while there are still more potential task suites to work on?
The worker be kicked out of the "pool" !So only 5 task for every worker!
If the worker has seen all of the control tasks in the pool, he will not be able to complete any more of the task suits and will be notified that the tasks are finished.
However, if you create a separate pool with the exact same control tasks, the system will consider those as new control tasks and can show them to the same tolokers (But it could affect quality) I suggest create more control tasks from verified answers from prev runs
Is it possible to specify the process priority for an Ansible task?
The use case is setting a low priority for an expensive and long-running backup task. In a bash script I'd use nice for this. I did not find anything by searching using keywords "process priority" and "nice" combined with "Ansible".
async tasks allow you to run tasks in background. This helps in avoiding long-running tasks from blocking remaining tasks. The approach works as long as the remaining tasks are independent of the task marked async, this can reduce wait time.
For example, waiting for huge file to complete download and the next task is c completely independent command which can take some time. Since async task will run in the background by the time it is completed the rest of the independent commands are done.
Link on documentation below
https://docs.ansible.com/ansible/latest/user_guide/playbooks_async.html
I am working with MPI, and I have a certain hierarchy of operations. For a particular value of a parameter _param, I launch 10 trials, each running a specific process on a distinct core. For n values of _param, the code runs in a certain hierarchy as:
driver_file ->
launches one process which checks if available processes are more than 10. If more than 10 are available, then it launches an instance of a process with a specific _param value passed as an argument to coupling_file
coupling_file ->
does some elementary computation, and then launches 10 processes using MPI_Comm_spawn(), each corresponding to a trial_file while passing _trial as an argument
trial_file ->
computes work, returns values to the coupling_file
I am facing two dilemmas, namely:
How do I evaluate the required condition for the cores in driver_file?
As in, how do I find out how many processes have been terminated, so that I can correctly schedule processes on idle cores? I thought maybe adding a blocking MPI_Recv() and use it to pass a variable which would tell me when a certain process has been finished, but I'm not sure if this is the best solution.
How do I ensure that processes are assigned to different cores? I had thought about using something like mpiexec --bind-to-core --bycore -n 1 coupling_file to launch one coupling_file. This will be followed by something like mpiexec --bind-to-core --bycore -n 10 trial_file
launched by the coupling_file. However, if I am binding processes to a core, I don't want the same core to have two/more processes. As in, I don't want _trial_1 of _coupling_1 to run on core x, then I launch another process of coupling_2 which launches _trial_2 which also gets bound to core x.
Any input would be appreciated. Thanks!
If it is an option for you, I'd drop the spawning processes thing altogether, and instead start all processes at once.
You can then easily partition them into chunks working on a single task. A translation of your concept could for example be:
Use one master (rank 0)
Partition the rest into groups of 10 processes, maybe create a new communicator for each group if needed, each group has one leader process, known to the master.
In your code you then can do something like:
if master:
send a specific _param to each group leader (with a non-blocking send)
loop over all your different _params
use MPI_Waitany or MPI_Waitsome to find groups that are ready
else
if groupleader:
loop endlessly
MPI_Recv _params from master
coupling_file
MPI_Bcast to group
process trial_file
else
loop endlessly
MPI_BCast (get data from groupleader)
process trial file
I think, following this approach would allow you to solve both your issues. Availability of process groups gets detected by MPI_Wait*, though you might want to change the logic above, to notify the master at the end of your task so it only sends new data then, not already during the previous trial is still running, and another process group might be faster. And pinning is resolved as you have a fixed number of processes, which can be properly pinned during the usual startup.
It seems that all jobs are enqueued, and only one will run at a time. How can we run more than one?
Tower is designed to parallelize jobs, but there are a couple of cases where it will not.
If you have your inventory or SCM set up to "update on launch" with no cache or the cahche has expired, then any additional jobs will be stuck pending behind the inventory or SCM update. The inventory and SCM will not update until after the currently running job is done.
If you are trying to run multiple jobs against the same host: Tower will not run multiple jobs against the same host at the same time in order to avoid race conditions. (localhost is a possible exception). If you need multiple jobs to run against the same host at the same time then you need to create two inventories and put that host in both inventories, running the two jobs against different inventories. In this situation, Tower does not know that you are running against the same host.
Jobs which share the same Inventory or SCM source can not run at the same time.
Suppose you have a job comprised of three tasks:
task 1: "do x", task 2: "do y", task 3: "do z"
With ansible "do x" will run on all the servers, then "do y" will run on all the servers, then "do z" will run on all the servers.
Also, I said "all serves" but in fact it maxes out at the ansible "forks" value, which defaults to 5. In my 100 server enviroment I set this value to 20. more on this here: http://docs.ansible.com/intro_configuration.html#forks
Remember the strength of ansible is doing a job ( a collection of tasks) on many machines at the same time. If what you want is to run the same task many times on a single machine, then you want something like fork, or parallel.
In fact Ansible will try to run "do x" as many times as it can across many machines. You can adjust this behavior having the whole job run on a portion of machines before it gets started on more machines with the "serial" keyword (http://docs.ansible.com/playbooks_delegation.html#rolling-update-batch-size).
Not the subtle difference between forks, and serial.
forks is "per task"
serial is "per job" ( collection of tasks )
David Thornton
Edit:I re-read your question. This is about running more than one job at a time, not running more than on task in a job. So I think you are correct for ansible-awx but not for the command line. Via the web interface you can submit a job to the job queue, but you can't make ansible-awx run more than one task at a time. I think. However via command line, if you open more than one window you can run multiple ansible-playbooks at the same time. Do you have an ansible support account? Those guys are great IMHO, they have taken a lot of time to answer my questions ( like your question ).
Simultaneous jobs can be executed from Tower. Job templates have "Enable Concurrent Jobs" option. See section "15.4. Job Concurrency" at http://docs.ansible.com/ansible-tower/latest/html/userguide/jobs.html.
If i have 3 different tasks on a single server running its called synchronous mode management, 3 tasks will be assigned to a single job ID , and each tasks executes one after the other were it consumes lots of time.
In Ansible version later than 2.5 we can get 3 job ID for 3 different tasks , and start executing at a same time were we can save a huge time.This type is called asynchronous mode.