async_status loses results of the task it's checking - Ansible

I have these tasks to build AWS EC2 instances in parallel:
- name: Set up testing EC2 instances
  ec2_instance:
    image_id: "{{ ami }}"
    name: "testing {{ item }}"
    tags:
      Responsible Party: neil.watson#genesys.com
      Purpose: Temporary shared VPC CICD testing
    vpc_subnet_id: "{{ item }}"
    wait: yes
  register: ec2_instances
  async: 7200
  poll: 0
  loop:
    - "{{ PrivateSubnet01.value }}"

- name: Wait for instance creation to complete
  async_status: jid={{ item.ansible_job_id }}
  register: ec2_jobs
  until: ec2_jobs.finished
  retries: 300
  loop: "{{ ec2_instances.results }}"

- debug:
    msg: "{{ ec2_instances }}"
The trouble is that the final debug task doesn't show what I expect. I expect to see all the return values of the ec2_instance module, but instead I only see:
ok: [localhost] =>
msg:
changed: true
msg: All items completed
results:
- _ansible_ignore_errors: null
_ansible_item_label: subnet-0f69db3460b3391d1
_ansible_item_result: true
_ansible_no_log: false
_ansible_parsed: true
ansible_job_id: '814747228663.130'
changed: true
failed: false
finished: 0
item: subnet-0f69db3460b3391d1
results_file: /root/.ansible_async/814747228663.130
started: 1
Why?

"Set up testing EC2 instances" task was run asynchronously (poll: 0) and registered ec2_instances before they finish booting (finished: 0). Variable ec2_instances has not been changed afterwards. Would probably ec2_jobs ,registered after the task "Wait for instance creation to complete" had completed, keep the info you expect?

So I was looking for a solution to this exact scenario as well. I couldn't find one, then figured it out on my own.
Apparently ec2_instance will return the instance of an already spun-up EC2 instance if you provide the same name, region, and image_id.
So in my case, where I had to provision three new instances,
I ran three tasks, each spinning up an instance asynchronously.
Then I ran three async_status tasks to make sure all three instances were up. Then I ran
- community.aws.ec2_instance:
    name: "machine_1"
    region: "us-east-1"
    image_id: ami-042e8287309f5df03
  register: machine_1
for each of the three machines, and stored the results in my variables.
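The same re-query step can be collapsed into a single loop. A hedged sketch, using the example names, region, and AMI from above:

# Sketch only: look up the already-running instances by name/region/image_id
# and collect all three results in a single registered variable.
- name: Look up the instances created by the async tasks
  community.aws.ec2_instance:
    name: "{{ item }}"
    region: "us-east-1"
    image_id: ami-042e8287309f5df03
  loop:
    - machine_1
    - machine_2
    - machine_3
  register: machines            # machines.results[N] then holds each instance's data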

Related

Parallel execution of localhost tasks in Ansible

I'm using the community.vmware.vmware_guest_powerstate module to start VMs with Ansible.
The problem is that a single VM can take 2-5 seconds, which makes it very inefficient when I want to start 50 VMs.
Is there any way to run it in parallel?
The playbook:
- hosts: localhost
  gather_facts: false
  collections:
    - community.vmware
  vars:
    certvalidate: "no"
    server_url: "vc01.x.com"
    username: "{{ lookup('ansible.builtin.env', 'API_USER', default=Undefined) }}"
    password: "{{ lookup('ansible.builtin.env', 'API_PASS', default=Undefined) }}"
  tasks:
    - name: "setting state={{ requested_state }} in vcenter"
      community.vmware.vmware_guest_powerstate:
        username: "{{ lookup('ansible.builtin.env', 'API_USER', default=Undefined) }}"
        password: "{{ lookup('ansible.builtin.env', 'API_PASS', default=Undefined) }}"
        hostname: "{{ server_url }}"
        datacenter: "DC1"
        validate_certs: no
        name: "{{ item }}"
        state: "powered-on"
      loop: "{{ hostlist }}"
This is Ansible's output (every line can take 2-5 seconds):
TASK [setting state=powered-on in vcenter] ************************************************************************************************************
Monday 19 September 2022 11:17:59 +0000 (0:00:00.029) 0:00:08.157 ******
changed: [localhost] => (item=x1.com)
changed: [localhost] => (item=x2.com)
changed: [localhost] => (item=x3.com)
changed: [localhost] => (item=x4.com)
changed: [localhost] => (item=x5.com)
changed: [localhost] => (item=x6.com)
changed: [localhost] => (item=x7.com)
try this instead...
- hosts: all
  gather_facts: false
  collections:
    - community.vmware
  vars:
    certvalidate: "no"
    server_url: "vc01.x.com"
    username: "{{ lookup('ansible.builtin.env', 'API_USER', default=Undefined) }}"
    password: "{{ lookup('ansible.builtin.env', 'API_PASS', default=Undefined) }}"
  tasks:
    - name: "setting state={{ requested_state }} in vcenter"
      community.vmware.vmware_guest_powerstate:
        username: "{{ username }}"
        password: "{{ password }}"
        hostname: "{{ server_url }}"
        datacenter: "DC1"
        validate_certs: no
        name: "{{ inventory_hostname }}"
        state: "powered-on"
      delegate_to: localhost
Then run it with your hostlist as the inventory and use forks:
ansible-playbook -i x1.com,x2.com,x3.com,... --forks 10 play.yml
... the time it takes for 1 VM can be 2-5 sec, which makes its very inefficient when I want to start 50 VMs ...
Right, this is the usual behavior.
Is there any way to make it in parallel?
As already mentioned in the comments by Vladimir Botka, asynchronous actions and polling are worth a try, since
By default Ansible runs tasks synchronously, holding the connection to the remote node open until the action is completed. This means within a playbook, each task blocks the next task by default, meaning subsequent tasks will not run until the current task completes. This behavior can create challenges.
You see this in your case both in the task and in the loop.
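As a rough sketch only (reusing the vars from the question; the async and retry values here are arbitrary), that pattern applied to the question's loop would look something like this:

- name: "setting state=powered-on in vcenter (fire and forget)"
  community.vmware.vmware_guest_powerstate:
    hostname: "{{ server_url }}"
    username: "{{ username }}"
    password: "{{ password }}"
    datacenter: "DC1"
    validate_certs: no
    name: "{{ item }}"
    state: "powered-on"
  loop: "{{ hostlist }}"
  async: 300                 # generous upper bound per power-on
  poll: 0                    # do not wait here; just record a job handle per item
  register: power_jobs

- name: Wait until every power-on job has finished
  async_status:
    jid: "{{ item.ansible_job_id }}"
  loop: "{{ power_jobs.results }}"
  register: power_status
  until: power_status.finished
  retries: 60
  delay: 5

Whether this actually saves time depends on how long each power-on takes; as the measurements further down show, the extra overhead of async can outweigh the gain for short-running tasks.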
Probably the best practice to address the use case and eliminate the cause would be to enhance the module code.
According to the documentation, vmware_guest_powerstate module – Manages power states of virtual machines in vCenter, and the source ansible-collections/community.vmware/blob/main/plugins/modules/vmware_guest_powerstate.py, the name: parameter takes exactly one VM name. If it were possible to provide a list of VM names ("{{ hostlist }}") to the module directly, there would be only one connection attempt and the loop would happen on the Remote Node instead of the Controller Node (even if both are localhost in this case).
To do so, one would need to start with name=dict(type='list') instead of str and implement all the other logic, error handling and responses.
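Purely as an illustration of that enhancement (this is not supported by the current module; name accepts only a single string today), the call would then reduce to something like:

# Hypothetical only: relies on the enhanced, list-capable name parameter
# described above, which community.vmware.vmware_guest_powerstate does not offer.
- name: "setting state=powered-on in vcenter (hypothetical list support)"
  community.vmware.vmware_guest_powerstate:
    hostname: "{{ server_url }}"
    username: "{{ username }}"
    password: "{{ password }}"
    datacenter: "DC1"
    validate_certs: no
    name: "{{ hostlist }}"   # hypothetical: the whole list of VM names in one call
    state: "powered-on"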
Further Documentation
Since the community vmware_guest_powerstate module imports and utilizes additional libraries:
pyVmomi library
pyVmomi Community Samples
Meanwhile, and based on further Q&A and tests (How do I optimize performance of Ansible playbook with regards to SSH connections?), I've set up another short performance test to simulate the behavior you are observing:
---
- hosts: localhost
  become: false
  gather_facts: false
  tasks:
    - name: Gather subdirectories
      shell:
        cmd: "ls -d /home/{{ ansible_user }}/*/"
        warn: false
      register: subdirs

    - name: Gather stats (loop) async
      shell: "stat {{ item }}"
      loop: "{{ subdirs.stdout_lines }}"
      loop_control:
        label: "{{ item }}"
      async: 5
      poll: 0

    - name: Gather stats (loop) serial
      shell: "stat {{ item }}"
      loop: "{{ subdirs.stdout_lines }}"
      loop_control:
        label: "{{ item }}"

    - name: Gather stats (list)
      shell: "stat {% raw %}{{% endraw %}{{ subdirs.stdout_lines | join(',') }}{% raw %}}{% endraw %}"
      register: result

    - name: Show result
      debug:
        var: result.stdout
and found that adding async adds some extra overhead, resulting in an even longer execution time.
Gather subdirectories ------------------------ 0.57s
Gather stats (loop) async -------------------- 3.99s
Gather stats (loop) serial ------------------- 3.79s
Gather stats (list) -------------------------- 0.45s
Show result ---------------------------------- 0.07s
This is because the runtime of the executed task is "short" compared with the "long" time spent establishing a connection. As the documentation points out,
For example, a task may take longer to complete than the SSH session allows for, causing a timeout. Or you may want a long-running process to execute in the background while you perform other tasks concurrently. Asynchronous mode lets you control how long-running tasks execute.
one may take advantage of async in the case of long-running processes and tasks.
With respect to the answer given by #Sonclay, I've performed another test with
---
- hosts: all
  become: false
  gather_facts: false
  tasks:
    - name: Gather subdirectories
      shell:
        cmd: "ls -d /home/{{ ansible_user }}/*/"
        warn: false
      register: subdirs
      delegate_to: localhost

    - name: Gather stats (loop) serial
      shell: "stat {{ item }}"
      loop: "{{ subdirs.stdout_lines }}"
      loop_control:
        label: "{{ item }}"
      delegate_to: localhost
whereby a call with
ansible-playbook -i "test1.example.com,test2.example.com,test3.example.com" --forks 3 test.yml
will result in an execution time of
Gather subdirectories ------------------------ 0.72s
Gather stats (loop) -------------------------- 0.39s
so it seems to be worth a try.

How to register a variable for each task dynamically and add those variables to a list so that I can use them later in the Ansible playbook?

The real scenario is:
I have n hosts in an inventory group, and the playbook has to run a specific command for a specific inventory hostname (done with an Ansible when condition). Whenever the condition is met, I need to register a variable for that command's result.
So the variable creation should happen dynamically, the created variables should be appended to a list, and at the end of the same playbook I have to pass that list to a loop and check the jobs' async_status.
So could someone help me here?
tasks:
  - name:
    command:
    when: inventory_hostname == x
    async: 360
    poll: 0
    register: "here dynamic variable"
  - name:
    command:
    when: inventory_hostname == x
    async: 360
    poll: 0
    register: "here dynamic variable"
  - name:
    command:
    when: inventory_hostname == x
    async: 360
    poll: 0
    register: "here dynamic variable"  # this will continue based on the requirements
  - name: collect the job ids
    async_status:
      jid: "{{ item }}"
    with_items: "list which has all the dynamically registered variables"
If you can write this as a loop instead of a series of independent tasks, this becomes much easier. E.g.:
tasks:
  - command: "{{ item }}"
    register: results
    loop:
      - "command1 ..."
      - "command2 ..."

  - name: show command output
    debug:
      msg: "{{ item.stdout }}"
    loop: "{{ results.results }}"
The documentation on "Registering variables with a loop" discusses what the structure of results would look like after this task executes.
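Roughly, and only to illustrate the shape, each entry in the registered results variable's results list combines the command module's return values with the loop item that produced them:

# Abbreviated sketch of the registered "results" variable after the loop:
results:
  changed: true
  results:
    - item: "command1 ..."    # the loop item this result belongs to
      cmd: "command1 ..."
      rc: 0
      stdout: "..."
    - item: "command2 ..."
      cmd: "command2 ..."
      rc: 0
      stdout: "..."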
If you really need to write independent tasks instead, you could use the
vars lookup to find the results from all the tasks like this:
tasks:
  - name: task 1
    command: echo task1
    register: task_result_1

  - name: task 2
    command: echo task2
    register: task_result_2

  - name: task 3
    command: echo task3
    register: task_result_3

  - name: show results
    debug:
      msg: "{{ item }}"
    loop: "{{ q('vars', *q('varnames', '^task_result_')) }}"
    loop_control:
      label: "{{ item.cmd }}"
You've updated the question to show that you're using async tasks, so
that changes things a bit. In this example, we use an until loop
that waits for each job to complete before checking the status of the
next job. The gather results task won't exit until all the async
tasks have completed.
Here's the solution using a loop:
- hosts: localhost
  gather_facts: false
  tasks:
    - name: run tasks
      command: "{{ item }}"
      async: 360
      poll: 0
      register: task_results
      loop:
        - sleep 1
        - sleep 5
        - sleep 10

    - name: gather results
      async_status:
        jid: "{{ item.ansible_job_id }}"
      register: status
      until: status.finished
      loop: "{{ task_results.results }}"

    - debug:
        var: status
And the same thing using individual tasks:
- hosts: localhost
  gather_facts: false
  tasks:
    - name: task 1
      command: sleep 1
      async: 360
      poll: 0
      register: task_result_1

    - name: task 2
      command: sleep 5
      async: 360
      poll: 0
      register: task_result_2

    - name: task 3
      command: sleep 10
      async: 360
      poll: 0
      register: task_result_3

    - name: gather results
      async_status:
        jid: "{{ item.ansible_job_id }}"
      register: status
      until: status.finished
      loop: "{{ q('vars', *q('varnames', '^task_result_')) }}"

    - debug:
        var: status

delegate_to group with include_role runs command on local machine?

I am trying to debug a playbook I've written which uses a couple of roles to spin up and then configure an AWS instance.
The basic structure is one playbook (new-server.yml) that imports two roles: roles/ec2_instance and roles/start_env. The ec2_instance role should be run on localhost with my AWS tokens, and then the start_env role runs on the servers generated by the first role.
My playbook new-server.yml starts off like this:
- name: provision new instance
  include_role:
    name: ec2_instance
    public: yes
  vars:
    instance_name: "{{ item.host_name }}"
    env: "{{ item.git_branch }}"
    env_type: "{{ item.env_type }}"
  loop:
    - { host_name: 'prod', git_branch: 'master', env_type: 'prod' }
    - { host_name: 'test', git_branch: 'test', env_type: 'devel' }
This role builds an EC2 instance, updates Route 53, and uses add_host to add the host to the in-memory inventory in the just_created group.
Next, I have this in the new_server.yml playbook. Both of my IPs show up here just fine; my localhost does not show up here.
- name: debug just_created group
  debug: msg="{{ groups['just_created'] }}"
Finally, again in new_server.yml, I try to do the last mile configuration and start my application on the new instance:
- name: Configure and start environment on new instance
  include_role:
    name: start_env
    apply:
      become: yes
      delegate_to: "{{ item }}"
  with_items:
    - "{{ groups['just_created'] }}"
However, it doesn't look like the task is delegating properly, because I have this task in roles/start_env/main.yml:
- name: debug hostname
  debug: msg="{{ ansible_hostname }}"
And what I'm seeing in my output is
TASK [start_env : debug hostname] ************************************************************************************************************************************
Monday 11 January 2021 12:00:05 -0800 (0:00:00.111) 0:00:37.374 ********
ok: [localhost -> 10.20.15.225] => {
"msg": "My-Local-MBP"
}
TASK [start_env : debug hostname] ************************************************************************************************************************************
Monday 11 January 2021 12:00:05 -0800 (0:00:00.043) 0:00:37.417 ********
ok: [localhost -> 10.20.31.35] => {
"msg": "My-Local-MBP"
}
I've read a lot about delegate_to, include_role and loops this morning. It sounds like Ansible has made things pretty complicated when you want to combine these, but it also seems like the way I am trying to invoke them should be right. Any idea what I'm doing wrong (or whether there is a smarter way to do this)? I found this, and while it's a clever workaround, it doesn't quite fit what I'm seeing, and I'd like to avoid creating another tasks file in my roles; that's not exactly how I want to manage something like this. Most of the information I've been going off of has been this thread: https://github.com/ansible/ansible/issues/35398
I guess this is a known issue... the output shows [localhost -> 10.20.31.35], which indicates it is delegating from localhost to 10.20.31.35; however, this applies only to the connection. Any templating done in the task definition uses the values of the host in the loop, which is localhost.
I figured out something on my own that lets me mostly keep what I've already written. I modified my add_host task to use the instance_name var as the hostname and the EC2 IP as the ansible_host instance var, and then updated my last task to:
roles/aws.yml:
- name: Add new instance to inventory
  add_host:
    hostname: "{{ instance_name }}"
    ansible_host: "{{ ec2_private_ip }}"
    ansible_user: centos
    ansible_ssh_private_key_file: ../keys/my-key.pem
    groups: just_created
new_servers.yml:
tasks:
  - name: provision new instance
    include_role:
      name: ec2_instance
      public: yes
    vars:
      instance_name: "{{ item.host_name }}"
      env: "{{ item.git_branch }}"
      env_type: "{{ item.env_type }}"
    loop:
      - { host_name: 'prod', git_branch: 'master', env_type: 'prod' }
      - { host_name: 'test', git_branch: 'test', env_type: 'devel' }

  - name: Configure and start environment on new instance
    include_role:
      name: start_env
      apply:
        become: yes
        delegate_to: "{{ item }}"
    vars:
      instance_name: "{{ item }}"
    with_items:
      - "{{ groups['just_created'] }}"
Not pretty but it works well enough and lets me avoid duplicate code in the subsequent included roles.

Will variables defined in a role's defaults folder be overridden by vars defined in the playbook that calls that role?

There are some roles and one playbook that calls a role.
One role is defined to start or stop an EC2 instance based on the ec2_status condition (running executes the start task, and stop executes the stop task).
So, I defined the vars in the outer playbook like this:
- hosts: CI-Master
  vars:
    ec2_status: running
  roles:
    - role: roles/env-CI/ec2-control
      tags: ['ec2-start']
And this is in my ec2-control role:
- name: start instances specified by a tag
  ec2:
    instance_tags: '{"{{ tag_key }}":"{{ tag_value }}"}'
    region: "{{ region }}"
    state: running
    wait: true
  when: ec2_status == "running"

- name: stop an instance
  ec2:
    instance_tags: '{"{{ tag_key }}":"{{ tag_value }}"}'
    region: "{{ region }}"
    state: stopped
    wait: true
  when: ec2_status == "stop"
Why are both tasks called when I run the playbook?
I defined ec2_status in the ec2-control role's defaults folder like this:
region: us-west-1
ec2_status: running
required_vars:
  - tag_key
  - tag_value
  - region
  - ec2_status

Getting a meaningful error message from async_status loop when retries exceeded

Using Ansible 1.7, I am executing multiple asynchronous tasks in a loop, and checking on their status with async_status with a 30-second timeout:
- name: Some long-running task
  shell: "./some_task {{ item }}"
  loop: "{{ list_of_params }}"
  async: 30
  poll: 0
  register: my_task

- name: Check async tasks for completion
  async_status:
    jid: "{{ item.ansible_job_id }}"
  register: my_task_results
  until: my_task_results.finished
  retries: 30
  delay: 1
  loop: "{{ my_task.results }}"
This works well, and the async_status task fails if any of the shell commands returns a non-zero return code, or if my_task_results.finished is not true within 30 seconds.
Unfortunately, the error message is not helpful when the "until" condition is not met in time. The returned values include:
changed: false
msg: "All items completed"
results: [ array of results from shell task above ]
changed: false
failed: true
finished: 0
In particular, the "All items completed" message is misleading.
Is there a way to produce a meaningful error message in this case? I can add a failed_when option, with an additional fail task that checks the condition (finished == 0) and displays a custom error message (something to the effect of "Some long-running task did not complete in time"), but this seems inelegant.
I think your solution will be fine in this case.
Please take a look at the official Ansible docs:
https://docs.ansible.com/ansible/2.5/user_guide/playbooks_loops.html#id10
There you can find your method:
- name: Fail if return code is not 0
  fail:
    msg: "The command ({{ item.cmd }}) did not have a 0 return code"
  when: item.rc != 0
  loop: "{{ echo.results }}"
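Applied to the asker's timeout case, the same pattern can check for jobs that never finished. A rough sketch, assuming ignore_errors: yes has been added to the async_status task (so the play reaches this check) and reusing the my_task_results variable registered in the question:

- name: Fail with a descriptive message when a job never finished
  fail:
    msg: "Long-running task (job {{ item.ansible_job_id | default('unknown') }}) did not complete in time"
  when: item.finished is defined and item.finished == 0
  loop: "{{ my_task_results.results }}"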
Sometimes, when I want nicer, more descriptive output, I use a template file; then, with a variable registered in the task, you can customize your output.
For example:
- name:
  template:
    ...
- shell: cat <template_location>.yml
  register: stack_test
- debug:
    msg: "{{ stack_test.stdout }}"
