Parallel execution of localhost tasks in Ansible - performance

I'm using the community.vmware.vmware_guest_powerstate module (from the community.vmware collection) for Ansible to start VMs.
The problem is the time it takes for 1 VM can be 2-5 sec, which makes it very inefficient when I want to start 50 VMs ...
Is there any way to make it run in parallel?
The playbook:
- hosts: localhost
  gather_facts: false
  collections:
    - community.vmware
  vars:
    certvalidate: "no"
    server_url: "vc01.x.com"
    username: "{{ lookup('ansible.builtin.env', 'API_USER', default=Undefined) }}"
    password: "{{ lookup('ansible.builtin.env', 'API_PASS', default=Undefined) }}"
  tasks:
    - name: "setting state={{ requested_state }} in vcenter"
      community.vmware.vmware_guest_powerstate:
        username: "{{ lookup('ansible.builtin.env', 'API_USER', default=Undefined) }}"
        password: "{{ lookup('ansible.builtin.env', 'API_PASS', default=Undefined) }}"
        hostname: "{{ server_url }}"
        datacenter: "DC1"
        validate_certs: no
        name: "{{ item }}"
        state: "powered-on"
      loop: "{{ hostlist }}"
This is Ansible's output: (every line can take 2-5 sec ...)
TASK [setting state=powered-on in vcenter] ************************************************************************************************************
Monday 19 September 2022 11:17:59 +0000 (0:00:00.029) 0:00:08.157 ******
changed: [localhost] => (item=x1.com)
changed: [localhost] => (item=x2.com)
changed: [localhost] => (item=x3.com)
changed: [localhost] => (item=x4.com)
changed: [localhost] => (item=x5.com)
changed: [localhost] => (item=x6.com)
changed: [localhost] => (item=x7.com)

try this instead...
- hosts: all
gather_facts: false
collections:
- community.vmware
vars:
certvalidate: "no"
server_url: "vc01.x.com"
username: "{{ lookup('ansible.builtin.env', 'API_USER', default=Undefined) }}"
password: "{{ lookup('ansible.builtin.env', 'API_PASS', default=Undefined) }}"
tasks:
- name: "setting state={{ requested_state }} in vcenter"
community.vmware.vmware_guest_powerstate:
username: "{{ username }}"
password: "{{ password }}"
hostname: "{{ server_url }}"
datacenter: "DC1"
validate_certs: no
name: "{{ inventory_hostname }}"
state: "powered-on"
delegate_to: localhost
Then run it with your hostlist as the inventory and use forks:
ansible-playbook -i x1.com,x2.com,x3.com,... --forks 10 play.yml

... the time it takes for 1 VM can be 2-5 sec, which makes it very inefficient when I want to start 50 VMs ...
Right, this is the usual behavior.
Is there any way to make it run in parallel?
As already mentioned in the comments by Vladimir Botka, asynchronous actions and polling are worth a try, since
By default Ansible runs tasks synchronously, holding the connection to the remote node open until the action is completed. This means within a playbook, each task blocks the next task by default, meaning subsequent tasks will not run until the current task completes. This behavior can create challenges.
You see this in your case both in the task itself and in the loop.
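A minimal sketch of what that could look like for the power-on loop (untested against vCenter; the async timeout, retries and delay are placeholder values): dispatch every item with poll: 0, then wait for the jobs with async_status.
- name: "setting state=powered-on in vcenter (fire and forget)"
  community.vmware.vmware_guest_powerstate:
    username: "{{ username }}"
    password: "{{ password }}"
    hostname: "{{ server_url }}"
    datacenter: "DC1"
    validate_certs: no
    name: "{{ item }}"
    state: "powered-on"
  loop: "{{ hostlist }}"
  async: 120   # assumed upper bound per VM
  poll: 0
  register: power_on_jobs
- name: wait for all power-on jobs to finish
  ansible.builtin.async_status:
    jid: "{{ item.ansible_job_id }}"
  loop: "{{ power_on_jobs.results }}"
  loop_control:
    label: "{{ item.item }}"
  register: job_result
  until: job_result.finished
  retries: 30
  delay: 5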
Probably the best practice to address the use case and eliminate the cause would be to enhance the module code.
According to the documentation vmware_guest_powerstate module – Manages power states of virtual machines in vCenter and the source ansible-collections/community.vmware/blob/main/plugins/modules/vmware_guest_powerstate.py, the parameter name: takes one name for one VM only. If it were possible to provide a list of VM names "{{ hostlist }}" to the module directly, there would be only one connection attempt and the loop would happen on the Remote Node instead of the Controller Node (even if this is running on localhost in both cases).
To do so, one would need to start with name=dict(type='list') instead of str and implement all the other logic, error handling and responses.
Further Documentation
Since the community vmware_guest_powerstate module imports and utilizes additional libraries:
pyVmomi library
pyVmomi Community Samples
Further Q&A and Tests
How do I optimize performance of Ansible playbook with regards to SSH connections?
Meanwhile, and based on the above, I've set up another short performance test to simulate the behavior you are observing
---
- hosts: localhost
  become: false
  gather_facts: false
  tasks:
    - name: Gather subdirectories
      shell:
        cmd: "ls -d /home/{{ ansible_user }}/*/"
        warn: false
      register: subdirs
    - name: Gather stats (loop) async
      shell: "stat {{ item }}"
      loop: "{{ subdirs.stdout_lines }}"
      loop_control:
        label: "{{ item }}"
      async: 5
      poll: 0
    - name: Gather stats (loop) serial
      shell: "stat {{ item }}"
      loop: "{{ subdirs.stdout_lines }}"
      loop_control:
        label: "{{ item }}"
    - name: Gather stats (list)
      shell: "stat {% raw %}{{% endraw %}{{ subdirs.stdout_lines | join(',') }}{% raw %}}{% endraw %}"
      register: result
    - name: Show result
      debug:
        var: result.stdout
and found that adding async will add some additional overhead, resulting in an even longer execution time.
Gather subdirectories ------------------------ 0.57s
Gather stats (loop) async -------------------- 3.99s
Gather stats (loop) serial ------------------- 3.79s
Gather stats (list) -------------------------- 0.45s
Show result ---------------------------------- 0.07s
This is because of the "short" runtime of the executed task in comparison to the "long" time spent establishing a connection. As the documentation points out,
For example, a task may take longer to complete than the SSH session allows for, causing a timeout. Or you may want a long-running process to execute in the background while you perform other tasks concurrently. Asynchronous mode lets you control how long-running tasks execute.
one may take advantage of async in the case of long-running processes and tasks.
With respect to the answer given by @Sonclay, I've performed another test with
---
- hosts: all
  become: false
  gather_facts: false
  tasks:
    - name: Gather subdirectories
      shell:
        cmd: "ls -d /home/{{ ansible_user }}/*/"
        warn: false
      register: subdirs
      delegate_to: localhost
    - name: Gather stats (loop) serial
      shell: "stat {{ item }}"
      loop: "{{ subdirs.stdout_lines }}"
      loop_control:
        label: "{{ item }}"
      delegate_to: localhost
whereby a call with
ansible-playbook -i "test1.example.com,test2.example.com,test3.example.com" --forks 3 test.yml
will result in an execution time of
Gather subdirectories ------------------------ 0.72s
Gather stats (loop) -------------------------- 0.39s
so it seems to be worth a try.

Related

How to iterate over a list in a condition

I have two hosts: One in production, another one in test.
test and prod are defined in a fact file available on target hosts (nodes).
[node]
type= prod
or
[node]
type= test
I have the following variables defined:
users:
  - username: A
    password: password_A
    update_password: always
    home: /home/A
    state: present
    nodes: ['test', 'prod']
  - username: B
    password: passwd_B
    update_password: always
    home: /home/B
    state: present
    nodes: ['test']
My A user shall be installed on production and test hosts.
B user only on test host.
Here is a role that works fine if I use a single value for the nodes definition.
- name: create users
  ansible.builtin.user:
    name: "{{ item.username }}"
    password: "{{ item.password }}"
    uid: "{{ item.uid }}"
    home: "{{ item.home }}"
    create_home: yes
    group: "{{ item.group }}"
    shell: /bin/bash
    state: present
    expires: -1
  with_items:
    - "{{ users }}"
  when: item.nodes == ansible_local['myfact']['node']['type']
I don't know how to loop over each value of the item.nodes list (item.nodes[0], item.nodes[1], ...) and compare it with the local fact value. I might have other types of host, not only prod and test.
I tried subelements without success.
You don't need to iterate anything in your condition; you can check that an element is in a list with the in test.
So your condition needs to be
when: ansible_local.myfact.node.type in item.nodes
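Applied to the user-creation task from the question, that gives something like this (a sketch that keeps only the fields defined in the users variable above):
- name: create users
  ansible.builtin.user:
    name: "{{ item.username }}"
    password: "{{ item.password }}"
    update_password: "{{ item.update_password }}"
    home: "{{ item.home }}"
    create_home: yes
    shell: /bin/bash
    state: "{{ item.state }}"
    expires: -1
  loop: "{{ users }}"
  when: ansible_local.myfact.node.type in item.nodes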
Q: "A user shall be installed on production and test hosts.
B user only on test host."
A: Condition is not needed. Use filter selectattr and test if a list contains a value. For example, given the inventory for testing
shell> cat hosts
prod type=prod
test type=test
The task
shell> cat pb.yml
- hosts: all
  tasks:
    - debug:
        msg: "Create user {{ item.username }}"
      loop: "{{ users|selectattr('nodes', 'contains', type) }}"
      loop_control:
        label: "{{ item.username }}"
iterates selected users only
TASK [debug] ******************************************************
ok: [prod] => (item=A) =>
msg: Create user A
ok: [test] => (item=A) =>
msg: Create user A
ok: [test] => (item=B) =>
msg: Create user B

Ansible - Prevent playbook executing simultaneously

I have a playbook that controls a clustered application. The issue is this playbook can be called/executed in a few different ways (manually on the cmd line [multiple SREs working], as a scheduled task, or programmatically via a 3rd-party system).
The problem is if the playbook tries to execute simultaneously, it could cause some issues to the application (nature of the application).
Question:
Is there a way to prevent the same playbook from running concurrently on the same Ansible server?
Environment:
ansible [core 2.11.6]
config file = /app/ansible/ansible_linux_playbooks/playbooks/scoutam_client_configs_playbook/ansible.cfg
configured module search path = ['/etc/ansible/library/modules']
ansible python module location = /usr/local/lib/python3.9/site-packages/ansible
ansible collection location = /app/ansible/ansible_linux_playbooks/playbooks/scoutam_client_configs_playbook/collections
executable location = /usr/local/bin/ansible
python version = 3.9.7 (default, Nov 1 2021, 11:34:21) [GCC 8.4.1 20200928 (Red Hat 8.4.1-1)]
jinja version = 3.0.2
libyaml = True
You could test whether a lock file exists at the start of the playbook and stop the play with meta if it does; if not, you create the file to block another launch:
- name: lock_test
  hosts: all
  vars:
    lock_file_path: /tmp/ansible-playbook.lock
  pre_tasks:
    - name: Check if some file exists
      delegate_to: localhost
      stat:
        path: "{{ lock_file_path }}"
      register: lock_file
    - block:
        - name: "end play"
          debug:
            msg: "playbook already launched, ending play"
        - meta: end_play
      when: lock_file.stat.exists
    - name: create lock_file {{ lock_file_path }}
      delegate_to: localhost
      file:
        path: "{{ lock_file_path }}"
        state: touch
  # ****************** tasks start
  tasks:
    - name: debug
      debug:
        msg: "something to do"
  # ****************** tasks end
  post_tasks:
    - name: delete the lock file {{ lock_file_path }}
      delegate_to: localhost
      file:
        path: "{{ lock_file_path }}"
        state: absent
Note that this only protects a single play: if your run consists of several playbooks, ending the first play does not prevent the following ones from being launched unless you repeat the same test in each of them.
There is also a small window between the test and the creation of the file, but the probability of launching the same playbook twice within the same second is very low, and this solution will always be better than what you have now.
Another solution is to lock an existing file and test whether the file is locked or not, but be careful with this option; see the lock and flock Unix commands.
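For reference, the flock approach can also be applied outside Ansible by wrapping the ansible-playbook invocation itself (a sketch; the lock file path and playbook name are arbitrary):
shell> flock --nonblock /tmp/ansible-playbook.lock ansible-playbook play.yml
flock exits immediately with a non-zero status when another run already holds the lock, so two invocations cannot overlap.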
You can create a lockfile on the controller with the PID of the ansible-playbook process.
- delegate_to: localhost
  vars:
    lockfile: /tmp/thisisalockfile
    my_pid: "{{ lookup('pipe', 'cut -d\" \" -f4 /proc/$PPID/stat') }}"
    lock_pid: "{{ lookup('file', lockfile) }}"
  block:
    - name: Lock file
      copy:
        dest: "{{ lockfile }}"
        content: "{{ my_pid }}"
      when: lockfile is not exists
            or ('/proc/' ~ lock_pid) is not exists
            or 'ansible-playbook' not in lookup('file', '/proc/' ~ lock_pid ~ '/cmdline')
    - name: Make sure we won the lock
      assert:
        that: lock_pid == my_pid
        fail_msg: "{{ lockfile }} is locked by process {{ lock_pid }}"
Finding the current PID is the trickiest part; $PPID in the lookup is still the PID of a child, so we're grabbing the grandparent out of /proc/
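To release the lock when the play finishes, one could mirror the first answer and delete the file in post_tasks, re-checking that this run still owns it (a sketch that re-declares the same lockfile path and PID lookup as above):
post_tasks:
  - name: Release lock file if this run owns it
    file:
      path: "{{ lockfile }}"
      state: absent
    delegate_to: localhost
    vars:
      lockfile: /tmp/thisisalockfile
      my_pid: "{{ lookup('pipe', 'cut -d\" \" -f4 /proc/$PPID/stat') }}"
    when: lookup('file', lockfile) == my_pid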
I wanted to post this here, but I do not consider it a final/perfect answer; it does work for general purposes.
I put this 'playbook_lock.yml' at the root of my playbook and include it before any roles.
playbook_lock.yml:
# ./playbook_lock.yml
#
## NOTES:
## - Uses '/tmp/' on Ansible server as lock file directory
## - Format of lock file: E.g. 129416_20211103094638_playbook_common_01.lock
## -- Detailed explanation further down
## - Race-condition:
## -- Assumption playbooks will not run within 10sec of each other
## -- Assumption lockfiles were not deleted within 10sec
## -- If running the playbook manually with manual input of Ansible Vault
## --- Enter creds within 10 sec or the playbook will consider this run legacy
## - Built logic to only use ansible.builtin modules to not add additional requirements
##
#
---
## Build a transaction ID from year/month/day/hour/min/sec
- name: debug_transactionID
  debug:
    msg: "{{ transactionID }}"
  vars:
    filter: "{{ ansible_date_time }}"
    transactionID: "{{ filter.year + filter.month + filter.day + filter.hour + filter.minute + filter.second }}"
  run_once: true
  delegate_to: localhost
  register: reg_transactionID

## Find current playbook PID
## Race-condition => assumption playbooks will not run within 10sec of each other
## If playbook is already running >10secs, this return will be empty
- name: debug_current_playbook_pid
  ansible.builtin.shell:
    ## search PS for any command matching the name of the playbook | remove the 'grep' result | return only the 1st one (if etime < 10sec)
    cmd: "ps -e -o 'pid,etimes,cmd' | grep {{ ansible_play_name }} | grep -v grep | awk 'NR==1{if($2<10) print $1}'"
  changed_when: false
  run_once: true
  delegate_to: localhost
  register: reg_current_playbook_pid

## Check for existing lock files
- name: find_existing_lock_files
  ansible.builtin.find:
    paths: /tmp
    patterns: "*_{{ ansible_play_name }}.lock"
    age: 1s
  run_once: true
  delegate_to: localhost
  register: reg_existing_lock_files

## Check and verify existing lock files
- name: block_discovered_existing_lock_files
  block:
    ## Build fact of all lock files discovered
    - name: fact_existing_lock_files
      ansible.builtin.set_fact:
        fact_existing_lock_files: "{{ fact_existing_lock_files | default([]) + [item.path] }}"
      loop: "{{ reg_existing_lock_files.files }}"
      run_once: true
      delegate_to: localhost
      when:
        - reg_existing_lock_files.matched > 0

    ## Build fact of all discovered lock files
    - name: fact_playbook_lock_file_dict
      ansible.builtin.set_fact:
        fact_playbook_lock_file_dict: "{{ fact_playbook_lock_file_dict | default([]) + [data] }}"
      vars:
        ## E.g. lockfile => 129416_20211103094638_playbook_common_01.lock
        var_pid: "{{ item.split('/')[2].split('_')[0] }}"                    ## extract the 1st portion = PID
        var_transid: "{{ item.split('/')[2].split('_')[1] }}"                ## extract 2nd portion = TransactionID
        var_playbook: "{{ item.split('/')[2].split('_')[2:] | join('_') }}"  ## Extract the remaining and join back together = playbook file
        data:
          {pid: "{{ var_pid }}", transid: "{{ var_transid }}", playbook: "{{ var_playbook }}"}
      loop: "{{ fact_existing_lock_files }}"
      run_once: true
      delegate_to: localhost

    ## Check each discovered lock file
    ## Verify the PID is still operational
    - name: shell_verify_pid_is_active
      ansible.builtin.shell:
        cmd: "ps -p {{ item.pid }} | awk 'NR==2{print $1}'"
      loop: "{{ fact_playbook_lock_file_dict }}"
      changed_when: false
      delegate_to: localhost
      register: reg_verify_pid_is_active

    ## Build fact of discovered previous playbook PIDs
    - name: fact_previous_playbook_pids
      ansible.builtin.set_fact:
        fact_previous_playbook_pids: "{{ fact_previous_playbook_pids | default([]) + [item.stdout | int] }}"
      loop: "{{ reg_verify_pid_is_active.results }}"
      run_once: true
      delegate_to: localhost

    ## Build fact is playbook already operational
    ## Add PIDs together
    ## If SUM =0 => No PIDs found (no previous playbooks running)
    ## If SUM != 0 => previous playbook is still operational
    - name: fact_previous_playbook_operational
      ansible.builtin.set_fact:
        fact_previous_playbook_operational: "{{ ((fact_previous_playbook_pids | sum) | int) != 0 }}"
  when:
    - reg_existing_lock_files.matched > 0
    - reg_current_playbook_pid.stdout is defined

## Continue with playbook, as no previous instances running
- name: block_continue_playbook_operations
  block:
    ## Cleanup legacy lock files, as the PIDs are not operational
    - name: stat_cleanup_legacy_lock_files
      ansible.builtin.file:
        path: "{{ item }}"
        state: absent
      loop: "{{ fact_existing_lock_files }}"
      run_once: true
      delegate_to: localhost
      when: fact_existing_lock_files | length >= 1

    ## Create lock file for current playbook
    - name: stat_create_playbook_lock_file
      ansible.builtin.file:
        path: "/tmp/{{ var_playbook_lock_file }}"
        state: touch
        mode: '0644'
      vars:
        var_playbook_lock_file: "{{ reg_current_playbook_pid.stdout }}_{{ reg_transactionID.msg }}_{{ ansible_play_name }}.lock"
      run_once: true
      delegate_to: localhost
  when:
    - reg_current_playbook_pid.stdout is defined

## Fail & exit playbook, as previous playbook is still operational
- name: block_playbook_already_operational
  block:
    - name: fail
      fail:
        msg: 'Playbook "{{ ansible_play_name }}" is already operational! This playbook will now exit without any modifications!!!'
      run_once: true
      delegate_to: localhost
  when: (fact_previous_playbook_operational is true) or
        (reg_current_playbook_pid.stdout is not defined)
...

ansible to perform a task based on hostname value

Hi, I am trying the following.
task:
  - name: Perform on primary server
    blockinfile:
      path: '/home/conf'
      marker: "#-- {mark} Adding Values --"
      block: |
        {{ conf }}
    when: "'host01' in inventory_hostname"
  - name: Perform on stdby server
    blockinfile:
      path: '/home/conf'
      marker: "#-- {mark} Adding Values --"
      block: |
        {{ conf_stdby }}
    when: "'host02' in inventory_hostname"
As every task is performed on all hosts, I was expecting it to change host01 in the 1st task and skip host02, and vice versa in the second task. However, it changed both hosts in both tasks, and when I checked the servers both had conf_stdby.
Also, there are many more tasks in my playbook which are common to both hosts.
inventory_hostname wouldn't work, as in the inventory file of the playbook there are IPs, not hostnames, so is there a way I can use the actual host's hostname in the when condition?
I even tried this:
vars:
  my_conf:
    host01: "{{ conf }}"
    host02: "{{ conf_stdby }}"
task:
  - name: Perform on primary server
    blockinfile:
      path: '/home/conf'
      marker: "#-- {mark} Adding Values --"
      block: |
        {{ conf }}
    when: "{{ hostvars[inventory_hostname].ansible_hostname }} in myconf"
Still, both hosts add the same block.
You're doing it all right; the code should work. To test whether strings are equal, "==" is usually used instead of the inclusion test "in". But you can reference the configuration data in a dictionary and simplify the code, e.g.
vars:
  my_conf:
    host01: "{{ conf }}"
    host02: "{{ conf_stdby }}"
task:
  - blockinfile:
      path: /home/conf
      marker: "#-- {mark} Adding Values --"
      block: |
        {{ my_conf[inventory_hostname] }}
    when: "inventory_hostname in my_conf"
Q: "In the inventory, there is no hostname but ip."
A: Whatever you have got in the inventory, put it into the dictionary. There are no restrictions on keys in YAML dictionaries, e.g. the inventory
shell> cat hosts
10.1.0.61
10.1.0.62
and the playbook
shell> cat playbook.yml
- hosts: all
  gather_facts: false
  vars:
    my_conf:
      10.1.0.61: conf
      10.1.0.62: conf_sdtdby
  tasks:
    - debug:
        msg: "{{ my_conf[inventory_hostname] }}"
gives
ok: [10.1.0.61] =>
msg: conf
ok: [10.1.0.62] =>
msg: conf_sdtdby
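If you really want to key on the machine's actual hostname rather than on the inventory entry, the ansible_hostname fact can serve as the dictionary key instead, provided facts are gathered (a sketch, assuming the hosts report host01/host02 as their hostnames and that conf and conf_stdby are defined elsewhere):
- hosts: all
  gather_facts: true
  vars:
    my_conf:
      host01: "{{ conf }}"
      host02: "{{ conf_stdby }}"
  tasks:
    - blockinfile:
        path: /home/conf
        marker: "#-- {mark} Adding Values --"
        block: |
          {{ my_conf[ansible_hostname] }}
      when: ansible_hostname in my_conf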

delegate_to group with include_role runs command on local machine?

I am trying to debug a playbook I've written which uses a couple of roles to spin up and then configure an AWS instance.
The basic structure is one playbook (new-server.yml) that imports two roles -- roles/ec2_instance and roles/start_env. The ec2_instance role should be run on localhost with my AWS tokens, and then the start_env role gets run on the servers generated by the first role.
My playbook new-server.yml starts off like this:
- name: provision new instance
  include_role:
    name: ec2_instance
    public: yes
  vars:
    instance_name: "{{ item.host_name }}"
    env: "{{ item.git_branch }}"
    env_type: "{{ item.env_type }}"
  loop:
    - { host_name: 'prod', git_branch: 'master', env_type: 'prod' }
    - { host_name: 'test', git_branch: 'test', env_type: 'devel' }
This role builds an ec2 instance, updates route 53, uses add_host to add the host to the in-memory inventory in the just_created group.
Next, I have this in the new_server.yml playbook. Both of my IPs show up here just fine. My localhost does not show up here.
- name: debug just_created group
  debug: msg="{{ groups['just_created'] }}"
Finally, again in new_server.yml, I try to do the last mile configuration and start my application on the new instance:
- name: Configure and start environment on new instance
  include_role:
    name: start_env
    apply:
      become: yes
      delegate_to: "{{ item }}"
  with_items:
    - "{{ groups['just_created'] }}"
However, it doesn't look like the task is delegating properly, because I have this task in roles/start_env/main.yml:
- name: debug hostname
  debug: msg="{{ ansible_hostname }}"
And what I'm seeing in my output is
TASK [start_env : debug hostname] ************************************************************************************************************************************
Monday 11 January 2021 12:00:05 -0800 (0:00:00.111) 0:00:37.374 ********
ok: [localhost -> 10.20.15.225] => {
"msg": "My-Local-MBP"
}
TASK [start_env : debug hostname] ************************************************************************************************************************************
Monday 11 January 2021 12:00:05 -0800 (0:00:00.043) 0:00:37.417 ********
ok: [localhost -> 10.20.31.35] => {
"msg": "My-Local-MBP"
}
I've read a lot about delegate_to, include_role and loops this morning. It sounds like Ansible has made things pretty complicated when you want to combine these, but it also seems like the way I am trying to invoke them should be right. Any idea what I'm doing wrong (or if there is a smarter way to do this)? I found this, and while it's a clever workaround, it doesn't quite fit what I'm seeing and I'd like to avoid creating another tasks file in my roles; that's not exactly how I want to manage something like this. Most of the information I've been going off of has been this thread: https://github.com/ansible/ansible/issues/35398
I guess this is a known issue... the output shows [localhost -> 10.20.31.35], which indicates it is delegating from localhost to 10.20.31.35; however, this only affects the connection. Any templating done in the task definition still uses the variables of the play host (localhost), not those of the delegation target.
I figured out something in my own way that allows me to mostly keep what I've already written. I modified my add_host task to use the instance_name var as the hostname and the EC2 IP as the ansible_host instance var, and then updated my last task:
roles/aws.yml:
- name: Add new instance to inventory
  add_host:
    hostname: "{{ instance_name }}"
    ansible_host: "{{ ec2_private_ip }}"
    ansible_user: centos
    ansible_ssh_private_key_file: ../keys/my-key.pem
    groups: just_created
new_servers.yml:
tasks:
  - name: provision new instance
    include_role:
      name: ec2_instance
      public: yes
    vars:
      instance_name: "{{ item.host_name }}"
      env: "{{ item.git_branch }}"
      env_type: "{{ item.env_type }}"
    loop:
      - { host_name: 'prod', git_branch: 'master', env_type: 'prod' }
      - { host_name: 'test', git_branch: 'test', env_type: 'devel' }
  - name: Configure and start environment on new instance
    include_role:
      name: start_env
      apply:
        become: yes
        delegate_to: "{{ item }}"
    vars:
      instance_name: "{{ item }}"
    with_items:
      - "{{ groups['just_created'] }}"
Not pretty but it works well enough and lets me avoid duplicate code in the subsequent included roles.
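An alternative that avoids delegating from localhost altogether is to run the second role in a follow-up play that targets the in-memory group directly, so inventory_hostname and the connection both refer to the new instance (a sketch):
- hosts: just_created
  become: yes
  tasks:
    - name: Configure and start environment on new instance
      include_role:
        name: start_env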

Ansible Cisco configuration compliance check for invalid users

I am attempting to validate a Cisco configuration with Ansible. I want to be able to tell whether any users have been configured other than the valid ones.
Valid users:
username admin,
username readonly
Invalid users:
username secretbackdoor
I have tried to create a list of users, then flag any which are not valid. The code I have so far is as follows:
---
- hosts: cisco
  gather_facts: no
  tasks:
    - name: show run
      ios_command:
        commands:
          - show run
      register: cisco_show_run
    - name: list_cisco_usernames
      set_fact: cisco_usernames="{{ cisco_show_run.stdout[0] | regex_findall('username (\S+)', multiline=True) }}"
    - name: print usernames
      debug:
        msg: "{{ item }}"
      with_items: "{{ cisco_usernames }}"
This will print out the three users. Not sure where to go next.
"Set Theory Filters" might be next option. For example
- hosts: localhost
  vars:
    valid_users: [admin, readonly]
    invalid_users: [secretbackdoor]
    cisco_usernames: [admin, readonly, secretbackdoor]
  tasks:
    - name: Display users not in valid_users
      debug:
        msg: Not among valid users {{ not_valid }}
      when: not_valid|length > 0
      vars:
        not_valid: "{{ cisco_usernames|difference(valid_users) }}"
    - name: Display users in invalid_users
      debug:
        msg: Among invalid users {{ not_valid }}
      when: not_valid|length > 0
      vars:
        not_valid: "{{ cisco_usernames|intersect(invalid_users) }}"
gives (abridged)
ok: [localhost] =>
msg: Not among valid users ['secretbackdoor']
ok: [localhost] =>
msg: Among invalid users ['secretbackdoor']
Thanks for this. Your solution is working fine. I put in the first option, as I do not always know what the 'incorrect' users are.
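Building on the first option, the debug task can be turned into a hard compliance check so the play fails whenever unexpected users are present (a sketch using the same set-theory filter and variables as above):
- name: Fail if unexpected users are configured
  ansible.builtin.assert:
    that: not_valid | length == 0
    fail_msg: "Users not in the approved list: {{ not_valid }}"
  vars:
    not_valid: "{{ cisco_usernames | difference(valid_users) }}"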
