I am trying to debug a playbook I've written which uses a couple of roles to spin up and then configure an AWS instance.
The basic structure is one playbook (new-server.yml) that imports two roles -- roles/ec2_instance and roles/start_env. The ec2_instance role should be run on localhost with my AWS tokens, and then the start_env role is run against the servers generated by the first role.
My playbook new-server.yml starts off like this:
- name: provision new instance
  include_role:
    name: ec2_instance
    public: yes
  vars:
    instance_name: "{{ item.host_name }}"
    env: "{{ item.git_branch }}"
    env_type: "{{ item.env_type }}"
  loop:
    - { host_name: 'prod', git_branch: 'master', env_type: 'prod' }
    - { host_name: 'test', git_branch: 'test', env_type: 'devel' }
This role builds an EC2 instance, updates Route 53, and uses add_host to add the new host to the in-memory inventory in the just_created group.
Next, I have this in the new_server.yml playbook. Both of my new IPs show up here just fine; my localhost does not.
- name: debug just_created group
  debug: msg="{{ groups['just_created'] }}"
Finally, again in new_server.yml, I try to do the last-mile configuration and start my application on the new instances:
- name: Configure and start environment on new instance
  include_role:
    name: start_env
    apply:
      become: yes
      delegate_to: "{{ item }}"
  with_items:
    - "{{ groups['just_created'] }}"
However, it doesn't look like the task is delegating properly, because I have this task in roles/start_env/main.yml:
- name: debug hostname
  debug: msg="{{ ansible_hostname }}"
And what I'm seeing in my output is
TASK [start_env : debug hostname] ************************************************************************************************************************************
Monday 11 January 2021 12:00:05 -0800 (0:00:00.111) 0:00:37.374 ********
ok: [localhost -> 10.20.15.225] => {
"msg": "My-Local-MBP"
}
TASK [start_env : debug hostname] ************************************************************************************************************************************
Monday 11 January 2021 12:00:05 -0800 (0:00:00.043) 0:00:37.417 ********
ok: [localhost -> 10.20.31.35] => {
"msg": "My-Local-MBP"
}
I've read a lot about delegate_to, include_role, and loops this morning. It sounds like Ansible makes things pretty complicated when you want to combine these, but it also seems like the way I'm invoking them should be right. Any idea what I'm doing wrong (or whether there is a smarter way to do this)? I found this and, while it's a clever workaround, it doesn't quite fit what I'm seeing, and I'd like to avoid creating another tasks file in my roles; that's not how I want to manage something like this. Most of the information I've been going off of has been this thread: https://github.com/ansible/ansible/issues/35398
I guess this is a known issue... the output shows [localhost -> 10.20.31.35], which indicates it is delegating from localhost to 10.20.31.35; however, this only applies to the connection. Any templating done in the task definition uses the values of the host in the loop, which is localhost.
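A quick way to see this (a hypothetical debug task, not part of the playbook above) is that inventory_hostname stays localhost for every loop item, even though the connection is delegated:

- name: show which host's variables are used for templating
  debug:
    msg: "inventory_hostname={{ inventory_hostname }}, delegated to {{ item }}"
  delegate_to: "{{ item }}"
  with_items: "{{ groups['just_created'] }}"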
I figured out an approach that lets me keep most of what I've already written the same. I modified my add_host task to use the instance_name var as the hostname and the EC2 IP as the ansible_host instance var, and then updated my last task to:
roles/aws.yml:
- name: Add new instance to inventory
  add_host:
    hostname: "{{ instance_name }}"
    ansible_host: "{{ ec2_private_ip }}"
    ansible_user: centos
    ansible_ssh_private_key_file: ../keys/my-key.pem
    groups: just_created
new_servers.yml:
  tasks:
    - name: provision new instance
      include_role:
        name: ec2_instance
        public: yes
      vars:
        instance_name: "{{ item.host_name }}"
        env: "{{ item.git_branch }}"
        env_type: "{{ item.env_type }}"
      loop:
        - { host_name: 'prod', git_branch: 'master', env_type: 'prod' }
        - { host_name: 'test', git_branch: 'test', env_type: 'devel' }

    - name: Configure and start environment on new instance
      include_role:
        name: start_env
        apply:
          become: yes
          delegate_to: "{{ item }}"
      vars:
        instance_name: "{{ item }}"
      with_items:
        - "{{ groups['just_created'] }}"
Not pretty, but it works well enough and lets me avoid duplicate code in the subsequent included roles.
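For what it's worth, another option (a sketch only, not tested against this exact setup) would be to drop the delegation entirely and run start_env in a second play that targets the in-memory just_created group, since add_host already sets ansible_host, ansible_user and the key for those hosts:

- name: provision new instances
  hosts: localhost
  gather_facts: false
  tasks:
    - name: provision new instance
      include_role:
        name: ec2_instance
        public: yes
      vars:
        instance_name: "{{ item.host_name }}"
        env: "{{ item.git_branch }}"
        env_type: "{{ item.env_type }}"
      loop:
        - { host_name: 'prod', git_branch: 'master', env_type: 'prod' }
        - { host_name: 'test', git_branch: 'test', env_type: 'devel' }

- name: configure and start environment on the new instances
  hosts: just_created
  become: yes
  tasks:
    - name: configure and start environment
      include_role:
        name: start_env

With a second play there is nothing to delegate, so all templating (including ansible_hostname) happens in the context of each new instance.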
Related
I am trying to use juniper_junos_facts from the Ansible Junos module to query some VM's that I provisioned using Vagrant. However I am getting the following error.
fatal: [r1]: FAILED! => {"changed": false, "msg": "Unable to make a PyEZ connection: ConnectUnknownHostError(r1)"}
fatal: [r2]: FAILED! => {"changed": false, "msg": "Unable to make a PyEZ connection: ConnectUnknownHostError(r2)"}
I see in the following document Here on juniper.net that this error occurs when you don't have the host defined correctly in the inventory file. I don't believe this is an issue with my inventory file, because when I run ansible-inventory --host everything appears to be in order:
~/vagrant-projects/junos$ ansible-inventory --host r1
{
    "ansible_ssh_host": "127.0.0.1",
    "ansible_ssh_port": 2222,
    "ansible_ssh_private_key_file": ".vagrant/machines/r1/virtualbox/private_key",
    "ansible_ssh_user": "root"
}
~/vagrant-projects/junos$ ansible-inventory --host r2
{
    "ansible_ssh_host": "127.0.0.1",
    "ansible_ssh_port": 2200,
    "ansible_ssh_private_key_file": ".vagrant/machines/r2/virtualbox/private_key",
    "ansible_ssh_user": "root"
}
My playbook is copied from the following document which I got from Here on juniper.net.
My Inventory File
[vsrx]
r1 ansible_ssh_host=127.0.0.1 ansible_ssh_port=2222 ansible_ssh_private_key_file=.vagrant/machines/r1/virtualbox/private_key
r2 ansible_ssh_host=127.0.0.1 ansible_ssh_port=2200 ansible_ssh_private_key_file=.vagrant/machines/r2/virtualbox/private_key
[vsrx:vars]
ansible_ssh_user=root
My Playbook
---
- name: show version
  hosts: vsrx
  roles:
    - Juniper.junos
  connection: local
  gather_facts: no
  tasks:
    - name: retrieve facts
      juniper_junos_facts:
        host: "{{ inventory_hostname }}"
        savedir: "{{ playbook_dir }}"
    - name: print version
      debug:
        var: junos.version
As you're using connection: local, you need to give the module full connection details (usually packaged in a provider dictionary at the play level to reduce repetition):
- name: retrieve facts
  juniper_junos_facts:
    host: "{{ ansible_ssh_host }}"
    port: "{{ ansible_ssh_port }}"
    user: "{{ ansible_ssh_user }}"
    passwd: "{{ ansible_ssh_pass }}"
    ssh_private_key_file: "{{ ansible_ssh_private_key_file }}"
    savedir: "{{ playbook_dir }}"
Full docs are here (watch out for the correct role version in the URL): https://junos-ansible-modules.readthedocs.io/en/2.1.0/juniper_junos_facts.html where you can also see what the defaults are.
To fully explain the "provider" method, your playbook should look something like this:
---
- name: show version
  hosts: vsrx
  roles:
    - Juniper.junos
  connection: local
  gather_facts: no
  vars:
    connection_info:
      host: "{{ ansible_ssh_host }}"
      port: "{{ ansible_ssh_port }}"
      user: "{{ ansible_ssh_user }}"
      passwd: "{{ ansible_ssh_pass }}"
      ssh_private_key_file: "{{ ansible_ssh_private_key_file }}"
  tasks:
    - name: retrieve facts
      juniper_junos_facts:
        provider: "{{ connection_info }}"
        savedir: "{{ playbook_dir }}"
    - name: print version
      debug:
        var: junos.version
This answer is for people who find this question by searching for the error message.
If you use a connection plugin other than local, this error can be (and usually is) caused by this bug related to variable ordering.
The bug is already fixed in release 2.2.1 and later, so try updating the role from Galaxy.
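For example, something along these lines should pull the updated role from Galaxy (assuming the Juniper.junos role name used above):

ansible-galaxy install Juniper.junos --force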
I'm using the community.vmware.vmware_guest_powerstate module for Ansible to start VMs.
The problem is that a single VM can take 2-5 sec, which makes it very inefficient when I want to start 50 VMs...
Is there any way to run this in parallel?
The playbook:
- hosts: localhost
gather_facts: false
collections:
- community.vmware
vars:
certvalidate: "no"
server_url: "vc01.x.com"
username: "{{ lookup('ansible.builtin.env', 'API_USER', default=Undefined) }}"
password: "{{ lookup('ansible.builtin.env', 'API_PASS', default=Undefined) }}"
tasks:
- name: "setting state={{ requested_state }} in vcenter"
community.vmware.vmware_guest_powerstate:
username: "{{ lookup('ansible.builtin.env', 'API_USER', default=Undefined) }}"
password: "{{ lookup('ansible.builtin.env', 'API_PASS', default=Undefined) }}"
hostname: "{{ server_url }}"
datacenter: "DC1"
validate_certs: no
name: "{{ item }}"
state: "powered-on"
loop: "{{ hostlist }}"
This is Ansible's output (every line can take 2-5 sec ...):
TASK [setting state=powered-on in vcenter] ************************************************************************************************************
Monday 19 September 2022 11:17:59 +0000 (0:00:00.029) 0:00:08.157 ******
changed: [localhost] => (item=x1.com)
changed: [localhost] => (item=x2.com)
changed: [localhost] => (item=x3.com)
changed: [localhost] => (item=x4.com)
changed: [localhost] => (item=x5.com)
changed: [localhost] => (item=x6.com)
changed: [localhost] => (item=x7.com)
try this instead...
- hosts: all
  gather_facts: false
  collections:
    - community.vmware
  vars:
    certvalidate: "no"
    server_url: "vc01.x.com"
    username: "{{ lookup('ansible.builtin.env', 'API_USER', default=Undefined) }}"
    password: "{{ lookup('ansible.builtin.env', 'API_PASS', default=Undefined) }}"
  tasks:
    - name: "setting state={{ requested_state }} in vcenter"
      community.vmware.vmware_guest_powerstate:
        username: "{{ username }}"
        password: "{{ password }}"
        hostname: "{{ server_url }}"
        datacenter: "DC1"
        validate_certs: no
        name: "{{ inventory_hostname }}"
        state: "powered-on"
      delegate_to: localhost
Then run it with your hostlist as the inventory and use forks:
ansible-playbook -i x1.com,x2.com,x3.com,... --forks 10 play.yml
... the time it takes for 1 VM can be 2-5 sec, which makes it very inefficient when I want to start 50 VMs ...
Right, this is the usual behavior.
Is there any way to make it in parallel?
As already mentioned in the comments by Vladimir Botka, asynchronous actions and polling are worth a try, since
By default Ansible runs tasks synchronously, holding the connection to the remote node open until the action is completed. This means within a playbook, each task blocks the next task by default, meaning subsequent tasks will not run until the current task completes. This behavior can create challenges.
In your case you see this behavior both in the task and in the loop.
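A minimal sketch of how async/poll could be applied to the power-on task above (assuming the same hostlist and play-level username/password vars; the timeout and retry values are arbitrary):

    - name: power on VMs asynchronously
      community.vmware.vmware_guest_powerstate:
        username: "{{ username }}"
        password: "{{ password }}"
        hostname: "{{ server_url }}"
        datacenter: "DC1"
        validate_certs: no
        name: "{{ item }}"
        state: "powered-on"
      loop: "{{ hostlist }}"
      async: 120
      poll: 0
      register: power_jobs

    - name: wait for all power-on jobs to finish
      async_status:
        jid: "{{ item.ansible_job_id }}"
      loop: "{{ power_jobs.results }}"
      loop_control:
        label: "{{ item.item }}"
      register: job_result
      until: job_result.finished
      retries: 30
      delay: 5

Keep the measurements below in mind, though: for tasks this short, the async overhead can eat up most of the gain.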
Probably the best practice to address the use case and eliminate the cause would be to enhance the module code.
According to the documentation (vmware_guest_powerstate module – Manages power states of virtual machines in vCenter) and the source (ansible-collections/community.vmware/blob/main/plugins/modules/vmware_guest_powerstate.py), the name: parameter takes one name for one VM only. If it were possible to provide a list of VM names "{{ hostlist }}" to the module directly, there would be only one connection attempt and the loop would happen on the Remote Node instead of the Controller Node (... even if both are localhost in this case).
To do so, one would need to start with name=dict(type='list') instead of str and implement all the other logic, error handling and responses.
Further Documentation
Since the community vmware_guest_powerstate module imports and utilizes additional libraries:
pyVmomi library
pyVmomi Community Samples
Further Q&A and Tests
Meanwhile, and based on How do I optimize performance of Ansible playbook with regards to SSH connections?, I've set up another short performance test to simulate the behavior you are observing
---
- hosts: localhost
  become: false
  gather_facts: false
  tasks:
    - name: Gather subdirectories
      shell:
        cmd: "ls -d /home/{{ ansible_user }}/*/"
        warn: false
      register: subdirs

    - name: Gather stats (loop) async
      shell: "stat {{ item }}"
      loop: "{{ subdirs.stdout_lines }}"
      loop_control:
        label: "{{ item }}"
      async: 5
      poll: 0

    - name: Gather stats (loop) serial
      shell: "stat {{ item }}"
      loop: "{{ subdirs.stdout_lines }}"
      loop_control:
        label: "{{ item }}"

    - name: Gather stats (list)
      shell: "stat {% raw %}{{% endraw %}{{ subdirs.stdout_lines | join(',') }}{% raw %}}{% endraw %}"
      register: result

    - name: Show result
      debug:
        var: result.stdout
and found that adding async adds some additional overhead, resulting in an even longer execution time.
Gather subdirectories ------------------------ 0.57s
Gather stats (loop) async -------------------- 3.99s
Gather stats (loop) serial ------------------- 3.79s
Gather stats (list) -------------------------- 0.45s
Show result ---------------------------------- 0.07s
This is because of the "short" runtime of the executed task in comparison to the "long" time spent establishing a connection. As the documentation points out,
For example, a task may take longer to complete than the SSH session allows for, causing a timeout. Or you may want a long-running process to execute in the background while you perform other tasks concurrently. Asynchronous mode lets you control how long-running tasks execute.
one may take advantage of async in the case of long-running processes and tasks.
With respect to the answer given by @Sonclay, I've performed another test with
---
- hosts: all
  become: false
  gather_facts: false
  tasks:
    - name: Gather subdirectories
      shell:
        cmd: "ls -d /home/{{ ansible_user }}/*/"
        warn: false
      register: subdirs
      delegate_to: localhost

    - name: Gather stats (loop) serial
      shell: "stat {{ item }}"
      loop: "{{ subdirs.stdout_lines }}"
      loop_control:
        label: "{{ item }}"
      delegate_to: localhost
whereby a call with
ansible-playbook -i "test1.example.com,test2.example.com,test3.example.com" --forks 3 test.yml
results in an execution time of
Gather subdirectories ------------------------ 0.72s
Gather stats (loop) -------------------------- 0.39s
so it seems to be worth a try.
I have two hosts: one in production, another one in test.
test and prod are defined in a fact file available on the target hosts (nodes):
[node]
type= prod
or
[node]
type= test
I have the following variables defined:
users:
  - username: A
    password: password_A
    update_password: always
    home: /home/A
    state: present
    nodes: ['test', 'prod']
  - username: B
    password: passwd_B
    update_password: always
    home: /home/B
    state: present
    nodes: ['test']
My A user shall be installed on the production and test hosts; the B user only on the test host.
Below is a role task that works fine if I use a single value for the nodes definition.
- name: create users
  ansible.builtin.user:
    name: "{{ item.username }}"
    password: "{{ item.password }}"
    uid: "{{ item.uid }}"
    home: "{{ item.home }}"
    create_home: yes
    group: "{{ item.group }}"
    shell: /bin/bash
    state: present
    expires: -1
  with_items:
    - "{{ users }}"
  when: item.nodes == ansible_local['myfact']['node']['type']
I don't know how to loop over each value of the item.nodes list (item.nodes[0], item.nodes[1], ...) and compare them with the local fact value. I might have other types of hosts, not only prod and test.
I tried subelements without success.
You don't need to iterate anything in your condition; you can assert that an element is in a list with the in test.
So your condition needs to be:
when: ansible_local.myfact.node.type in item.nodes
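Applied to your task, a minimal sketch (trimmed to the relevant parameters) would look like this:

- name: create users
  ansible.builtin.user:
    name: "{{ item.username }}"
    password: "{{ item.password }}"
    home: "{{ item.home }}"
    create_home: yes
    shell: /bin/bash
    state: present
  with_items:
    - "{{ users }}"
  when: ansible_local.myfact.node.type in item.nodes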
Q: "A user shall be installed on production and test hosts.
B user only on test host."
A: A condition is not needed. Use the selectattr filter with the contains test to select only the users whose nodes list contains the value. For example, given this inventory for testing:
shell> cat hosts
prod type=prod
test type=test
The task
shell> cat pb.yml
- hosts: all
  tasks:
    - debug:
        msg: "Create user {{ item.username }}"
      loop: "{{ users|selectattr('nodes', 'contains', type) }}"
      loop_control:
        label: "{{ item.username }}"
iterates over the selected users only:
TASK [debug] ******************************************************
ok: [prod] => (item=A) =>
msg: Create user A
ok: [test] => (item=A) =>
msg: Create user A
ok: [test] => (item=B) =>
msg: Create user B
I am attempting to validate a Cisco configuration with Ansible. I want to be able to tell whether any users have been configured other than the valid ones.
Valid users:
username admin,
username readonly
Invalid users:
username secretbackdoor
I have tried to create a list of users, then flag any which are not valid. The code I have so far is as follows:
---
- hosts: cisco
  gather_facts: no
  tasks:
    - name: show run
      ios_command:
        commands:
          - show run
      register: cisco_show_run

    - name: list_cisco_usernames
      set_fact: cisco_usernames="{{ cisco_show_run.stdout[0] | regex_findall('username (\S+)', multiline=True) }}"

    - name: print usernames
      debug:
        msg: "{{ item }}"
      with_items: "{{ cisco_usernames }}"
This will print out the three users; I'm not sure where to go next.
"Set Theory Filters" might be next option. For example
- hosts: localhost
  vars:
    valid_users: [admin, readonly]
    invalid_users: [secretbackdoor]
    cisco_usernames: [admin, readonly, secretbackdoor]
  tasks:
    - name: Display users not in valid_users
      debug:
        msg: Not among valid users {{ not_valid }}
      when: not_valid|length > 0
      vars:
        not_valid: "{{ cisco_usernames|difference(valid_users) }}"

    - name: Display users in invalid_users
      debug:
        msg: Among invalid users {{ not_valid }}
      when: not_valid|length > 0
      vars:
        not_valid: "{{ cisco_usernames|intersect(invalid_users) }}"
gives (abridged)
ok: [localhost] =>
msg: Not among valid users ['secretbackdoor']
ok: [localhost] =>
msg: Among invalid users ['secretbackdoor']
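If you want the play to fail rather than just report, the same set-theory expression can feed an assert (a sketch, reusing the valid_users list from the example above):

    - name: Fail if unexpected users are configured
      assert:
        that:
          - cisco_usernames | difference(valid_users) | length == 0
        fail_msg: "Unexpected users found: {{ cisco_usernames | difference(valid_users) }}"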
Thanks for this. Your solution is working fine. I went with the first option, as I do not always know what the 'incorrect' users are.