How to ignore specific errors in an Ansible task

I have an Ansible task that can sometimes fail due to an error during the creation of a user account, especially when the account is already in use and the user is logged in. If the task fails with a specific error message, like "user account in use", the play must continue; there is no need to fail then. It should only fail on errors other than these predefined messages. The task looks like this:
- name: modify user
  user:
    state: "{{ user.state | default('present') }}"
    name: "{{ user.name }}"
    home: "{{ user_base_path }}/{{ user.name }}"
    createhome: true
Since it's not a shell command, I cannot simply register a variable and check the output of .rc. I also don't get stderr or stdout when I register a variable and print it in debug mode; that was my first approach for checking the error message. I am running out of ideas on how to filter for a specific error and pass the task, but fail on everything else. ignore_errors: yes is not a good solution, because the task should still fail in some cases.

As per the Ansible docs, we get stdout and stderr back as return fields.
I would suggest using the flag ignore_errors: yes and catching the registered return value, as in this example:
---
- hosts: localhost
  vars:
    user:
      name: yash
    user_base_path: /tmp
  tasks:
    - name: modify user
      user:
        state: "{{ user.state | default('present') }}"
        name: "{{ user.name }}"
        home: "{{ user_base_path }}/{{ user.name }}"
        createhome: true
      register: user_status
      ignore_errors: yes
    - name: stdout_test
      debug:
        msg: "{{ user_status.err }}"
    - name: Fail on not valid
      fail:
        msg: failed
      when: '"user account in use" not in user_status.err'
Output:
PLAY [localhost] *************************************************************************************************************************************************************************************
TASK [Gathering Facts] *******************************************************************************************************************************************************************************
ok: [localhost]
TASK [modify user] ***********************************************************************************************************************************************************************************
fatal: [localhost]: FAILED! => {"changed": false, "err": "<main> attribute status: eDSPermissionError\n<dscl_cmd> DS Error: -14120 (eDSPermissionError)\n", "msg": "Cannot create user \"yash\".", "out": "", "rc": 40}
...ignoring
TASK [stdout_test] ***********************************************************************************************************************************************************************************
ok: [localhost] => {
"msg": "<main> attribute status: eDSPermissionError\n<dscl_cmd> DS Error: -14120 (eDSPermissionError)\n"
}
TASK [Fail on not valid] *****************************************************************************************************************************************************************************
fatal: [localhost]: FAILED! => {"changed": false, "msg": "failed"}
PLAY RECAP *******************************************************************************************************************************************************************************************
localhost : ok=3 changed=0 unreachable=0 failed=1 skipped=0 rescued=0 ignored=1

You can also use when: save the return value with register (and set_fact if needed), then decide what should happen next based on that value.
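If you'd rather not have the task show as failed at all, the same check can be folded into a failed_when expression on the task itself. This is only a sketch, assuming the module reports the message in an err field and a non-zero rc on failure, as in the output above; adjust the field names to whatever your module actually returns.
- name: modify user
  user:
    state: "{{ user.state | default('present') }}"
    name: "{{ user.name }}"
    home: "{{ user_base_path }}/{{ user.name }}"
    createhome: true
  register: user_status
  # Conditions in the list are ANDed: the task only fails when the module
  # returned an error AND the message is not the one we want to tolerate.
  failed_when:
    - (user_status.rc | default(0)) != 0
    - '"user account in use" not in (user_status.err | default(""))'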

Related

Ansible playbook is failing for 'become:yes'

The following ansible-playbook works fine for non-sudo access, but fails when I un-comment become: yes.
---
- hosts: all
  become: yes
  tasks:
    - name: Register the policy file in a variable
      read_csv:
        path: policy.csv
      delegate_to: localhost
      register: csv_file
    - name: Check rules pre-remediation
      command: "{{ item.Compliance_check }}"
      register: output
      with_items: "{{ csv_file.list }}"
    - name: Perform remediation
      command: "{{ item.item.Remediation }}"
      when: item.item.Expected_result != item.stdout
      with_items: "{{ output.results }}"
The inventory file looks like:
10.136.59.110 ansible_ssh_user=username ansible_ssh_pass=password ansible_sudo_pass=password
Error I'm facing is:
user@hostname:~/git-repo/MCI$ ansible-playbook playbook.yaml -i inventory
PLAY [all] *******************************************************************************************************************************
TASK [Gathering Facts] *******************************************************************************************************************
ok: [10.136.59.109]
TASK [Register the policy file in a variable] ********************************************************************************************
Sorry, try again.
fatal: [10.136.59.109 -> localhost]: FAILED! => {"changed": false, "module_stderr": "[sudo via ansible, key=mjcqjbcyeemygkxwycgeftiikivnylsj] password:\nsudo: no password was provided\nsudo: 1 incorrect password attempt\n", "module_stdout": "", "msg": "MODULE FAILURE\nSee stdout/stderr for the exact error", "rc": 1}
PLAY RECAP *******************************************************************************************************************************
10.136.59.109 : ok=1 changed=0 unreachable=0 failed=1 skipped=0 rescued=0 ignored=0
I can't understand why this error is showing. I have tried providing the correct sudo password both in the inventory and at run time using --ask-become-pass.
Please help find out the cause of the error.
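One thing the output hints at (a sketch of a possible cause, not a confirmed diagnosis): the failing task is the one delegated to localhost, so the play-level become: yes makes Ansible try sudo on the control node, where the inventory's ansible_sudo_pass does not apply. If reading the CSV does not need root, disabling escalation for just that task avoids the local sudo prompt:
- name: Register the policy file in a variable
  read_csv:
    path: policy.csv
  delegate_to: localhost
  become: no        # the CSV is read on the control node and needs no root
  register: csv_file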

Create a constant from a lookup in Ansible

I am trying to create a variable from a lookup in Ansible. I then want to use that variable as a constant throughout the rest of the role. Here are a couple of lines from defaults/main.yml.
job_start_time: "{{ lookup('pipe','date +%Y%m%d_%H%M%S') }}"
new_virtual_machine_name: "{{ template_name }}-pkrclone-{{ job_start_time }}"
Running the role causes an error because the lookup happens again when the new_virtual_machine_name variable is used in tasks/main.yml. You can see the seconds increment between subsequent tasks.
TASK [update_vsphere_template : Write vsphere source file] changed: [localhost] => {"changed": true, "checksum": "c2ee7ed6fad0b26aea825233f3e714862de3ac28", "dest": "packer-DevOps-Windows2019-Base-DevOps-Windows2019-Base-pkrclone-20210819_**093122**/vsphere-source-DevOps-Windows2019-Base-DevOps-Windows2019-Base-pkrclone-20210819_**093122**.pkr.hcl", 148765781020136/source", "state": "file", "uid": 1000}
TASK [update_vsphere_template : Write the template update file] *fatal: [localhost]: FAILED! => {"changed": false, "checksum": "6fd97c1d2373b9c71ad6b5246529c46a38f55c48", "msg": "Destination directory packer-DevOps-Windows2019-Base-DevOps-Windows2019-Base-pkrclone-20210819_**093123** does not exist"}
Here is how I am referencing the new_virtual_machine_name variable in tasks/main.yml
- name: Write vsphere source file
  template:
    src: ../templates/vsphere-source.j2
    dest: 'packer-{{template_name}}-{{ new_virtual_machine_name }}/vsphere-source-{{ template_name }}-{{ new_virtual_machine_name }}.pkr.hcl'
- name: Write the template update file
  template:
    src: ../templates/update-template.j2
    dest: 'packer-{{template_name}}-{{ new_virtual_machine_name }}/update-{{ template_name }}-{{ new_virtual_machine_name }}.pkr.hcl'
How can I keep Ansible from running the lookup every time I reference the variable that is created from the lookup at the start of the job? Thank you.
Q: "How can I keep Ansible from running the lookup every time I reference the variable?"
A: Use strftime; you'll get the current date each time you call the filter, e.g.
- set_fact:
    job_start_time: "{{ '%Y%m%d_%H%M%S'|strftime }}"
- debug:
    msg: "pkrclone-{{ job_start_time }}"
- wait_for:
    timeout: 3
- debug:
    msg: "pkrclone-{{ job_start_time }}"
- set_fact:
    job_start_time: "{{ '%Y%m%d_%H%M%S'|strftime }}"
- debug:
    msg: "pkrclone-{{ job_start_time }}"
gives
msg: pkrclone-20210819_174748
msg: pkrclone-20210819_174748
msg: pkrclone-20210819_174753
Yes, variables are evaluated every time you use them. A way to work around this is to store a fact with set_fact, which is evaluated only at the time it is set and then stored as a static value for the host.
Edit: Note that, as pointed out in Vladimir's answer (and discussed in the comments), using the pipe lookup to get this information is not really good practice when the strftime filter is available in core Ansible, so I changed it in the example below.
To illustrate, the following playbook:
---
- hosts: localhost
  gather_facts: false
  vars:
    _job_start_time: "{{ '%Y%m%d_%H%M%S' | strftime }}"
  tasks:
    - name: Force var into a fact that will be constant throughout play
      set_fact:
        job_start_time: "{{ _job_start_time }}"
    - name: Wait a bit making sure time goes on
      pause:
        seconds: 1
    - name: Show that fact is constant whereas inital var changes
      debug:
        msg: "var is: {{ _job_start_time }}, fact is: {{ job_start_time }}"
    - name: Wait a bit making sure time goes on
      pause:
        seconds: 1
    - name: Show that fact is constant whereas inital var changes
      debug:
        msg: "var is: {{ _job_start_time }}, fact is: {{ job_start_time }}"
Gives:
PLAY [localhost] ***********************************************************************************************************************************************************************************************************************
TASK [Force var into a fact that will be constant throughout play] *********************************************************************************************************************************************************************
ok: [localhost]
TASK [Wait a bit making sure time goes on] *********************************************************************************************************************************************************************************************
Pausing for 1 seconds
(ctrl+C then 'C' = continue early, ctrl+C then 'A' = abort)
ok: [localhost]
TASK [Show that fact is constant whereas inital var changes] ***************************************************************************************************************************************************************************
ok: [localhost] => {
"msg": "var is: 20210819_172528, fact is: 20210819_172527"
}
TASK [Wait a bit making sure time goes on] *********************************************************************************************************************************************************************************************
Pausing for 1 seconds
(ctrl+C then 'C' = continue early, ctrl+C then 'A' = abort)
ok: [localhost]
TASK [Show that fact is constant whereas inital var changes] ***************************************************************************************************************************************************************************
ok: [localhost] => {
"msg": "var is: 20210819_172529, fact is: 20210819_172527"
}
PLAY RECAP *****************************************************************************************************************************************************************************************************************************
localhost : ok=5 changed=0 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0

How do I handle rollback in case of failure using handlers in Ansible?

I wrote the YAML file below, which installs the SSM and CloudWatch agents, but I want to roll back the installation in case of any failure during the install. I tried using fail but it is not working. Please advise.
---
# tasks file for SSMAgnetInstall
- name: status check
  command: systemctl status amazon-ssm-agent
  register: s_status
- debug:
    msg: "{{ s_status }}"
- name: Get CPU architecture
  command: getconf LONG_BIT
  register: cpu_arch
  changed_when: False
  check_mode: no
  when: s_status.stdout == ""
  ignore_errors: true
- name: Install rpm file for Redhat Family (Amazon Linux, RHEL, and CentOS) 32/64-bit
  yum:
    name: "https://s3.amazonaws.com/ec2-downloads-windows/SSMAgent/latest/linux_386/amazon-ssm-agent.rpm"
    state: present
  when: s_status.stdout == ""
  become: yes
  ignore_errors: true
- name: cloud status check
  command: systemctl status amazon-cloudwatch-agent
  register: cld_status
  become: yes
- debug:
    msg: "{{ cld_status }}"
- name: Register to cloud watch service
  become: yes
  become_user: root
  service:
    name: amazon-ssm-agent
    enabled: yes
    state: started
- name: copy the output to a local file
  copy:
    content: "{{ myshell_output.stdout }}"
    dest: "/home/ansible/rama/output.txt"
  delegate_to: localhost
You should have a look at the documentation on blocks, more specifically the error handling part. This is the general idea with an oversimplified example; you will have to adapt it to your specific case.
The test.yml playbook
---
- hosts: localhost
  gather_facts: false
  tasks:
    - block:
        - name: I am a task that can fail
          debug:
            msg: "I {{ gen_fail | default(false) | bool | ternary('failed', 'succeeded') }}"
          failed_when: gen_fail | default(false) | bool
        - name: I am a task that will never fail
          debug:
            msg: I succeeded
      rescue:
        - name: I am a task in a block played when a failure happens
          debug:
            msg: rescue task
      always:
        - name: I am a task always played whatever happens
          debug:
            msg: always task
Played normally (no fail)
$ ansible-playbook test.yml
PLAY [localhost] ************************************************************************
TASK [I am a task that can fail] ********************************************************
ok: [localhost] => {
"msg": "I succeeded"
}
TASK [I am a task that will never fail] *************************************************
ok: [localhost] => {
"msg": "I succeeded"
}
TASK [I am a task always played whatever happens] ***************************************
ok: [localhost] => {
"msg": "always task"
}
PLAY RECAP ******************************************************************************
localhost : ok=3 changed=0 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
Played forcing a fail
$ ansible-playbook test.yml -e gen_fail=true
PLAY [localhost] ************************************************************************
TASK [I am a task that can fail] ********************************************************
fatal: [localhost]: FAILED! => {
"msg": "I failed"
}
TASK [I am a task in a block played when a failure happens] *****************************
ok: [localhost] => {
"msg": "rescue task"
}
TASK [I am a task always played whatever happens] ***************************************
ok: [localhost] => {
"msg": "always task"
}
PLAY RECAP ******************************************************************************
localhost : ok=2 changed=0 unreachable=0 failed=0 skipped=0 rescued=1 ignored=0
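Mapped back onto the playbook in the question, a rough sketch of that structure could look like the following; the package URL and service name come from the question, and the rescue step (simply removing the package again) is an assumption you will want to replace with your real rollback logic:
- block:
    - name: Install SSM agent rpm
      yum:
        name: "https://s3.amazonaws.com/ec2-downloads-windows/SSMAgent/latest/linux_386/amazon-ssm-agent.rpm"
        state: present
    - name: Start and enable SSM agent
      service:
        name: amazon-ssm-agent
        enabled: yes
        state: started
  rescue:
    - name: Roll back the installation if anything above failed
      yum:
        name: amazon-ssm-agent
        state: absent
  become: yes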

Using Ansible to delete old usernames across entire company network devices

I created the test YAML file below to run against test switches and nail down the configs; the error is below. I also tried defining provider in the last task, with no luck.
---
- hosts: aus2-mdf-testswitches
  gather_facts: no
  connection: local
  tasks:
    - name: OBTAIN LOGIN CREDENTIALS
      include_vars: secret.yml
    - name: DEFINE PROVIDER
      set_fact:
        provider:
          host: "{{ inventory_hostname }}"
          username: "{{ creds['username'] }}"
          password: "{{ creds['password'] }}"
          auth_pass: "{{ creds['auth_pass'] }}"
    - name: Delete users with aggregate
      ios_user:
        aggregate:
          - name: chase
        state: absent
Here is the error that was presented. Please keep in mind that I am new to Ansible and this problem might be super easy for this group, but I appreciate any help. FYI, I am reading from https://docs.ansible.com/ansible/2.4/ios_user_module.html
[ansible@dc1netansible automation]$ ansible-playbook -i inventories/prod/hosts playbooks/deleteUsername.yml
PLAY [aus2-mdf-testswitches] ********************************************************************************************************************************************
TASK [OBTAIN LOGIN CREDENTIALS] *****************************************************************************************************************************************
ok: [aus2-mdf-testsw1]
ok: [aus2-mdf-testsw2]
TASK [DEFINE PROVIDER] **************************************************************************************************************************************************
ok: [aus2-mdf-testsw1]
ok: [aus2-mdf-testsw2]
TASK [Delete users with aggregate] **************************************************************************************************************************************
fatal: [aus2-mdf-testsw1]: FAILED! => {"changed": false, "msg": "unable to open shell. Please see: https://docs.ansible.com/ansible/network_debug_troubleshooting.html#unable-to-open-shell"}
fatal: [aus2-mdf-testsw2]: FAILED! => {"changed": false, "msg": "unable to open shell. Please see: https://docs.ansible.com/ansible/network_debug_troubleshooting.html#unable-to-open-shell"}
to retry, use: --limit @/home/ansible/automation/playbooks/deleteUsername.retry
PLAY RECAP **************************************************************************************************************************************************************
aus2-mdf-testsw1 : ok=2 changed=0 unreachable=0 failed=1
aus2-mdf-testsw2 : ok=2 changed=0 unreachable=0 failed=1
****updated error with new yml config****
---
- hosts: aus2-mdf-testswitches
  gather_facts: no
  connection: local
  tasks:
    - name: OBTAIN LOGIN CREDENTIALS
      include_vars: secret.yml
    - name: DEFINE PROVIDER
      set_fact:
        provider:
          host: "{{ inventory_hostname }}"
          username: "{{ creds['username'] }}"
          password: "{{ creds['password'] }}"
          auth_pass: "{{ creds['auth_pass'] }}"
    - name: Delete users with aggregate
      ios_user:
        users:
          - name: chase
        authorize: yes
        provider: "{{ provider }}"
        state: absent
      register: result
[ansible@dc1netansible automation]$ ansible-playbook -i inventories/prod/hosts playbooks/deleteUsername.yml
PLAY [aus2-mdf-testswitches] ********************************************************************************************************************************************
TASK [OBTAIN LOGIN CREDENTIALS] *****************************************************************************************************************************************
ok: [aus2-mdf-testsw1]
ok: [aus2-mdf-testsw2]
TASK [DEFINE PROVIDER] **************************************************************************************************************************************************
ok: [aus2-mdf-testsw1]
ok: [aus2-mdf-testsw2]
TASK [Delete users with aggregate] **************************************************************************************************************************************
fatal: [aus2-mdf-testsw1]: FAILED! => {"changed": false, "msg": "unable to open shell. Please see: https://docs.ansible.com/ansible/network_debug_troubleshooting.html#unable-to-open-shell"}
fatal: [aus2-mdf-testsw2]: FAILED! => {"changed": false, "msg": "unable to open shell. Please see: https://docs.ansible.com/ansible/network_debug_troubleshooting.html#unable-to-open-shell"}
to retry, use: --limit @/home/ansible/automation/playbooks/deleteUsername.retry
PLAY RECAP **************************************************************************************************************************************************************
aus2-mdf-testsw1 : ok=2 changed=0 unreachable=0 failed=1
aus2-mdf-testsw2 : ok=2 changed=0 unreachable=0 failed=1
It could be that my IOS version is too old, as I am using the 12.x train on a Cisco switch; Ansible mentions this module is tested on the 15.x train.
****last update****
PLAY [aus2-mdf-testswitches] ********************************************************************************************************************************************
TASK [OBTAIN LOGIN CREDENTIALS] *****************************************************************************************************************************************
ok: [aus2-mdf-testsw1]
ok: [aus2-mdf-testsw2]
TASK [DEFINE PROVIDER] **************************************************************************************************************************************************
ok: [aus2-mdf-testsw1]
ok: [aus2-mdf-testsw2]
TASK [Delete users with aggregate] **************************************************************************************************************************************
fatal: [aus2-mdf-testsw2]: FAILED! => {"changed": false, "msg": "unable to retrieve current config", "stderr": "show running-config | section username\r\n ^\r\n% Invalid input detected at '^' marker.\r\n\r\naus2-mdf-testsw2#", "stderr_lines": ["show running-config | section username", " ^", "% Invalid input detected at '^' marker.", "", "aus2-mdf-testsw2#"]}
fatal: [aus2-mdf-testsw1]: FAILED! => {"changed": false, "msg": "unable to retrieve current config", "stderr": "show running-config | section username\r\n ^\r\n% Invalid input detected at '^' marker.\r\n\r\naus2-mdf-testsw1#", "stderr_lines": ["show running-config | section username", " ^", "% Invalid input detected at '^' marker.", "", "aus2-mdf-testsw1#"]}
to retry, use: --limit @/home/ansible/automation/playbooks/deleteUsername.retry
Configs listed here do not work on the IOS I have on my Cisco switch.

ansible ios_command timeout when doing "show conf" on cisco 3850

I've got a simple ansible playbook that works fine on most ios devices. It fails on some of my 3850 switches with what looks like a timeout when doing a "show conf". How do I specify a longer, non-default timeout for command completion with the ios_command module (and presumably also ios_config)?
Useful details:
Playbook:
---
- hosts: ios_devices
  gather_facts: no
  connection: local
  tasks:
    - name: OBTAIN LOGIN CREDENTIALS
      include_vars: secrets.yaml
    - name: DEFINE PROVIDER
      set_fact:
        provider:
          host: "{{ inventory_hostname }}"
          username: "{{ creds['username'] }}"
          password: "{{ creds['password'] }}"
    - name: LIST NAME SERVERS
      ios_command:
        provider: "{{ provider }}"
        commands: "show run | inc name-server"
      register: dns_servers
    - debug: var=dns_servers.stdout_lines
successful run:
$ ansible-playbook listnameserver.yaml -i inventory/onehost
PLAY [ios_devices] *****************************************************************************************************************
TASK [OBTAIN LOGIN CREDENTIALS] ****************************************************************************************************
ok: [iosdevice1.example.com]
TASK [DEFINE PROVIDER] *************************************************************************************************************
ok: [iosdevice1.example.com]
TASK [LIST NAME SERVERS] ***********************************************************************************************************
ok: [iosdevice1.example.com]
TASK [debug] ***********************************************************************************************************************
ok: [iosdevice1.example.com] => {
"dns_servers.stdout_lines": [
[
"ip name-server 10.1.1.166",
"ip name-server 10.1.1.168"
]
]
}
PLAY RECAP *************************************************************************************************************************
iosdevice1.example.com : ok=4 changed=0 unreachable=0 failed=0
unsuccessful run:
$ ansible-playbook listnameserver.yaml -i inventory/onehost
PLAY [ios_devices] *****************************************************************************************************************
TASK [OBTAIN LOGIN CREDENTIALS] ****************************************************************************************************
ok: [iosdevice2.example.com]
TASK [DEFINE PROVIDER] *************************************************************************************************************
ok: [iosdevice2.example.com]
TASK [LIST NAME SERVERS] ***********************************************************************************************************
fatal: [iosdevice2.example.com]: FAILED! => {"changed": false, "msg": "timeout trying to send command: show run | inc name-server", "rc": 1}
to retry, use: --limit @/home/sample/ansible-playbooks/listnameserver.retry
PLAY RECAP *************************************************************************************************************************
iosdevice2.example.com : ok=2 changed=0 unreachable=0 failed=1
The default timeout is 10 seconds; if the request takes longer than that, ios_command will fail.
You can add the timeout as a key in the provider variable, like this:
- name: DEFINE PROVIDER
  set_fact:
    provider:
      host: "{{ inventory_hostname }}"
      username: "{{ creds['username'] }}"
      password: "{{ creds['password'] }}"
      timeout: 30
If you've already got a timeout value in provider, here's a handy way to update only that key in the variable.
- name: Update existing provider timeout key
  set_fact:
    provider: "{{ provider | combine( {'timeout': '180'} ) }}"
