I'm struggling to understand what's the intended behavior of ansible in case all hosts fail in a single play but there are other plays on other hosts in the playbook.
For example consider the following playbook:
---
- name: P1
  hosts: a,b
  tasks:
    - name: Assert 1
      ansible.builtin.assert:
        that: 1==2
      when: inventory_hostname != "c"

- name: P2
  hosts: y,z
  tasks:
    - name: Debug 2
      ansible.builtin.debug:
        msg: 'YZ'
All four hosts a, b, y, z point to localhost for the sake of clarity.
What happens is that the assert fails and the whole playbook stops. However, this seems to contradict the documentation, which says that in case of an error Ansible stops executing on the failed host but continues on the other hosts; see Error handling.
If I change the condition to when: inventory_hostname != 'b', so that b does not fail, then the playbook continues and executes the second play on hosts y,z.
To me the initial failure does not seem reasonable, because the hosts y,z have not experienced any errors, and therefore execution on them should not be prevented by an error on the other hosts.
Is this a bug, or am I missing something?
It's not a bug. It's by design (see Notes 3 and 4 below). As discussed in the comments to the other answer, the decision whether to terminate the whole playbook when all hosts in a play fail seems to be a trade-off. Either way, a user has to handle one of the two cases: how to proceed to the next play if necessary, or how to stop the whole playbook if necessary. You can see in the examples below that both options require handling errors in a block to approximately the same extent.
The first case was implemented by Ansible: A playbook will terminate when all hosts in a play fail. For example,
- hosts: host01,host02
  tasks:
    - assert:
        that: false

- hosts: host03
  tasks:
    - debug:
        msg: Hello
PLAY [host01,host02] *************************************************************************
TASK [assert] ********************************************************************************
fatal: [host01]: FAILED! => changed=false
assertion: false
evaluated_to: false
msg: Assertion failed
fatal: [host02]: FAILED! => changed=false
assertion: false
evaluated_to: false
msg: Assertion failed
PLAY RECAP ***********************************************************************************
host01 : ok=0 changed=0 unreachable=0 failed=1 skipped=0 rescued=0 ignored=0
host02 : ok=0 changed=0 unreachable=0 failed=1 skipped=0 rescued=0 ignored=0
The playbook will proceed to the next play when not all hosts in a play fail. For example,
- hosts: host01,host02
  tasks:
    - assert:
        that: false
      when: inventory_hostname == 'host01'

- hosts: host03
  tasks:
    - debug:
        msg: Hello
PLAY [host01,host02] *************************************************************************
TASK [assert] ********************************************************************************
fatal: [host01]: FAILED! => changed=false
assertion: false
evaluated_to: false
msg: Assertion failed
skipping: [host02]
PLAY [host03] ********************************************************************************
TASK [debug] *********************************************************************************
ok: [host03] =>
msg: Hello
PLAY RECAP ***********************************************************************************
host01 : ok=0 changed=0 unreachable=0 failed=1 skipped=0 rescued=0 ignored=0
host02 : ok=0 changed=0 unreachable=0 failed=0 skipped=1 rescued=0 ignored=0
host03 : ok=1 changed=0 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
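For the opposite behavior, stopping a play as soon as any host in it fails, Ansible provides the play keyword any_errors_fatal. This is a hedged sketch, not part of the question's playbooks; it reuses the hostnames from the examples above:

```yaml
# any_errors_fatal: true aborts the play for ALL hosts as soon as one fails,
# which in turn ends the playbook if no hosts survive the play.
- hosts: host01,host02
  any_errors_fatal: true
  tasks:
    - assert:
        that: false
      when: inventory_hostname == 'host01'   # host02 is aborted anyway
```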
To proceed to the next play when all hosts in a play fail, a user has to clear the errors and, optionally, end the play for those hosts as well. For example,
- hosts: host01,host02
  tasks:
    - block:
        - assert:
            that: false
      rescue:
        - meta: clear_host_errors
        - meta: end_host

- hosts: host03
  tasks:
    - debug:
        msg: Hello
PLAY [host01,host02] *************************************************************************
TASK [assert] ********************************************************************************
fatal: [host01]: FAILED! => changed=false
assertion: false
evaluated_to: false
msg: Assertion failed
fatal: [host02]: FAILED! => changed=false
assertion: false
evaluated_to: false
msg: Assertion failed
TASK [meta] **********************************************************************************
TASK [meta] **********************************************************************************
TASK [meta] **********************************************************************************
PLAY [host03] ********************************************************************************
TASK [debug] *********************************************************************************
ok: [host03] =>
msg: Hello
PLAY RECAP ***********************************************************************************
host01 : ok=0 changed=0 unreachable=0 failed=0 skipped=0 rescued=1 ignored=0
host02 : ok=0 changed=0 unreachable=0 failed=0 skipped=0 rescued=1 ignored=0
host03 : ok=1 changed=0 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
Update: The playbook can't be stopped by meta: end_play after this was 'fixed' in 2.12.2.
It was possible to end the whole playbook with meta end_play in Ansible 2.12.1. Imagine that failing all hosts in a play wouldn't terminate the whole playbook. In other words, imagine it's not implemented that way. Then, a user might want to terminate the playbook on her own. For example,
- hosts: host01,host02
  tasks:
    - block:
        - assert:
            that: false
      rescue:
        - meta: clear_host_errors
        - set_fact:
            host_failed: true
        - meta: end_play
          when: ansible_play_hosts_all|map('extract', hostvars, 'host_failed') is all
          run_once: true

- hosts: host03
  tasks:
    - debug:
        msg: Hello
- hosts: host03
tasks:
- debug:
msg: Hello
PLAY [host01,host02] *************************************************************************
TASK [assert] ********************************************************************************
fatal: [host01]: FAILED! => changed=false
assertion: false
evaluated_to: false
msg: Assertion failed
fatal: [host02]: FAILED! => changed=false
assertion: false
evaluated_to: false
msg: Assertion failed
TASK [meta] **********************************************************************************
TASK [set_fact] ******************************************************************************
ok: [host01]
ok: [host02]
TASK [meta] **********************************************************************************
PLAY RECAP ***********************************************************************************
host01 : ok=1 changed=0 unreachable=0 failed=0 skipped=0 rescued=1 ignored=0
host02 : ok=1 changed=0 unreachable=0 failed=0 skipped=0 rescued=1 ignored=0
Notes
Note 1: meta end_host means 'end the play for this host'
- hosts: host01
  tasks:
    - meta: end_host

- hosts: host01,host02
  tasks:
    - debug:
        msg: Hello
PLAY [host01] ********************************************************************************
TASK [meta] **********************************************************************************
PLAY [host01,host02] *************************************************************************
TASK [debug] *********************************************************************************
ok: [host01] =>
msg: Hello
ok: [host02] =>
msg: Hello
PLAY RECAP ***********************************************************************************
host01 : ok=1 changed=0 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
host02 : ok=1 changed=0 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
Note 2: meta end_play means 'end the playbook' (this was 'fixed' in 2.12.2; see #76672)
- hosts: host01
  tasks:
    - meta: end_play

- hosts: host01,host02
  tasks:
    - debug:
        msg: Hello
PLAY [host01] ********************************************************************************
TASK [meta] **********************************************************************************
PLAY RECAP ***********************************************************************************
Note 3: Quoting from #37309:
If all hosts in the current play batch (fail) the play ends, this is 'as designed' behavior ... 'play batch' is 'serial size' or all hosts in play if serial is not set.
Note 4: Quoting from the source:
# check the number of failures here, to see if they're above the maximum
# failure percentage allowed, or if any errors are fatal. If either of those
# conditions are met, we break out, otherwise, we only break out if the entire
# batch failed
failed_hosts_count = len(self._tqm._failed_hosts) + len(self._tqm._unreachable_hosts) - \
                     (previously_failed + previously_unreachable)

if len(batch) == failed_hosts_count:
    break_play = True
    break
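The 'maximum failure percentage' mentioned in that comment corresponds to the play keyword max_fail_percentage, which is evaluated per serial batch. A hedged sketch of how the two fit together (the group name and deploy command are made up for illustration):

```yaml
# The play aborts as soon as more than 30% of a 10-host batch fails.
- hosts: webservers          # hypothetical group
  serial: 10                 # the 'play batch' from the quote above
  max_fail_percentage: 30
  tasks:
    - name: Roll out release
      command: /usr/local/bin/deploy.sh   # hypothetical command
```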
A playbook with multiple plays is just sequential; it cannot know up front that you are going to have other hosts in a later play.
Because your assert task in the first play has exhausted all hosts of the play, it makes sense that the playbook stops there, as it won't have anything to do in any further tasks of P1. And remember, it doesn't know anything about P2 yet, so it just ends there.
I have a playbook that only calls roles. This is what it looks like: (there are about 20 roles in it)
---
- hosts: prod1234
  roles:
    - role1
    - role2
    - role3
Sometimes, a role fails, and I don't want to start over as each role is huge and I would just like to start at that point or the next one.
With tasks, I know there's a flag for --start-at-task="task-name". Is there something similar I can do with roles?
My current solution is to comment out all the lines I don't need and run it again.
Thanks ahead!
Quick n dirty solution. The following roleimport.yml playbook
# Note: you will have to implement error management yourself
# (e.g. if you give a role that does not exist).
- name: quickNdirty roles start demo
  hosts: localhost
  gather_facts: false
  vars:
    full_role_list:
      - myfirstrole
      - mysecondrole
      - thirdone
      - next
      - last
    # We calculate a start index for the above list. By default it will be 0.
    # Unless we pass the `start_role` var in extra_vars: then the index will
    # be set to that element.
    start_index: "{{ full_role_list.index(start_role|default('myfirstrole')) }}"
    # We slice the list from start_index to the end
    current_role_list: "{{ full_role_list[start_index|int:] }}"
  tasks:
    # Real task commented out for this example
    # - name: Import selected roles in order
    #   import_role:
    #     name: "{{ item }}"
    #   loop: "{{ current_role_list }}"
    - name: Debug roles that would be used in above commented task
      debug:
        msg: "I would import role {{ item }}"
      loop: "{{ current_role_list }}"
gives:
$ ansible-playbook roleimport.yml
PLAY [quickNdirty roles start demo] *******************************************************************************************************************************************************************************************
TASK [Debug roles that would be used in above commented task] *****************************************************************************************************************************************************************
ok: [localhost] => (item=myfirstrole) => {
"msg": "I would import role myfirstrole"
}
ok: [localhost] => (item=mysecondrole) => {
"msg": "I would import role mysecondrole"
}
ok: [localhost] => (item=thirdone) => {
"msg": "I would import role thirdone"
}
ok: [localhost] => (item=next) => {
"msg": "I would import role next"
}
ok: [localhost] => (item=last) => {
"msg": "I would import role last"
}
PLAY RECAP ********************************************************************************************************************************************************************************************************************
localhost : ok=1 changed=0 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
$ ansible-playbook roleimport.yml -e start_role=thirdone
PLAY [quickNdirty roles start demo] *******************************************************************************************************************************************************************************************
TASK [Debug roles that would be used in above commented task] *****************************************************************************************************************************************************************
ok: [localhost] => (item=thirdone) => {
"msg": "I would import role thirdone"
}
ok: [localhost] => (item=next) => {
"msg": "I would import role next"
}
ok: [localhost] => (item=last) => {
"msg": "I would import role last"
}
PLAY RECAP ********************************************************************************************************************************************************************************************************************
localhost : ok=1 changed=0 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
$ ansible-playbook roleimport.yml -e start_role=last
PLAY [quickNdirty roles start demo] *******************************************************************************************************************************************************************************************
TASK [Debug roles that would be used in above commented task] *****************************************************************************************************************************************************************
ok: [localhost] => (item=last) => {
"msg": "I would import role last"
}
PLAY RECAP ********************************************************************************************************************************************************************************************************************
localhost : ok=1 changed=0 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
# Implement error management yourself if you need it.
$ ansible-playbook roleimport.yml -e start_role=thisIsAnError
PLAY [quickNdirty roles start demo] *******************************************************************************************************************************************************************************************
TASK [Debug roles that would be used in above commented task] *****************************************************************************************************************************************************************
fatal: [localhost]: FAILED! => {"msg": "An unhandled exception occurred while templating '{{ full_role_list[start_index|int:] }}'. Error was a <class 'ansible.errors.AnsibleError'>, original message: An unhandled exception occurred while templating '{{ full_role_list.index(start_role|default('myfirstrole')) }}'. Error was a <class 'ValueError'>, original message: 'thisIsAnError' is not in list"}
PLAY RECAP ********************************************************************************************************************************************************************************************************************
localhost : ok=0 changed=0 unreachable=0 failed=1 skipped=0 rescued=0 ignored=0
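The index-and-slice logic in the vars above is plain sequence handling; a minimal Python sketch of the same idea (the function name is made up) behaves like the playbook, including raising ValueError for an unknown start_role:

```python
def roles_from(full_role_list, start_role=None):
    # Same idea as the playbook's vars: find the index of start_role
    # (defaulting to the first role), then slice the list from there.
    start_index = full_role_list.index(start_role or full_role_list[0])
    return full_role_list[start_index:]

roles = ["myfirstrole", "mysecondrole", "thirdone", "next", "last"]
print(roles_from(roles))              # the full list
print(roles_from(roles, "thirdone"))  # ['thirdone', 'next', 'last']
```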
How to run the playbook on a specific set of hosts with a conditional variable
vars_file.yml
deployment: blue
hosts_file.yml
[east1]
127.0.0.1
127.0.0.2
[west2]
127.0.0.3
127.0.0.4
playbook.yml
---
- hosts: all
  vars_files:
    - 'vars_file.yml'
  tasks:
    - copy: src=config dest=/tmp/

- hosts: {{ east1[0] if deployment == "blue" else west2[0] }}
  vars_files:
    - 'vars_file.yml'
  tasks:
    - shell: "./startup_script restart"
Note: I cant pass variables through the command line and I cant segregate task to a new playbook.
You can access variables defined on another host by targeting the hostvars dictionary key of that host.
In order to do that though, you need to register the variable on the host, with set_fact, importing it won't be enough.
Here is an example, given the inventory:
all:
  children:
    east1:
      hosts:
        east_node_1:
          ansible_host: node1
        east_node_2:
          ansible_host: node4
    west2:
      hosts:
        west_node_1:
          ansible_host: node2
        west_node_2:
          ansible_host: node3
And the playbook:
- hosts: localhost
  gather_facts: no
  vars_files:
    - vars_file.yml
  tasks:
    - set_fact:
        deployment: "{{ deployment }}"

- hosts: >-
    {{ groups['east1'][0]
       if hostvars['localhost'].deployment == 'blue'
       else groups['west2'][0]
    }}
  gather_facts: no
  tasks:
    - debug:
This would yield the recaps:
PLAY [localhost] *******************************************************************************************************************
TASK [set_fact] ********************************************************************************************************************
ok: [localhost]
PLAY [east_node_1] *****************************************************************************************************************
TASK [debug] ***********************************************************************************************************************
ok: [east_node_1] => {
"msg": "Hello world!"
}
PLAY RECAP *************************************************************************************************************************
east_node_1 : ok=2 changed=0 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
localhost : ok=2 changed=0 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
when vars_file.yml contains
deployment: blue
And
PLAY [localhost] *******************************************************************************************************************
TASK [set_fact] ********************************************************************************************************************
ok: [localhost]
PLAY [west_node_1] *****************************************************************************************************************
TASK [debug] ***********************************************************************************************************************
ok: [west_node_1] => {
"msg": "Hello world!"
}
PLAY RECAP *************************************************************************************************************************
west_node_1 : ok=2 changed=0 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
localhost : ok=2 changed=0 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
when vars_file.yml contains
deployment: red
Another equivalent construction, using patterns to target the group, would just see the host target changed to:
- hosts: >-
    {{ 'east1[0]'
       if hostvars['localhost'].deployment == 'blue'
       else 'west2[0]'
    }}
Here is my playbook that builds a dynamic inventory using add_host:
---
- name: "Play 1"
  hosts: localhost
  gather_facts: no
  tasks:
    - name: "Search database"
      command: >
        mysql --user=root --password=p#ssword deployment
        --host=localhost -Ns -e "SELECT dest_ip,username FROM deploy_dets"
      register: command_result

    - name: Add hosts
      add_host:
        name: "{{ item.split('\t')[0] }}"
        ansible_user: "{{ item.split('\t')[1] }}"
        groups: dest_nodes
      with_items: "{{ command_result.stdout_lines }}"

- hosts: dest_nodes
  gather_facts: false
  tasks:
    - debug:
        msg: Run the shell script with the arguments `{{ ansible_user }}` here"
The output is good and as expected when the 'name:' values passed to add_host are different IPs, viz. '10.9.0.100' and '10.8.2.144':
$ ansible-playbook duplicate_hosts.yml
PLAY [Play 1] ***********************************************************************************************************************************************
TASK [Search database] **************************************************************************************************************************************
changed: [localhost]
TASK [Add hosts] ********************************************************************************************************************************************
changed: [localhost] => (item=10.9.0.100 user1)
changed: [localhost] => (item=10.8.2.144 user2)
PLAY [dest_nodes] *******************************************************************************************************************************************
TASK [debug] ************************************************************************************************************************************************
ok: [10.9.0.100] => {
"msg": "Run the shell script with the arguments `user1` here\""
}
ok: [10.8.2.144] => {
"msg": "Run the shell script with the arguments `user2` here\""
}
PLAY RECAP **************************************************************************************************************************************************
10.8.2.144 : ok=1 changed=0 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
10.9.0.100 : ok=1 changed=0 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
localhost : ok=2 changed=2 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
The problem arises when the 'name:' attribute for add_host gets duplicate entries, say 10.8.2.144, despite each having a unique 'ansible_user' value: the play ignores the earlier name/ansible_user entry and runs only once, with the latest entry.
$ ansible-playbook duplicate_hosts.yml
PLAY [Play 1] ***********************************************************************************************************************************************
TASK [Search database] **************************************************************************************************************************************
changed: [localhost]
TASK [Add hosts] ********************************************************************************************************************************************
changed: [localhost] => (item=10.8.2.144 user1)
changed: [localhost] => (item=10.8.2.144 user2)
PLAY [dest_nodes] *******************************************************************************************************************************************
TASK [debug] ************************************************************************************************************************************************
ok: [10.8.2.144] => {
"msg": "Run the shell script with the arguments `user2` here\""
}
PLAY RECAP **************************************************************************************************************************************************
10.8.2.144 : ok=1 changed=0 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
localhost : ok=2 changed=2 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
Interestingly, the output of the Add hosts task shows two entries for name: 10.8.2.144 with the different ansible_users, i.e. 'user1' and 'user2', but when the group runs, only the single, latest name entry is used, as seen in the output above.
I'm on the latest version of Ansible.
Can you please provide a solution where I can run the play for every unique 'ansible_user' on the same host?
In summary: I wish to run multiple tasks on the same host, first with 'user1' and then with 'user2'.
You can add an alias as the inventory hostname. Here I have used the username as the hostname (alias).
Please try this; I have not tested it.
- name: Add hosts
  add_host:
    hostname: "{{ item.split('\t')[1] }}"
    ansible_host: "{{ item.split('\t')[0] }}"
    ansible_user: "{{ item.split('\t')[1] }}"
    groups: dest_nodes
  with_items: "{{ command_result.stdout_lines }}"
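Why the duplicate is dropped, and why the alias helps, can be sketched in plain Python: an inventory is effectively a mapping keyed by the inventory hostname, so a repeated key overwrites the earlier entry, while keying by the (unique) username keeps both rows:

```python
rows = ["10.8.2.144\tuser1", "10.8.2.144\tuser2"]

# Keyed by IP (the original `name:`): the second row overwrites the first.
by_ip = {}
for row in rows:
    ip, user = row.split("\t")
    by_ip[ip] = {"ansible_user": user}

# Keyed by username (the alias), with the IP moved to ansible_host:
by_alias = {}
for row in rows:
    ip, user = row.split("\t")
    by_alias[user] = {"ansible_host": ip, "ansible_user": user}

print(len(by_ip))     # 1 -- only user2 survives
print(len(by_alias))  # 2 -- both users kept
```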
I want my playbook to continue with some other tasks when some hosts are unreachable. However, ignore_errors doesn't seem to work: the debug msg is not printed.
The Ansible version is 2.5.4. Is there a way to do this in this version?
- name: check accessibility
  hosts: myhosts
  tasks:
    - ping:
      ignore_errors: yes

    - fail:
        msg: "Host {{ansible_hostname}} is not accessible"
      when: False
An option would be to ping each 'inventory_hostname' in the block and end the play if the ping fails.
- hosts: myhosts
  gather_facts: no
  tasks:
    - block:
        - delegate_to: localhost
          command: ping -c1 "{{ inventory_hostname }}"
      rescue:
        - fail:
            msg: "{{ inventory_hostname }} not accessible. End of play."

    - debug:
        msg: "Host {{ inventory_hostname }} continue play."

    - setup:
Notes:
Set 'gather_facts: no', because we are not sure all hosts are available.
Use 'inventory_hostname', because of 'gather_facts: no'.
Use the 'setup' module after the 'block' if necessary.
Running the playbook with available hosts: test_01, test_02, test_03 and unavailable host test_99 gives (abridged):
TASK [fail]
fatal: [test_99]: FAILED! => {"changed": false, "msg": "test_99 not accessible. End of play."}
TASK [debug]
ok: [test_03] => {
"msg": "Host test_03 continue play."
}
ok: [test_01] => {
"msg": "Host test_01 continue play."
}
ok: [test_02] => {
"msg": "Host test_02 continue play."
}
TASK [setup]
ok: [test_03]
ok: [test_01]
ok: [test_02]
PLAY RECAP
test_01 : ok=3 changed=1 unreachable=0 failed=0
test_02 : ok=3 changed=1 unreachable=0 failed=0
test_03 : ok=3 changed=1 unreachable=0 failed=0
test_99 : ok=0 changed=0 unreachable=0 failed=2
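For completeness: on Ansible 2.7 and later (not the 2.5.4 from the question, where the keyword does not exist), a connection failure can also be kept non-fatal with the ignore_unreachable task keyword. A hedged sketch along the same lines as the playbook above:

```yaml
- hosts: myhosts
  gather_facts: no
  tasks:
    - ping:
      ignore_unreachable: yes    # requires Ansible >= 2.7
      register: ping_result

    - debug:
        msg: "Host {{ inventory_hostname }} continue play."
      # skip hosts whose ping came back with unreachable: true
      when: not ping_result.unreachable | default(false)
```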