I have created an Ansible playbook which simply checks the connection to target hosts on specific ports.
Here is my playbook:
- name: ACL check to target machines.
  hosts: all
  gather_facts: False
  vars:
    target_hosts:
      - server1 18089
      - server1 8089
      - server1 18000
  tasks:
    - name: execute the command
      command: "nc -vz {{ item }}"
      with_items: "{{ target_hosts }}"
The output I get when I execute the playbook contains both changed (success) and failed results.
In a real scenario I have many target hosts, and the output is very large.
What I want is a final report at the bottom which shows only the list of all failed connections between source and target.
Thanks
I want to have a final report at the bottom which shows only the list of all failed connections between source and target
I believe the knob you are looking for is stdout callback plugins. While I don't see one that does exactly as you wish, there are two of them that seem like they may get you close:
The actionable one claims it will only emit Failed and Changed events:
$ ANSIBLE_STDOUT_CALLBACK=actionable ansible-playbook ...
Then, moving up the complexity ladder, the json one will, as its name implies, emit a record of the steps in JSON, allowing you to filter the output for exactly what you want (.failed is a boolean on each task that indicates just that):
$ ANSIBLE_STDOUT_CALLBACK=json ansible-playbook ...
Then, as the "plugins" part implies, if you were so inclined you could also implement your own that does exactly what you want; there are a lot of examples provided, in addition to the docs.
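For example, piping the JSON output through jq can reduce it to just the failed host/task pairs. This is only a sketch (the playbook name is a placeholder, and the exact JSON layout can vary between Ansible versions), so verify the paths against your own output first:
$ ANSIBLE_STDOUT_CALLBACK=json ansible-playbook site.yml \
    | jq -r '.plays[].tasks[]
             | .task.name as $task
             | .hosts | to_entries[]
             | select(.value.failed == true)
             | "\(.key): \($task)"'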
I have a similar playbook and want the same list of failed connections. What I do is:
Save the command output with register: nc
Add ignore_errors: true
In the debug part, show the ok and failed connections by checking the return code.
My playbook:
---
- name: test-connection
  gather_facts: false
  hosts: "{{ group_hosts }}"
  tasks:
    - name: test connection
      shell: "nc -w 5 -zv {{ inventory_hostname }} {{ listen_port }}"
      register: nc
      ignore_errors: true

    - name: ok connections
      debug: var=nc.stderr
      when: "nc.rc == 0"

    - name: failed connections
      debug: var=nc.stderr
      when: "nc.rc != 0"
An output example for the failed connections:
TASK [failed connections] **********************************************************************************************
ok: [10.1.2.100] => {
    "nc.stderr": "nc: connect to 10.1.2.100 port 22 (tcp) timed out: Operation now in progress"
}
ok: [10.1.2.101] => {
    "nc.stderr": "nc: connect to 10.1.2.101 port 22 (tcp) timed out: Operation now in progress"
}
ok: [10.1.2.102] => {
    "nc.stderr": "nc: connect to 10.1.2.102 port 22 (tcp) timed out: Operation now in progress"
}
The answer by mdaniel from 2018 Aug 18 is unfortunately no longer correct, as the 'actionable' callback has been deprecated and removed:
community.general.actionable has been removed. Use the 'default' callback plugin with 'display_skipped_hosts = no' and 'display_ok_hosts = no' options. This feature was removed from community.general in version 2.0.0. Please update your playbooks.
We are now supposed to use the default callback with some options instead.
I used this in my ansible.cfg:
[defaults]
stdout_callback = default
display_skipped_hosts = no
display_ok_hosts = no
It worked for the live output but still showed all the ok and skipped hosts in the summary at the end.
Also, $ DISPLAY_SKIPPED_HOSTS=no DISPLAY_OK_HOSTS=no ansible-playbook ... didn't seem to work, which is unfortunate because I normally prefer the yaml callback and only occasionally want to override it as above.
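The callback options are also exposed as environment variables, but with an ANSIBLE_ prefix, which may be why the plain DISPLAY_* variables above had no effect; worth verifying on your version:
$ ANSIBLE_STDOUT_CALLBACK=default \
  ANSIBLE_DISPLAY_SKIPPED_HOSTS=no \
  ANSIBLE_DISPLAY_OK_HOSTS=no \
  ansible-playbook ...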
I'm trying to create a playbook which basically consists of 2 hosts in it (don't ask why):
---
- hosts: all
  tasks:
    - name: get the hostname of machine and save it as a variable
      shell: hostname
      register: host_name
      when: ansible_host == "x.x.x.x"   # (will be filled by my application)

- hosts: "{{ host_name.stdout }}"
  tasks:
    - name: use the variable as hostname
      shell: whoami
I don't have any hostname information in my application, so I need to trigger my playbook with an IP address, then get the hostname of that machine and save it to a variable to use in my other tasks, to avoid a "when" condition for each task.
The problem is that I'm able to use the "host_name" variable in all other fields except "hosts"; it gives me an error like this when I try to run:
ERROR! The field 'hosts' has an invalid value, which includes an undefined variable. The error was: 'host_name' is undefined
By default, Ansible itself gathers some information about a host. This happens at the beginning of a playbook's execution right after PLAY in TASK [Gathering Facts].
This automatic gathering of information about a system can be turned off via gather_facts: no; by default it is active.
This collected information is called Ansible Facts. An example of the collected facts is shown in the Ansible docs; for your host you can print out all Ansible Facts:
either in the playbook as a task:
- name: Print all available facts
  debug:
    var: ansible_facts
or via CLI as an adhoc command:
ansible <hostname> -m setup
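To limit the ad-hoc output to a single fact, the setup module's filter parameter can be used, for example:
ansible <hostname> -m setup -a 'filter=ansible_hostname'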
The Ansible Facts contain values like ansible_hostname, ansible_fqdn, ansible_domain or even ansible_all_ipv4_addresses. This is the simplest way to work with the hostname of the client.
If you want to output the hostname and IP addresses that Ansible has collected, you can do it with the following tasks for example:
- name: Print hostname
  debug:
    var: ansible_hostname

- name: Print IP addresses
  debug:
    var: ansible_all_ipv4_addresses
If you start your playbook for all hosts, you can check the IP address and also stop it directly for the "wrong" clients.
---
- hosts: all
  tasks:
    - name: terminate execution for wrong hosts
      assert:
        that: '"x.x.x.x" is in ansible_all_ipv4_addresses'
        fail_msg: Terminating because IP did not match
        success_msg: "Host matched. Hostname: {{ ansible_hostname }}"

    # your task for desired host
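If a hard failure is too noisy, newer Ansible (2.8 and later) can instead quietly drop the non-matching hosts from the play with meta: end_host. A small sketch using the same IP check:
    - name: end the play for hosts whose IP does not match
      meta: end_host
      when: '"x.x.x.x" not in ansible_all_ipv4_addresses'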
One can recover failed hosts using rescue. How can I configure Ansible so that the other hosts in the play are aware of the host which will be recovered?
I thought I was smart and tried using the difference between ansible_play_hosts_all and ansible_play_batch, but Ansible doesn't list the failed host, since it's rescued.
---
- hosts:
    - host1
    - host2
  gather_facts: false
  tasks:
    - block:
        - name: fail one host
          shell: /bin/false
          when: inventory_hostname == 'host1'

        # returns an empty list
        - name: list failed hosts
          debug:
            msg: "{{ ansible_play_hosts_all | difference(ansible_play_batch) }}"
      rescue:
        - shell: /bin/true
"How can I configure Ansible so that the other hosts in the play are aware of the host which will be recovered?"
It seems that, according to the documentation Handling errors with blocks,
If any tasks in the block return failed, the rescue section executes tasks to recover from the error. ... Ansible provides a couple of variables for tasks in the rescue portion of a block: ansible_failed_task, ansible_failed_result
as well as the source of ansible/playbook/block.py, such functionality isn't implemented yet.
You may need to implement some logic to keep track of the return values of ansible_failed_task and on which host it happened during execution. Maybe it is possible to use the add_host module ("Add a host (and alternatively a group) to the ansible-playbook in-memory inventory") with the parameter groups: has_rescued_tasks; see the sketch below.
Or probably do further investigation, beginning with the default callback plugin and Ansible issue #48418 "Add stats on rescued/ignored tasks", since it added statistics about rescued tasks.
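A rough sketch of the add_host idea, slotted into the question's playbook (the group name has_rescued_tasks is just an example; note the caveat in the comment):
      rescue:
        - name: remember that this host had to be rescued
          # NOTE: add_host bypasses the host loop and normally runs only once
          # per batch, so with several failing hosts extra handling may be needed
          add_host:
            name: "{{ inventory_hostname }}"
            groups: has_rescued_tasks

    # afterwards, any host in the play can inspect the in-memory group
    - name: list rescued hosts
      debug:
        msg: "{{ groups['has_rescued_tasks'] | default([]) }}"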
Use case: users can provide a host name, which will trigger a playbook run. In case the hostname has a typo I want to fail the complete playbook run when "no hosts matched". I want to fail it because I need to detect the failure in Bamboo (which I use for CI/CD to run the playbook).
I have done quite extensive research. It seems that it is wanted behavior that the playbook exits with exit code 0 when no host matches. Here is one indication I found. I agree that the general behavior should be like this.
So for my use case I need an extra check. I tried the following:
- name: Deploy product
  hosts: "{{ target_hosts }}"
  gather_facts: no
  any_errors_fatal: true
  pre_tasks:
    - name: Check for a valid target host
      fail:
        msg: "The provided host is not known"
      when: target_hosts not in groups.tomcat_servers
But since there is no host match, the play will not run; that is OK, but it also ends with exit code 0. That way I cannot fail the run in my automation system (Bamboo).
Because of this I am looking for a way to return an exit code != 0 when no host matches.
Add a play which would set a fact if a host matched, then check that fact in a second play:
- name: Check hosts
  hosts: "{{ target_hosts }}"
  gather_facts: no
  tasks:
    - set_fact:
        hosts_confirmed: true
      delegate_to: localhost
      delegate_facts: true

- name: Verify hosts
  hosts: localhost
  gather_facts: no
  tasks:
    - assert:
        that: hosts_confirmed | default(false)

- name: The real play
  hosts: "{{ target_hosts }}"
  # ...
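With a mistyped group or pattern, the "Check hosts" play then matches nothing, the assert in "Verify hosts" fails, and ansible-playbook returns a non-zero exit code that Bamboo can detect. For illustration (the playbook and group names are placeholders):
$ ansible-playbook deploy.yml -e target_hosts=tomcat_serverz
$ echo $?    # non-zero, because the assert failed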
I am using the following Ansible playbook to shut down a list of remote Ubuntu hosts all at once:
- hosts: my_hosts
  become: yes
  remote_user: my_user
  tasks:
    - name: Confirm shutdown
      pause:
        prompt: >-
          Do you really want to shutdown machine(s) "{{ play_hosts }}"? Press
          Enter to continue or Ctrl+C, then A, then Enter to abort ...

    - name: Cancel existing shutdown calls
      command: /sbin/shutdown -c
      ignore_errors: yes

    - name: Shutdown machine
      command: /sbin/shutdown -h now
Two questions on this:
Is there any module available which can handle the shutdown in a more elegant way than having to run two custom commands?
Is there any way to check that the machines are really down? Or is it an anti-pattern to check this from the same playbook?
I tried something with the net_ping module but I am not sure if this is its real purpose:
- name: Check that machine is down
  become: no
  net_ping:
    dest: "{{ ansible_host }}"
    count: 5
    state: absent
This, however, fails with
FAILED! => {"changed": false, "msg": "invalid connection specified, expected connection=local, got ssh"}
In more restricted environments, where ping messages are blocked, you can watch the SSH port until it goes down. In my case I have set the timeout to 60 seconds.
- name: Save target host IP
  set_fact:
    target_host: "{{ ansible_host }}"

- name: wait for ssh to stop
  wait_for: "port=22 host={{ target_host }} delay=10 state=stopped timeout=60"
  delegate_to: 127.0.0.1
There is no shutdown module. You can use a single fire-and-forget call:
- name: Shutdown server
  become: yes
  shell: sleep 2 && /sbin/shutdown -c && /sbin/shutdown -h now
  async: 1
  poll: 0
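As an aside, more recent setups ship a community.general.shutdown module that may handle the first question more elegantly; a minimal sketch, assuming the community.general collection is installed:
- name: Shutdown server
  become: yes
  community.general.shutdown: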
As for net_ping, it is for network appliances such as switches and routers. If you rely on ICMP messages to test the shutdown process, you can use something like this:
- name: Store actual host to be used with local_action
  set_fact:
    original_host: "{{ ansible_host }}"

- name: Wait for ping loss
  local_action: shell ping -q -c 1 -W 1 {{ original_host }}
  register: res
  retries: 5
  until: ('100.0% packet loss' in res.stdout)
  failed_when: ('100.0% packet loss' not in res.stdout)
  changed_when: no
This will wait for 100% packet loss or fail after 5 retries.
Here you want to use local_action because otherwise the commands are executed on the remote host (which is supposed to be down).
And you want to use the trick of storing ansible_host in a temporary fact, because ansible_host is replaced with 127.0.0.1 when the task is delegated to the local host.
I'm provisioning a new server via Terraform and using Ansible as the provisioner on my local system.
Terraform provisions a system on EC2, and then it runs the Ansible playbook providing the IP of the newly built system as the inventory.
I want to use Ansible to wait for the system to finish booting and prevent further tasks from being attempted up until a connection can be established. Up until this point I have been using a manual pause which is inconvenient and imprecise.
Ansible doesn't seem to do what the documentation says it will (unless I'm wrong, a very possible scenario). Here's my code:
- name: waiting for server to be alive
  wait_for:
    state: started
    port: 22
    host: "{{ ansible_ssh_host | default(inventory_hostname) }}"
    delay: 10
    timeout: 300
    connect_timeout: 300
    search_regex: OpenSSH
  delegate_to: localhost
What happens in this step is that the task doesn't wait more than 10 seconds for the connection, and it fails. If the server has already booted and I try the playbook again, it works fine and performs as expected.
I've also tried do_until-style loops, which never seem to work. All examples given in the documentation use shell output, and I don't see how it would work for non-shell modules.
I also can't seem to get any debug information if I try to register a result and print it out using the debug module.
Anyone have any suggestions as to what I'm doing wrong?
When you use delegate_to or the local_action module, {{ ansible_ssh_host }} resolves to localhost, so your task is always running with the following parameter:
host: localhost
It waits 10 seconds, checks the SSH connection to the local host, and proceeds (because most likely it is open).
If you use gather_facts: false (which I believe you do), you can add a set_fact task beforehand to store the target host name value in a variable:
- set_fact:
    host_to_wait_for: "{{ ansible_ssh_host | default(inventory_hostname) }}"
and change the line to:
host: "{{ host_to_wait_for }}"
You can proof-test the variables with the following playbook:
---
- hosts: all
  gather_facts: false
  tasks:
    - set_fact:
        host_to_wait_for: "{{ ansible_ssh_host | default(inventory_hostname) }}"

    - debug: msg="ansible_ssh_host={{ ansible_ssh_host }}, inventory_hostname={{ inventory_hostname }}, host_to_wait_for={{ host_to_wait_for }}"
      delegate_to: localhost
Alternatively, you can try to find a way to provide the IP address of the EC2 instance to Ansible as a variable and use it as the value for the host: parameter. For example, if you run Ansible from the CLI, you can pass ${aws_instance.example.public_ip} via the --extra-vars argument.
As techraf indicates, your inventory lookup is actually grabbing the localhost address because of the delegation, so it's not running against the correct machine.
I think your best solution might be to have terraform pass a variable to the playbook containing the instance's IP address. Example:
terraform passes -e "new_ec2_host=<IP_ADDR>"
Ansible task:
- name: waiting for server to be alive
  wait_for:
    state: started
    port: 22
    host: "{{ new_ec2_host }}"
    delay: 10
    timeout: 300
    connect_timeout: 300
    search_regex: OpenSSH
  delegate_to: localhost
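For completeness, the call from the Terraform side could then look something like this (the playbook name is a placeholder; the trailing comma makes the IP a literal inventory entry):
ansible-playbook -i "<IP_ADDR>," -e "new_ec2_host=<IP_ADDR>" provision.yml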