Ansible continue playbook after connection lost - ansible

I have a playbook like below,
- name: Executing shell script
  shell: |
    cd "{{ mntout.stdout }}"
    sh config_script -f
  register: installo
  ignore_errors: yes

- name: Formatting output
  shell: echo "{{ installo.stdout }}" | sed -r "s/\x1B\[([0-9]{1,3}(;[0-9]{1,2})?)?[mGK]//g"
  register: trout
  delegate_to: localhost

- name: Show output
  debug:
    msg: "{{ trout.stdout | replace('\r','\n')|replace('\n','\n') | replace('\b','') }}"
  delegate_to: localhost
So, once the config script completes it reboots the target (and I don't want to wait for the target to come back up).
But I want the playbook to continue after the connection is lost and execute the remaining tasks on localhost, as I need to print the output of the script. Any suggestions?
It needs to continue even after the error below:
fatal: [147.234.158.192]: UNREACHABLE! => {"changed": false, "msg": "Failed to connect to the host via ssh: ", "unreachable": true}
TIA

I'm not sure at which point in your playbook the target machine reboots. The best you can do is to let Ansible reboot the target machine with the reboot module, especially since you're expecting the machine to reboot.
- name: Reboot a slow machine that might have lots of updates to apply
  reboot:
    reboot_timeout: 3600
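If you really do not want to wait for the host to come back, a minimal sketch (not part of the answer above, and assuming Ansible 2.7 or later, where the ignore_unreachable keyword exists) is to let the task that triggers the reboot survive the dropped connection and guard the follow-up localhost tasks; note that stdout may not be captured if the connection drops before the script returns:

- name: Executing shell script
  shell: |
    cd "{{ mntout.stdout }}"
    sh config_script -f
  register: installo
  ignore_errors: yes
  ignore_unreachable: yes   # continue this host's play even if SSH drops here

- name: Formatting output
  shell: echo "{{ installo.stdout }}" | sed -r "s/\x1B\[([0-9]{1,3}(;[0-9]{1,2})?)?[mGK]//g"
  register: trout
  delegate_to: localhost
  when: installo.stdout is defined   # skip if no output was captured before the reboot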

Related

Ansible module lineinfile doesn't properly write all the output

I've written a small playbook to run the sudo /usr/sbin/dmidecode -t1 | grep -i vmware | grep -i product command and write the output to a result file, using the following .yml:
# Check if server is vmware
---
- name: Check if server is vmware
  hosts: all
  become: yes
  #ignore_errors: yes
  gather_facts: False
  serial: 50
  #become_flags: -i
  tasks:
    - name: Run uptime command
      #become: yes
      shell: "sudo /usr/sbin/dmidecode -t1 | grep -i vmware | grep -i product"
      register: upcmd

    - debug:
        msg: "{{ upcmd.stdout }}"

    - name: write to file
      lineinfile:
        path: /home/myuser/ansible/mine/vmware.out
        create: yes
        line: "{{ inventory_hostname }};{{ upcmd.stdout }}"
      delegate_to: localhost
      #when: upcmd.stdout != ""
When running the playbook against a list of hosts I get weird results: even though the debug task shows the correct output, when I check the /home/myuser/ansible/mine/vmware.out file only part of the hosts are present. Even weirder, if I run the playbook a second time the whole list is populated correctly, but only after running it twice. I have repeated this several times with minor tweaks but am not getting the expected result. Running with -v or -vv shows nothing unusual.
You are writing to the same file in parallel on localhost. I suspect you're hitting a write concurrency issue. Try the following and see if it fixes your problem:
- name: write to file
  lineinfile:
    path: /home/myuser/ansible/mine/vmware.out
    create: yes
    line: "{{ host }};{{ hostvars[host].upcmd.stdout }}"
  delegate_to: localhost
  run_once: true
  loop: "{{ ansible_play_hosts }}"
  loop_control:
    loop_var: host
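An alternative sketch that avoids the loop entirely is to render every line in one templated copy task; this is just an illustration, assuming the same semicolon-separated output format:

- name: write all results in one go
  copy:
    dest: /home/myuser/ansible/mine/vmware.out
    content: |
      {% for host in ansible_play_hosts %}
      {{ host }};{{ hostvars[host].upcmd.stdout }}
      {% endfor %}
  delegate_to: localhost
  run_once: true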
From your description I understand that you would like to find out how to check whether a server is virtual.
That information is already collected by the setup module.
---
- hosts: linux_host
  become: false
  gather_facts: true

  tasks:
    - name: Show Gathered Facts
      debug:
        msg: "{{ ansible_facts }}"
For a Linux system virtualized under MS Hyper-V, the output could contain
...
bios_version: Hyper-V UEFI Release v1.0
...
system_vendor: Microsoft Corporation
uptime_seconds: 2908494
...
userspace_architecture: x86_64
userspace_bits: '64'
virtualization_role: guest
virtualization_type: VirtualPC
and the uptime in seconds is already included, matching what the uptime command reports:
uptime
... up 33 days ...
For a virtualization check only, one could restrict gather_subset:
gather_subset:
  - '!all'
  - '!min'
  - virtual
resulting in a full output of just:
module_setup: true
virtualization_role: guest
virtualization_type: VirtualPC
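A minimal sketch of a play that gathers only this subset explicitly (the host pattern is a placeholder):

- hosts: linux_host
  become: false
  gather_facts: true
  gather_subset:
    - '!all'
    - '!min'
    - virtual

  tasks:
    - name: Show virtualization facts only
      debug:
        msg: "{{ ansible_virtualization_role }} / {{ ansible_virtualization_type }}"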
By Caching facts
... you have access to variables and information about all hosts even when you are only managing a small number of servers
on your Ansible Control Node. In ansible.cfg you can configure where and how they are stored and for how long.
[defaults]
fact_caching = yaml
fact_caching_connection = /tmp/ansible/facts_cache
fact_caching_timeout = 86400 # seconds
This would be a minimal and simple solution without re-implementing functionality which is already there.
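With caching enabled, a later play can read the cached facts of hosts it does not even connect to; a hedged sketch (the host name is a placeholder and must exist in the inventory, with its facts already gathered and cached):

- hosts: localhost
  gather_facts: false
  tasks:
    - name: Read a cached fact gathered earlier from another host
      debug:
        msg: "{{ hostvars['some_cached_host']['ansible_virtualization_type'] | default('not cached yet') }}"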
Further Documentation and Q&A
Ansible facts
What is the exact list of Ansible setup min?

How to use a variable defined in a previous task for use in a task where the conditional omits that host?

Effectively, I have two servers, and I am trying to use the output of a command from one of them to configure the other, and vice versa. I spent a few hours reading up on this and found out that the hostvars mechanism and dummy hosts are seemingly what I want. No matter how I try to implement this, I still get undefined variables and/or failures from the host(s) not being in the pattern for the task.
Here is the relevant block; only hosts mux-ds1 and mux-ds2 are in the dispatchers group:
---
- name: Play that sets up the sql database during the build process on all mux dispatchers.
  hosts: mux_dispatchers
  remote_user: ansible
  vars:
    ansible_ssh_pipelining: yes

  tasks:
    - name: Check and save database master bin log file and position on mux-ds2.
      shell: sudo /usr/bin/mysql mysql -e "show master status \G" | grep -E 'File:|Position:' | cut -d{{':'}} -f2 | awk '{print $1}'
      become: yes
      become_method: sudo
      register: syncds2
      when: ( inventory_hostname == 'mux-ds2' )

    - name: Print current ds2 database master bin log file.
      debug:
        var: "syncds2.stdout_lines[0]"

    - name: Print current ds2 database master bin position.
      debug:
        var: "syncds2.stdout_lines[1]"

    - name: Add mux-ds2 some variables to a dummy host allowing us to use these variables on mux-ds1.
      add_host:
        name: "ds2_bin"
        bin_20: "{{ syncds2.stdout_lines }}"

    - debug:
        var: "{{ hostvars['ds2_bin']['bin_21'] }}"

    - name: Compare master bin variable output for ds1's database and if different, configure for it.
      shell: sudo /usr/bin/mysql mysql -e "stop slave; change master to master_log_file='"{{ hostvars['ds2_bin']['bin_21'][0] }}"', master_log_pos="{{ hostvars['ds2_bin']['bin_21'][1] }}"; start slave"
      become: yes
      become_method: sudo
      register: syncds1
      when: ( inventory_hostname == 'mux-ds1' )
Basically everything works properly up to the point where I try to see the value of the variable from the dummy host with the debug module, but it tells me the variable is undefined even though it was defined in the original registered variable. This is supposed to be the mechanism to get around such problems:
TASK [Print current ds2 database master bin log file.] **************************************************
ok: [mux-ds1] => {
"syncds2.stdout_lines[0]": "VARIABLE IS NOT DEFINED!"
}
ok: [mux-ds2] => {
"syncds2.stdout_lines[0]": "mysql-bin.000001"
}
TASK [Print current ds2 database master bin position.] **************************************************
ok: [mux-ds1] => {
"syncds2.stdout_lines[1]": "VARIABLE IS NOT DEFINED!"
}
ok: [mux-ds2] => {
"syncds2.stdout_lines[1]": "107"
}
The above works as I intend, and has the variables populated and referenced properly for mux-ds2.
TASK [Add mux-ds2 some variables to a dummy host allowing us to use these variables on mux-ds1.] ********
fatal: [mux-ds1]: FAILED! => {"msg": "The task includes an option with an undefined variable. The error was: 'dict object' has no attribute 'stdout_lines'\n\nThe error appears to be in '/home/ansible/ssn-project/playbooks/i_mux-sql-config.yml': line 143, column 8, but may\nbe elsewhere in the file depending on the exact syntax problem.\n\nThe offending line appears to be:\n\n\n - name: Add mux-ds2 some variables to a dummy host allowing us to use these variables on mux-ds1.\n ^ here\n"}
This is where the issue is: the variable seems to be magically undefined again, which is odd given this process is designed to end-run that very issue. I can't even make it to the second set of debug tasks.
Note that this is ultimately for the purpose of syncing up two master/master replication mysql databases. I'm also doing this with the shell module because the mysql version that must be used can be no higher than 5.8, and the ansible module requires 5.9, which is a shame. The same process will be done for mux-ds2 in reverse as well, assuming this can be made to work.
Either I'm making a mistake in this implementation which keeps it from functioning, or I'm using the wrong implementation for what I want. I've spent too much time now trying to figure this out alone and would appreciate any solution for this which would work. Thanks in advance!
It seems like you are going a complicated route when a simple delegation of tasks and the special variable hostvars, which lets you fetch facts from a different node, should give you what you expect. The root cause of your error is that the first task is skipped on mux-ds1 because of its when condition, so on that host syncds2 is registered as a skipped result with no stdout_lines, and the add_host task (which runs on every host) then fails there.
Here is an example, focusing just on the important part, so you might want to add the become and become_method back in there:
- shell: >-
    sudo /usr/bin/mysql mysql -e "show master status \G"
    | grep -E 'File:|Position:'
    | cut -d{{':'}} -f2
    | awk '{print $1}'
  delegate_to: mux-ds2
  run_once: true
  register: syncds2

- shell: >-
    sudo /usr/bin/mysql mysql -e "stop slave;
    change master to
    master_log_file='"{{ hostvars['mux-ds2'].syncds2.stdout_lines.0 }}"',
    master_log_pos="{{ hostvars['mux-ds2'].syncds2.stdout_lines.1 }}";
    start slave"
  delegate_to: mux-ds1
  run_once: true
Here is an example, running some dummy shell tasks, given the playbook:
- hosts: node1, node2
  gather_facts: no

  tasks:
    - shell: |
        echo 'line 0'
        echo 'line 1'
      delegate_to: node2
      run_once: true
      register: master_config

    - shell: |
        echo '{{ hostvars.node2.master_config.stdout_lines.0 }}'
        echo '{{ hostvars.node2.master_config.stdout_lines.1 }}'
      delegate_to: node1
      run_once: true
      register: master_replicate_config

    - debug:
        var: master_replicate_config.stdout_lines
      delegate_to: node1
      run_once: true
This would yield:
PLAY [node1, node2] **********************************************************
TASK [shell] *****************************************************************
changed: [node1 -> node2(None)]
TASK [shell] *****************************************************************
changed: [node1]
TASK [debug] *****************************************************************
ok: [node1] =>
master_replicate_config.stdout_lines:
- line 0
- line 1
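As a side note, because run_once broadcasts the registered result to every host in the play, the hostvars indirection can often be dropped on the reading side; a minimal sketch using the same dummy example:

- debug:
    var: master_config.stdout_lines
  run_once: true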

Skip task if user can't sudo

I am trying to run a playbook with these tasks on a few thousand servers
- name: Check root login config
  shell: "egrep -i '^PermitRootLogin' /etc/ssh/sshd_config|awk '{print $2}'"
  register: config_value
  async: 3
  become: yes
  poll: 1

- name: "config value"
  debug: msg="{{ inventory_hostname }} - {{ config_value.stdout }}"
They have slightly varied configs but this should work on most of them. While running it, Ansible gets stuck somewhere in the middle on some hosts where my user doesn't have passwordless sudo or any sudo privileges.
I want to skip the servers where this doesn't work. Is there a way to do that?
ansible-playbook -i hosts playbook.yml --ask-become-pass
I tried giving a wrong password too, but it still hangs.
Ansible continues with the rest of the hosts if a task fails on one or more hosts. You can use that behaviour by provoking the failure before the actual tasks. Don't set become at the play level; do this instead:
- name: Ping or fail host
  become: true
  ping:

- name: Check root login config
  become: true
  shell: "egrep -i '^PermitRootLogin' /etc/ssh/sshd_config|awk '{print $2}'"
  register: config_value
  async: 3
  poll: 1

- name: "config value"
  debug: msg="{{ inventory_hostname }} - {{ config_value.stdout }}"
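If you would rather drop such hosts quietly instead of marking them as failed, a hedged alternative sketch (assuming Ansible 2.8 or later, where meta: end_host is available) is:

- name: Probe whether sudo works at all
  become: true
  ping:
  ignore_errors: true
  register: sudo_probe

- name: Stop processing hosts where sudo does not work
  meta: end_host
  when: sudo_probe is failed

# ... the actual become tasks follow here ...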

Ansible - Multiple/ Alternative hostnames for the same host

Assume I have hosts with multiple (DNS) names/IPs, e.g. because they have multiple NICs and thus routes to reach them.
I want to run a playbook even if one of these routes fails. Since I do not know which one works, I would like Ansible to try all of them and then run the playbook only once for this host. It would be easy to put all of the host's names into the inventory and let it run, but then the playbook would be executed once for each name of the host.
Question: Is there a way to specify alternative host names or to tell ansible to run the playbook only on one host per group?
Running the playbook on only one host per group can be implemented. See the example below.
- hosts: jails-01
  strategy: linear
  vars:
    lock_file: /var/lock/my_ansible_hostname.lock

  tasks:
    - name: delete lock_file
      file:
        path: "{{ lock_file }}"
        state: absent
      run_once: true
      delegate_to: localhost

    - name: select host
      shell: "echo {{ ansible_hostname }} > {{ lock_file }}"
      args:
        creates: "{{ lock_file }}"
      delegate_to: localhost

    - name: winner takes it all
      fail:
        msg: "Too late. Other thread is running. End of play."
      when: lookup('file', lock_file) != ansible_hostname

    - name: go ahead
      debug:
        msg: "{{ ansible_hostname }} goes ahead ... "
# ansible-playbook playbook.yml | grep msg
fatal: [test_01]: FAILED! => {"changed": false, "msg": "Too late. Other thread is running. End of play."}
fatal: [test_03]: FAILED! => {"changed": false, "msg": "Too late. Other thread is running. End of play."}
"msg": "test_02 goes ahead ... "

Ansible task for checking that a host is really offline after shutdown

I am using the following Ansible playbook to shut down a list of remote Ubuntu hosts all at once:
- hosts: my_hosts
  become: yes
  remote_user: my_user

  tasks:
    - name: Confirm shutdown
      pause:
        prompt: >-
          Do you really want to shutdown machine(s) "{{play_hosts}}"? Press
          Enter to continue or Ctrl+C, then A, then Enter to abort ...

    - name: Cancel existing shutdown calls
      command: /sbin/shutdown -c
      ignore_errors: yes

    - name: Shutdown machine
      command: /sbin/shutdown -h now
Two questions on this:
Is there any module available which can handle the shutdown in a more elegant way than having to run two custom commands?
Is there any way to check that the machines are really down? Or is it an anti-pattern to check this from the same playbook?
I tried something with the net_ping module but I am not sure if this is its real purpose:
- name: Check that machine is down
  become: no
  net_ping:
    dest: "{{ ansible_host }}"
    count: 5
    state: absent
This, however, fails with
FAILED! => {"changed": false, "msg": "invalid connection specified, expected connection=local, got ssh"}
In more restricted environments, where ping messages are blocked, you can watch the SSH port until it goes down. In my case I have set the timeout to 60 seconds.
- name: Save target host IP
  set_fact:
    target_host: "{{ ansible_host }}"

- name: wait for ssh to stop
  wait_for: "port=22 host={{ target_host }} delay=10 state=stopped timeout=60"
  delegate_to: 127.0.0.1
There is no shutdown module. You can use a single fire-and-forget call:
- name: Shutdown server
  become: yes
  shell: sleep 2 && /sbin/shutdown -c && /sbin/shutdown -h now
  async: 1
  poll: 0
As for net_ping, it is for network appliances such as switches and routers. If you rely on ICMP messages to test the shutdown process, you can use something like this:
- name: Store actual host to be used with local_action
  set_fact:
    original_host: "{{ ansible_host }}"

- name: Wait for ping loss
  local_action: shell ping -q -c 1 -W 1 {{ original_host }}
  register: res
  retries: 5
  until: ('100.0% packet loss' in res.stdout)
  failed_when: ('100.0% packet loss' not in res.stdout)
  changed_when: no
This will wait for 100% packet loss or fail after 5 retries.
Here you want to use local_action because otherwise the commands would be executed on the remote host (which is supposed to be down).
And you want the trick of storing ansible_host in a temporary fact, because ansible_host is replaced with 127.0.0.1 when the task is delegated to localhost.
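Putting the two answers together, a minimal end-to-end sketch (the port, delay and timeout values are assumptions) would be:

- name: Save target host IP before delegating
  set_fact:
    target_host: "{{ ansible_host }}"

- name: Shutdown server (fire and forget)
  become: yes
  shell: sleep 2 && /sbin/shutdown -h now
  async: 1
  poll: 0

- name: Wait for SSH to stop answering
  wait_for:
    host: "{{ target_host }}"
    port: 22
    state: stopped
    delay: 10
    timeout: 60
  delegate_to: 127.0.0.1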
