ansible rolling restart playbook - ansible

Folks,
I'd like to have a service be restarted individually on each host, and wait for user input before continuing onto the next host in the inventory.
Currently, if you have the following:
- name: Restart something
  command: service foo restart
  tags:
    - foo

- name: wait
  pause: prompt="Make sure org.foo.FooOverload exception is not present"
  tags:
    - foo
It will only prompt once, and not really have the effect desired.
What is the proper ansible syntax to wait for user input before running the restart task on each host?

Use a combination of the serial play attribute and the --step option of ansible-playbook.
playbook.yml
- name: Do it
  hosts: myhosts
  serial: 1
  tasks:
    - shell: hostname
Call the playbook with the --step option:
ansible-playbook playbook.yml --step
You will be prompted for every host.
Perform task: shell hostname (y/n/c): y
Perform task: shell hostname (y/n/c): ****************************************
changed: [y.y.y.y]
Perform task: shell hostname (y/n/c): y
Perform task: shell hostname (y/n/c): ****************************************
changed: [z.z.z.z]
For more information: Start and Step

I went ahead with this:
- name: Restart Datastax Agent
  tags:
    - agent
  hosts: cassandra
  sudo: yes
  serial: 1
  gather_facts: yes
  tasks:
    - name: Pause
      pause: prompt="Hit RETURN to restart datastax agent on {{ inventory_hostname }}"
    - name: Restarting Datastax Agent on {{ inventory_hostname }}
      service: name=datastax-agent state=restarted
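Note: sudo: yes is the legacy spelling; on newer Ansible releases the equivalent keyword is become: yes, and modules accept YAML-style arguments. A minimal modernized sketch of the same play (untested, for illustration only):

- name: Restart Datastax Agent
  hosts: cassandra
  become: yes
  serial: 1
  gather_facts: yes
  tags:
    - agent
  tasks:
    - name: Pause before restart
      pause:
        prompt: "Hit RETURN to restart datastax agent on {{ inventory_hostname }}"

    - name: Restart Datastax Agent   # with serial: 1 this only touches the current host
      service:
        name: datastax-agent
        state: restarted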

Related

Run ansible playbook in role with specific playbook

I need to run one playbook inside my main playbook, with a different hosts file, e.g.:
- hosts: starter
  become: yes
  tasks:
    - name: start playbook2
      shell: |
        ansible-playbook -b -i host-cluster.txt install-cluster.yaml
      args:
        executable: /bin/bash
It works and runs my playbook, but I know this structure is not correct! Also, when playbook2 starts I need to see the results of playbook2 in the terminal, but I only see
TASK [install-cluster : Run task1 ] *******************************
I want to see the result of task1 in the terminal.
Update:
I need to run one role with a specific file (install-cluster.yaml) and with a specific inventory hosts file (host-cluster.txt).
something like this:
- name: start kuber cluster
  include_role:
    name: kuber
    tasks_from: cluster.yml
  hosts: kuber-hosts.txt
You can load several inventories at once and then target different group(s)/host(s) in different plays. Just as an idea, you could have the following pseudo-playbook your_playbook.yml:
- name: Play1 does some stuff on starter hosts
  hosts: starter
  become: true
  tasks:
    - name: do some stuff on starter servers
      debug:
        msg: "I did a thing"

- name: Play2 starts cluster on kuber hosts
  hosts: kuber
  tasks:
    - name: start kuber cluster
      include_role:
        name: kuber
        tasks_from: cluster.yml

- name: Play3 does more stuff on starter hosts
  hosts: starter
  tasks:
    - name: do more stuff on starter servers
      debug:
        msg: "I did more things"
You can then use this playbook with two different inventories at once, if this is how you architected it:
ansible-playbook -i inventories/starter -i inventories/kuber your_playbook.yml
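As an illustration only (the directory layout and host names below are assumptions, not taken from the question), the two inventories could look like this:

# inventories/starter (hypothetical INI inventory)
[starter]
starter01.example.com
starter02.example.com

# inventories/kuber (hypothetical INI inventory)
[kuber]
kuber01.example.com
kuber02.example.com

Each play then only targets the group it names, so Play2 runs the kuber role only on the kuber hosts.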

Ansible run delegate_to task on remote machine as different user

I want to set a cron entry on a remote host, but connecting to the host as a different user.
# task
- name: Cron to ls at a specific time
  cron:
    name: "perform a listing"
    weekday: "6"
    minute: "5"
    hour: "3"
    job: "/bin/ls -lR /mnt/*/"
  delegate_to: "{{ my_remote_machine }}"
Problem
This is a startup script on an instance in the cloud.
The script runs as root, and will therefore try to connect to {{ my_remote_machine }} as root.
root is obviously disabled by default on most cloud instances.
Because of this, I can't use the become_user keyword.
Do I have any other options?
Simply change the remote_user for the given task to the one you can connect with on the delegated host. Here is a pseudo playbook to give you the basics.
Note: if targeting a host using ansible_connection: local (e.g. default implicit localhost), remote_user is ignored and defaults to the user launching the playbook on the controller.
---
- name: Play mixing several hosts and users
  hosts: some_host_or_group
  # Play-level remote_user. In short, this is used if not overridden in a task.
  # See the documentation for finer-grained info (define in inventory, etc...)
  remote_user: root

  tasks:
    - name: Check who we are on current host
      command: id -a
      register: who_we_are_current

    - debug:
        var: who_we_are_current.stdout

    - name: Show we can be someone else on delegate
      command: id -a
      # Task-level remote_user overrides the play level
      remote_user: johnd
      delegate_to: "{{ my_remote_machine }}"
      register: who_we_are_delegate

    - debug:
        var: who_we_are_delegate.stdout

    - name: And of course, this works with your real task as well
      cron:
        name: "perform a listing"
        weekday: "6"
        minute: "5"
        hour: "3"
        job: "/bin/ls -lR /mnt/*/"
      remote_user: johnd
      delegate_to: "{{ my_remote_machine }}"
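A related option (worth verifying against your Ansible version): when the delegation target is itself declared in the inventory, its own connection variables are used for the delegated task, so the alternate user can live in the inventory instead of the task. A hypothetical snippet with made-up names:

# hypothetical inventory
[cloud_instances]
instance01.example.com

[cron_hosts]
cron01.example.com ansible_user=johnd   # connection user used when tasks are delegated to this host

# and, e.g. in group_vars or extra vars:
# my_remote_machine: cron01.example.com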

Stopping ansible from a playbook (site.yml)

I have re-written the question to be more specific instead of using a generic example of what I am trying to achieve, as per @Zeitounator's suggestion.
I use Ansible to spin up VMs in VMware by adding a new entry in the hosts.ini file and running ansible-playbook -i inventory/dev/hosts.ini --limit SomeGroup playbooks/site.yml
The vmware role (called vmware) will
* check to see if the VM already exists.
* If it does, then obviously it does not create the VM.
* If it does not exist, then it will create the VM from a template.
To destroy a VM, I run this: ansible-playbook -i inventory/dev/hosts.ini --limit SomeGroup playbooks/site.yml -e 'vmware_destroy=true'
That works as intended. Now for my issue.
When this variable is set (vmware_destroy=true), it will destroy the VM successfully, BUT ansible will attempt to carry on with the rest of the playbook on the host that has just been destroyed. Obviously it fails. The playbook does actually stop due to the failure. But not gracefully.
Here is an example of the flow of logic:
$ cat playbooks/site.yml
---
- hosts: all
  gather_facts: no
  roles:
    - { role: vmware, tags: vmware }

- hosts: all
  gather_facts: yes
  roles:
    - { role: bootstrap, tags: bootstrap }
    - { role: common, tags: common }

- hosts: AppServers
  gather_facts: no
  roles:
    - { role: application }
  # and so on.
$ cat playbooks/roles/vmware/main.yml
---
# Checks to see if the VM exists already.
# A variable `found_vm` is registered in this task.
- import_tasks: find.yml

# Only import this task when all of the `when` conditions are met.
- import_tasks: destroy.yml
  when:
    - vmware_destroy is defined
    - vmware_destroy # Meaning 'True'
    - found_vm

# If the above is true, it will not import this task.
- import_tasks: create.yml
  when:
    - found_vm.failed
    - vmware_destroy is not defined
So, the point is, when I specify -e 'vmware_destroy=true', ansible will attempt to run the rest of the playbook and fail.
I don't want ansible to fail. I want it to stop gracefully after completing the vmware role, based on the -e 'vmware_destroy=true' flag provided on the command line.
I am aware I can use a different playbook for this, something like:
ansible-playbook -i inventory/dev/hosts.ini --limit SomeGroup playbooks/VMWARE_DESTROY.yml. But I would rather use a variable as opposed to a separate playbook. If there is a strong argument to split out the playbook in this way, please provide references.
Please let me know if more clarification is needed.
Thank you in advance.
A playbook is the top Ansible abstraction layer (playbook -> role -> task -> ...). AWX adds two more layers (workflow -> job template -> playbook -> ...) to control playbooks. To follow this architecture, either AWX or any other tool that interfaces with Ansible (ansible-runner, or scripts, for example) should be used to control the playbooks.
Controlling playbooks from inside Ansible is rather awkward. Create two playbooks, vmware-create.yml and vmware-destroy.yml:
$ cat vmware-create.yml
- hosts: all
  gather_facts: yes  # include cached variables
  tasks:
    - block:
        - debug:
            msg: Don't create VM this time. End of play.
        - meta: end_play
      when: not hostvars['localhost'].vmware_create
    - debug:
        msg: Create VM.
$ cat vmware-destroy.yml
- hosts: all
  gather_facts: yes  # include cached variables
  tasks:
    - block:
        - debug:
            msg: Don't destroy VM this time. End of play.
        - meta: end_play
      when: not hostvars['localhost'].vmware_destroy
    - debug:
        msg: Destroy VM.
and import them into the playbook vmware-control.yml. See below:
$ cat vmware-control.yml
- hosts: localhost
  vars:
    vmware_create: true
    vmware_destroy: false
  tasks:
    - set_fact:
        vmware_create: "{{ vmware_create }}"  # cache variable
    - set_fact:
        vmware_destroy: "{{ vmware_destroy }}"  # cache variable

- import_playbook: vmware-create.yml
- import_playbook: vmware-destroy.yml
Control the flow with the variables vmware_create and vmware_destroy. Run vmware-control.yml against localhost and declare hosts: all inside vmware-create.yml and vmware-destroy.yml.
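Because extra vars have the highest precedence, the defaults set in vmware-control.yml can then be flipped from the command line; passing them as JSON keeps them booleans rather than strings, e.g.:

ansible-playbook vmware-control.yml -e '{"vmware_create": false, "vmware_destroy": true}'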

Unable to switch to a different user using become_user

I am writing Ansible playbook to create key-based ssh access on several hosts with a particular user.
I have following servers:
automation_host
Master
Slave1
Slave2
From the automation host I will trigger Ansible to run the playbook, which should first log in to the master as user1, then switch to user2, create SSH keys as user2, and copy id_rsa.pub to the slave nodes.
Inventory file contents:
[master]
172.xxx.xxx.xxx
[slaves]
172.xxx.xxx.xxx
172.xxx.xxx.xxx
[all:vars]
ansible_connection=ssh
ansible_ssh_user=user1
playbook.yml file:
- hosts: master
  become_user: user2
  become: yes
  roles:
    - name: passwordless-ssh
User2 is available on all hosts (except automation_host) and is added in sudoers as well.
In the passwordless-ssh role, I have added the lines included below to check which user is currently executing the tasks.
- name: get the username running the deploy
  local_action: command whoami
  register: username_on_the_host

- debug: var=username_on_the_host
The debug message shows user1 (I am expecting it to be user2).
ansible version: 2.5.2
I am very new to Ansible.
local_action will run on automation_host; change it to command:
- hosts: master
  become_user: user2
  become: yes
  tasks:
    - name: get the username running the deploy
      command: whoami
      register: username_on_the_host

    - debug: var=username_on_the_host['stdout']

    - name: do something
      command: echo 'hello'
      when: username_on_the_host['stdout'] == 'user2'

    - name: do something else
      command: echo 'goodby'
      when: username_on_the_host['stdout'] == 'user1'
Output
TASK [debug] *********************************************
ok: [master] => {
"username_on_the_host['stdout']": "user2"
}
TASK [do something] *********************************************
changed: [master]
TASK [do something else] *********************************************
do something else does not run.

Ansible task for checking that a host is really offline after shutdown

I am using the following Ansible playbook to shut down a list of remote Ubuntu hosts all at once:
- hosts: my_hosts
  become: yes
  remote_user: my_user
  tasks:
    - name: Confirm shutdown
      pause:
        prompt: >-
          Do you really want to shutdown machine(s) "{{ play_hosts }}"? Press
          Enter to continue or Ctrl+C, then A, then Enter to abort ...

    - name: Cancel existing shutdown calls
      command: /sbin/shutdown -c
      ignore_errors: yes

    - name: Shutdown machine
      command: /sbin/shutdown -h now
Two questions on this:
Is there any module available which can handle the shutdown in a more elegant way than having to run two custom commands?
Is there any way to check that the machines are really down? Or is it an anti-pattern to check this from the same playbook?
I tried something with the net_ping module but I am not sure if this is its real purpose:
- name: Check that machine is down
  become: no
  net_ping:
    dest: "{{ ansible_host }}"
    count: 5
    state: absent
This, however, fails with
FAILED! => {"changed": false, "msg": "invalid connection specified, expected connection=local, got ssh"}
In more restricted environments where ping messages are blocked, you can watch the SSH port until it goes down. In my case I have set the timeout to 60 seconds.
- name: Save target host IP
  set_fact:
    target_host: "{{ ansible_host }}"

- name: wait for ssh to stop
  wait_for: "port=22 host={{ target_host }} delay=10 state=stopped timeout=60"
  delegate_to: 127.0.0.1
There is no shutdown module. You can use a single fire-and-forget call:
- name: Shutdown server
  become: yes
  shell: sleep 2 && /sbin/shutdown -c && /sbin/shutdown -h now
  async: 1
  poll: 0
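As an aside (availability depends on your Ansible and collection versions): recent releases of the community.general collection ship a dedicated shutdown module that wraps the same command. A minimal sketch, assuming that collection is installed:

- name: Shutdown server via module
  become: yes
  community.general.shutdown:
    delay: 2   # wait before shutting down, mirroring the 'sleep 2' above (check the module docs for exact semantics)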
As for net_ping, it is for network appliances such as switches and routers. If you rely on ICMP messages to test the shutdown process, you can use something like this:
- name: Store actual host to be used with local_action
  set_fact:
    original_host: "{{ ansible_host }}"

- name: Wait for ping loss
  local_action: shell ping -q -c 1 -W 1 {{ original_host }}
  register: res
  retries: 5
  until: ('100.0% packet loss' in res.stdout)
  failed_when: ('100.0% packet loss' not in res.stdout)
  changed_when: no
This will wait for 100% packet loss or fail after 5 retries.
Here you want to use local_action because otherwise the commands would be executed on the remote host (which is supposed to be down).
And you need the trick of storing ansible_host in a temporary fact, because ansible_host is replaced with 127.0.0.1 when the task is delegated to localhost.
