Ansible: how to replay notifications during provisioning

Currently I am switching from Puppet to Ansible and I am a bit confused by some concepts, or at least by how Ansible works.
Some info on the setup:
I am using the examples from Ansible Best Practices and have structured my project similarly, with several roles (playbooks) and so on.
I am using Vagrant for provisioning and the box is Saucy64 VBox.
Where the confusion comes in:
When I provision and run Ansible, the tasks start to execute, then the stack of notifications.
Example:
Last task:
TASK: [mysql | delete anonymous MySQL server user for localhost] **************
<127.0.0.1> REMOTE_MODULE mysql_user user='' state=absent
changed: [default] => {"changed": true, "item": "", "user": ""}
Then first notification:
NOTIFIED: [timezone | update tzdata] ******************************************
<127.0.0.1> REMOTE_MODULE command /usr/sbin/dpkg-reconfigure --frontend noninteractive tzdata
changed: [default] => {"changed": true, "cmd": ["/usr/sbin/dpkg-reconfigure", "--frontend", "noninteractive", "tzdata"], "delta": "0:00:00.224081", "end": "2014-02-03 22:34:48.508961", "item": "", "rc": 0, "start": "2014-02-03 22:34:48.284880", "stderr": "\nCurrent default time zone: 'Europe/Amsterdam'\nLocal time is now: Mon Feb 3 22:34:48 CET 2014.\nUniversal Time is now: Mon Feb 3 21:34:48 UTC 2014.", "stdout": ""}
Now this is all fine. As the roles increase, more and more notifications stack up.
Now here comes the problem.
When a notification fails, the provisioning stops as usual. But then the notification stack is empty!
This means that all notifications that were queued after the faulty one will not be executed!
If that is so, then if you changed a vhost setting for Apache and had a notification to reload the Apache service, that reload would get lost.
Let's give an example (pseudo lang):
- name: Install Apache Modules
  notify: Restart Apache

- name: Enable Vhosts
  notify: Reload Apache

- name: Install PHP
  command: GGGGGG # throws an error
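The notify names above would map to handlers defined roughly like this (a sketch only; the module arguments are assumptions, not taken from the original playbook):
handlers:
  - name: Restart Apache
    service: name=apache2 state=restarted
  - name: Reload Apache
    service: name=apache2 state=reloaded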
When the above executes:
Apache modules are installed
Vhosts are enabled
PHP tries to install and fails
Script exits
(Where are the notifications?)
Now at this point all seems logical, but again Ansible tries to be clever (no!*): it stacks notifications, so the Apache reload and restart would be collapsed into a single restart of Apache run at the end of provisioning. That means that all the notifications will fail!
Now, up to here, some people will say this is fine as well: hey, just re-run the provisioning, the notifications will fire up, Apache will finally be reloaded, and the site will be up again. This is not the case.
On the second run of the script, after the code for installing PHP is corrected, the notifications will not run, by design. Why?
This is why:
Ansible marks the tasks that executed successfully as "Done/Green" and therefore does not register any notifications for them. The provisioning will be successful, and in order to trigger the notification (and thus the Apache restart) you can do one of the following:
Run a direct command to the server via Ansible or SSH (see the ad-hoc example after this list)
Edit the script to trigger the task
Add a separate task for that
Destroy the instance of the box and reprovision
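For instance (a sketch; the host pattern and inventory name are illustrative), the ad-hoc variant would be:
ansible default -i inventory -m service -a "name=apache2 state=restarted"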
This is quite frustrating because it requires a total cleanup of the box. Or do I not understand something correctly about Ansible?
Is there another way to 'reclaim'/replay/force the notifications to execute?
Clever would be either to mark the task as incomplete and then restart the notifications, or to keep a separate queue with the notifications as tasks of their own.*

Yeah, that's one of the shortcomings of Ansible compared to, say, Puppet. Puppet is declarative and doesn't error out like Ansible (or Chef, for that matter). It has its positives and negatives; for example, Puppet takes a little bit of time before it starts running because it needs to compile its catalog.
So, you are right: if your Ansible script errors out, then your notification updates won't happen. The only way we've gotten around it is by using conditional statements. In your playbook you can do something like this:
- name: My cool playbook
  hosts: all
  vars:
    force_tasks: 0
  tasks:
    - name: Apache install
      action: apt pkg={{ item }} state=latest
      with_items:
        - apache2
        - apache2-mpm-prefork

    - name: Restart apache
      action: service name=apache2 state=restarted
      when: force_tasks
Then when you run your playbook you can pass force_tasks as an extra variable:
ansible-playbook -i my_inventory -e "force_tasks=True" my_ansible_playbook.yml
You can accomplish this in a similar fashion with tags.
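For example (a sketch of the tags variant; the tag name is illustrative), tag the restart task and run only that tag on demand:
- name: Restart apache
  service: name=apache2 state=restarted
  tags:
    - force-restart

ansible-playbook -i my_inventory --tags force-restart my_ansible_playbook.yml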

Run ansible-playbook with the --force-handlers flag. This tells Ansible to run any queued handlers even if a task fails and further processing stops. The Ansible developers plan to add this as an option to the ansible.cfg file so it can be set globally and forgotten about. I don't know what the time frame for that is.
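For example (inventory and playbook names are illustrative):
ansible-playbook -i my_inventory --force-handlers my_ansible_playbook.yml
The same behaviour can also be enabled per play with force_handlers: True, or globally with force_handlers = True under [defaults] in ansible.cfg (in releases that support it).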

Related

Running a task in Ansible if there was any error during the playbook

I have a huge playbook with a lot of tasks in it:
---
- hosts: localhost
  vars_files:
    - mario-configuration.yml
  post_tasks:
    # After the playbook run, tell it's over
    - name: It's a me, Mario!
      shell: 'say -v Luca "Itsami, Mario!"'
  roles:
    - aerial         # installs appletv like screen saver: https://aerialscreensaver.github.io
    - clean-dock     # remove all the stickies in the macOS dock, useful after a fresh install
    - clean-desktop  # hides the files in the desktop (they are still there, just hidden)
    - rocket         # emoji picker: https://matthewpalmer.net/rocket
    # ... more roles
    - vlc            # video viewer: https://www.videolan.org/vlc
(whole code is open-sourced here: https://github.com/web-id-fr/mario)
This build runs on macOS, and at the end of it I run the "say" command to tell the user it has finished. But if there is an error during the playbook, the post_tasks are not run. Is there a simple way in Ansible to run a specific task if any role has failed?
I guess this is not possible (yet) as stated in Error handling in playbooks
When Ansible receives a non-zero return code from a command or a failure from a module, by default it stops executing on that host and continues on other hosts.
There is a feature request for that kind of thing.

Run local command with Ansible and share variable in the remote context

I have the following logic that I would like to implement with Ansible:
Before updating some operating system packages, I want to check some other remote dependencies, which involves querying some endpoints and deciding whether the next version is good or not.
The script new_version_available returns 0 if there is something new and 1 if there isn't.
To avoid installing unnecessary packages in production, or opening unnecessary ports in my firewall in the DMZ, I would like to run this script locally on my host and, if it succeeds, run the next task remotely.
tasks:
  - name: Check if there is new version available
    command: "{{ playbook_dir }}/new_version_available"
    delegate_to: 127.0.0.1
    register: new_version_available
    ignore_errors: False

  - name: Install our package
    command:
      cmd: '/usr/bin/our_installer update'
      warn: False
    when: new_version_available is succeeded
Which gives me the following error:
fatal: [localhost -> 127.0.0.1]: FAILED! => {"changed": false, "cmd": "/home/foo/ansible-deploy-bar/new_version_available", "msg": "[Errno 2] No such file or directory", "rc": 2}
That means that my command cannot be found; however, my script exists and I have permission to access it.
My development environment, where I'm testing the playbook, runs in a virtual machine via NAT, which forwards the guest port 22 to port 2222 on my host, so if I want to log in to my VM I do ssh root@localhost -p 2222. My inventory looks like:
foo:
  hosts:
    localhost:2222
My question is:
What would be the Ansible way to achieve what I want, i.e. run some command locally, register the result, and use it as a condition in a task? Or should I run the command and pass the result to Ansible as an environment variable?
I'm using this documentation as support https://docs.ansible.com/ansible/latest/user_guide/playbooks_delegation.html
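Not a verified fix, but a minimal sketch of the pattern described (run the check on the control machine, register the result, gate the remote task on it), assuming the script sits next to the playbook on the control host and that localhost is not redefined in the inventory:
- name: Check if there is a new version available
  command: "{{ playbook_dir }}/new_version_available"
  delegate_to: localhost
  register: new_version_available
  failed_when: new_version_available.rc not in [0, 1]
  changed_when: false

- name: Install our package
  command: /usr/bin/our_installer update
  when: new_version_available.rc == 0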

Disable systemd service only if the package is installed

This is what I have
- name: disable os firewall - firewalld
  systemd:
    name: firewalld
    state: stopped
    enabled: no
This task works fine on hosts where the firewalld package is installed but fails on hosts where it is not.
What is the best simple approach to make sure it can handle hosts which do not have firewalld installed? I do not want to use any CMDB, as that is an additional setup. Also, I want the task to be idempotent; using a shell command like dpkg or rpm to query whether firewalld is installed makes the Ansible playbook summary report changes, which I do not want.
Here is an example using failed_when:. While it does run every time, it ignores the failure if the package is not installed:
- name: Stop and disable chrony service
  service:
    name: chronyd
    enabled: no
    state: stopped
  register: chronyd_service_result
  failed_when: "chronyd_service_result is failed and 'Could not find the requested service' not in chronyd_service_result.msg"
What is the best approach to make sure it can handle hosts which do not have firewalld installed?
Get the information about which systems have firewalld installed from a CMDB of any kind (such as Ansible vars files, since you already use Ansible), and run the task only on those systems.
Likewise, do not run the task on systems that, according to the configuration pulled from the CMDB, do not have firewalld installed.
The actual implementations vary. Among others: specifying appropriate host groups for plays, using a conditional in the task, or using the limit option for the inventory.
In your Ansible play, you can first check whether the package is installed before trying to stop/start it.
Something similar to this: https://stackoverflow.com/a/46975808/3768721
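A sketch along those lines using the package_facts module (available since Ansible 2.5; assumes a supported package manager on the target):
- name: Gather the list of installed packages
  package_facts:
    manager: auto

- name: disable os firewall - firewalld
  systemd:
    name: firewalld
    state: stopped
    enabled: no
  when: "'firewalld' in ansible_facts.packages"
When firewalld is not installed the systemd task is simply skipped, so the play summary does not report a change.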
With the shell approach:
- name: Disable {{ service }} if enabled
  shell: if systemctl is-enabled --quiet {{ service }}; then systemctl disable {{ service }} && echo disable_ok ; fi
  register: output
  changed_when: "'disable_ok' in output.stdout"
It produces 3 states:
service is absent or disabled already — ok
service exists and was disabled — changed
service exists and disable failed — failed
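For example (the task file name is illustrative), the service variable could be supplied when including the task:
- include_tasks: disable_service.yml
  vars:
    service: firewalld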

Ansible: how to restart the auditd service on CentOS 7 (error about dependency)

In my playbook, I have a task to update audit.rules and then notify a handler which should restart the auditd service.
task:
- name: 6.6.7 - audit rules configuration
  template: src=X/ansible/templates/auditd_rules.j2
            dest=/etc/audit/rules.d/audit.rules
            backup=yes
            owner=root group=root mode=0640
  notify:
    - restart auditd
handlers:
- name: restart auditd
  service: name=auditd state=restarted
When the playbook runs, the audit rules are updated and a request is made to restart auditd but this fails as below.
RUNNING HANDLER [restart auditd] ***********************************************
fatal: [ipX-southeast-2.compute.internal]: FAILED! => {"changed": false, "failed": true, "msg": "Unable to restart service auditd: Failed to restart auditd.service: Operation refused, unit auditd.service may be requested by dependency only.\n"}
When I look at the unit definition for auditd, I can see RefuseManualStop=yes. Is this why I can't restart the service? How does one overcome this to pick up the new audit rules?
systemctl cat auditd.service
# /usr/lib/systemd/system/auditd.service
[Unit]
Description=Security Auditing Service
DefaultDependencies=no
After=local-fs.target systemd-tmpfiles-setup.service
Conflicts=shutdown.target
Before=sysinit.target shutdown.target
RefuseManualStop=yes
ConditionKernelCommandLine=!audit=0
Documentation=man:auditd(8) https://people.redhat.com/sgrubb/audit/
[Service]
ExecStart=/sbin/auditd -n
## To not use augenrules, copy this file to /etc/systemd/system/auditd.service
## and comment/delete the next line and uncomment the auditctl line.
## NOTE: augenrules expect any rules to be added to /etc/audit/rules.d/
ExecStartPost=-/sbin/augenrules --load
#ExecStartPost=-/sbin/auditctl -R /etc/audit/audit.rules
ExecReload=/bin/kill -HUP $MAINPID
# By default we don't clear the rules on exit. To enable this, uncomment
# the next line after copying the file to /etc/systemd/system/auditd.service
#ExecStopPost=/sbin/auditctl -R /etc/audit/audit-stop.rules
[Install]
WantedBy=multi-user.target
This has been explored, discussed, and resolved (mostly) in the Red Hat Bugzilla #1026648 and Ansible Issue #22171 (GitHub) reports.
Resolution
Use the ansible service module parameter use=service to force execution of the /sbin/service utility instead of the gathered-fact value of systemd (which invokes /sbin/systemctl) like this:
- service: name=auditd state=restarted use=service
Example playbook (pastebin.com)
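Expressed as a handler matching the question's setup (a sketch, not the linked example playbook):
handlers:
  - name: restart auditd
    service: name=auditd state=restarted use=service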
Workaround:
Use the ansible command module to explicitly run the service executable like this:
- command: /sbin/service auditd restart
Analysis - root cause:
This is an issue created by the upstream packaging of the auditd.service unit. It will not start/stop/restart when acted upon by systemctl, apparently by design.
It is further compounded by the Ansible service control function, which uses the preferred service manager identified when system facts are gathered ("ansible_service_mgr" returns "systemd"), regardless of the actual module used to manage the service unit.
The RHEL dev team may fix this in upcoming updates (errata) if it is considered a problem.
The Ansible dev team has offered a workaround and (as of 2.2) updated the service module with the use parameter.
Maybe it's quite late for an answer, but if anyone else is facing the same problem, you can import the new rules for auditd with this command:
auditctl -R /path/to_your_rules_file
So there is no need to restart auditd.service to import new rules.
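Wrapped in an Ansible task (a sketch; the rules path mirrors the question's template destination):
- name: Load the new audit rules without restarting auditd
  command: /sbin/auditctl -R /etc/audit/rules.d/audit.rules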
I would validate that the auditd service reloads correctly, as even using the command module with the command you specified will not work or behave in the manner you would expect it to;
Confirm via
service auditd status
https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/7/html/Security_Guide/sec-starting_the_audit_service.html
Try instead
service auditd condrestart
:)
You should not change the RefuseManualStop parameter; it's there to keep your system secure. What you could do instead: after creating the new rules, reboot the host, wait for it, then continue with your playbook.
Playbook example:
- name: Create new rules file
  copy:
    src: 01-personalized.rules
    dest: /etc/audit/rules.d/01-personalized.rules
    owner: root
    group: root
    mode: 0600
  register: result

- name: Reboot server
  shell: "sleep 5 && reboot"
  async: 1
  poll: 0
  when: result is changed

- name: Wait for server to become available
  wait_for_connection:
    delay: 60
    sleep: 5
    timeout: 300
  when: result is changed
Change RefuseManualStop to no and try
sudo service auditd restart
If this works, then the following will also work:
systemctl start auditd
and
systemctl enable auditd
These commands are for CentOS version 7.
Follow the links for further help.
Link
Auditd in CentOS7

Ansible handler runs only if changed: true

Installing ntp with Ansible, I notify a handler in order to start the ntpd service:
Task:
---
# roles/common/tasks/ntp.yml
- name: ntp | installing
  yum: name=ntp state=latest
  notify: start ntp
Handler:
---
# roles/common/handlers/main.yml
- name: start ntp
  service: name=ntpd state=started
If the service has not been installed, Ansible installs and starts it.
If the service has already been installed but is not running, it does not notify the handler: the status of the task is changed: false.
That means I cannot start it if it is already present in the OS.
Is there any good practice that helps make sure the service has been installed and is in a running state?
PS: I may do so:
---
# roles/common/tasks/ntp.yml
- name: ntp | installing
  yum: name=ntp state=latest
  notify: start ntp
  changed_when: true
but I am not sure that it is good practice.
From the Intro to Playbooks guide:
As we’ve mentioned, modules are written to be ‘idempotent’ and can relay when they have made a change on the remote system. Playbooks recognize this and have a basic event system that can be used to respond to change.
These ‘notify’ actions are triggered at the end of each block of tasks in a playbook, and will only be triggered once even if notified by multiple different tasks.
Handlers only run on change by design. If you change a configuration you often need to restart a service, but don't want to if nothing has changed.
What you want is to start a service if it is not already running. To do this you should use a regular task, as described by @udondan:
- name: ntp | installing
  yum:
    name: ntp
    state: latest

- name: ntp | starting
  service:
    name: ntpd
    state: started
    enabled: yes
Ansible is idempotent by design, so this second task will only make a change if ntpd is not already running. The enabled line will set the service to start on boot. Remove this line if that is not the desired behavior.
Why don't you just add a service task then? A handler is usually for restarting a service after its configuration has changed. To ensure a service is running no matter what, just add a task like this:
- name: Ensure ntp is running
  service:
    name: ntpd
    state: started
