Hope you can help work out why my playbook isn't completing as expected.
ENVIRONMENT
OSX El Capitan
ansible 2.1.0.0
CONFIGURATION
Nothing exciting:
[defaults]
roles_path=./roles
host_key_checking = False
ssh_args= -t -t
allow_world_readable_tmpfiles = True
PLAYBOOK
I have a reasonably involved setup with a number of plays in one playbook.
The playbook is run against different target systems: the production site and a dev rig. (Please don't suggest I combine them... it's an IoT system and complex enough as it is.)
Here's my somewhat redacted playbook:
- hosts: all
  roles:
    - ...
- hosts: xmpp_server
  roles:
    - ...
- hosts: audit_server
  roles:
    - ...
- hosts: elk_server
  roles:
    - ...
- hosts: all
  roles:
    - ...
Now, please bear in mind that I have an IoT setup with various redundancies, replication and distribution going on, so although there are other ways of skinning the cat, the above decomposition into multiple plays is really neat for my setup and I'd like to keep it.
Also important: I have no audit_server or elk_server hosts on my dev rig. Those groups are currently empty as I'm working on an orthogonal issue and don't need them consuming limited dev resources. I do have those in production, just not in dev.
EXPECTED BEHAVIOUR
On the production site I expect all the plays to trigger and run.
On the dev rig I expect the first play (all) and the xmpp_server play to run, the audit_server and elk_server plays to skip and the last (all) play to run after that.
ACTUAL BEHAVIOUR
The production site works exactly as expected. All plays run.
The dev rig completes the xmpp_server play as dev-piA is a member of the xmpp_server group. And then it silently stops. No error, no information, nothing. Just straight to the play recap. Here's the output:
...
TASK [xmppserver : include] ****************************************************
included: /Users/al/Studio/Projects/smc/ansible/roles/xmppserver/tasks/./openfire.yml for dev-piA
TASK [xmppserver : Get openfire deb file] **************************************
ok: [dev-piA]
TASK [xmppserver : Install openfire deb file] **********************************
ok: [dev-piA]
TASK [xmppserver : Check if schema has been uploaded previously] ***************
ok: [dev-piA]
TASK [xmppserver : Install openfire schema to postgres db] *********************
skipping: [dev-piA]
to retry, use: --limit #fel.retry
PLAY RECAP *********************************************************************
dev-vagrant1 : ok=0 changed=0 unreachable=1 failed=0
dev-piA : ok=106 changed=3 unreachable=0 failed=0
dev-piB : ok=77 changed=3 unreachable=0 failed=0
dev-piC : ok=77 changed=3 unreachable=0 failed=0
...
So, I ran it with -vvvvv and got nothing more useful:
...
TASK [xmppserver : Install openfire schema to postgres db] *********************
task path: /Users/al/Studio/Projects/smc/ansible/roles/xmppserver/tasks/openfire.yml:14
skipping: [dev-piA] => {"changed": false, "skip_reason": "Conditional check failed", "skipped": true}
to retry, use: --limit #fel.retry
PLAY RECAP *********************************************************************
dev-vagrant1 : ok=0 changed=0 unreachable=1 failed=0
dev-piA : ok=106 changed=2 unreachable=0 failed=0
dev-piB : ok=77 changed=3 unreachable=0 failed=0
dev-piC : ok=77 changed=3 unreachable=0 failed=0
...
HELP NEEDED
So, my question is: why does the playbook just stop there? What's going on?!
It doesn't actually explicitly say that there are no more hosts left for the audit_server play; that's my best guess. It just stops as if it hit an EOF.
I'm completely stumped.
Edit: NB: The retry file only contains a reference to the vagrant machine, which is currently off. But if the existence of that is the problem then Ansible's logic is very flawed. I'll check now just in case anyway
Edit: OMFG it actually IS the missing vagrant box, which has nothing to do with a goddamn thing. That's shocking and I'll raise it as an issue with Ansible. But... I'll leave this here in case anyone ever has the same problem and googles it.
Edit: For clarity, the vagrant machine is not in the host lists for any of the plays, except the special 'all' case.
Ansible aborts execution if every host in the play is unhealthy.
If dev-vagrant1 is the only member of audit_server group, this is the expected behavior (as we see dev-vagrant1 is marked as unreachable).
Nevertheless there should be a line PLAY [audit_server] ******** just before to retry, use...
Ansible folk got back to me and confirmed that they'd been working on a number of issues in this area for the 2.1.1 release.
I updated to 2.1.1.0 and it worked fine.
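For anyone who cannot upgrade straight away, one possible workaround is to keep the powered-off machine out of the `all` plays with a host pattern. This is only a sketch: `dev-vagrant1` is the unreachable host from the recap above, and the `ping` task is a placeholder for the real roles.

```yaml
---
# Sketch only: exclude the powered-off host so it cannot mark the run as failed.
# The pattern must be quoted, because "!" has a special meaning in YAML.
- hosts: "all:!dev-vagrant1"
  gather_facts: no
  tasks:
    - name: Confirm the remaining hosts answer (placeholder for the real roles)
      ping:
```

The same pattern can be passed on the command line with `--limit 'all:!dev-vagrant1'`, which leaves the playbook itself untouched.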
While testing a simple Ansible playbook
---
- hosts: mikrotiks
  connection: network_cli
  gather_facts: no
  vars:
    ansible_network_os: routeros
    ansible_user: admin
  tasks:
    - name: Add Basic FW Rules
      routeros_command:
        commands:
          - /ip firewall nat add chain=srcnat out-interface=ether1 action=masquerade
on my MikroTik router, I ran it with the --check argument:
ansible-playbook -i hosts mikrotik.yml --check
but it seems that tasks actually got executed.
PLAY [mikrotiks] **************************************************************************************************************************************
TASK [Add Basic FW Rules] **************************************************************************************************************************************
changed: [192.168.1.82]
PLAY RECAP **************************************************************************************************************************************
192.168.1.82 : ok=1 changed=1 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
ansible.cfg file is the default configuration after fresh install.
According to the documentation for the command module (Run commands on remote devices running MikroTik RouterOS):
The module always indicates a (changed) status. You can use the changed_when task property to determine whether a command task actually resulted in a change or not.
Since the module is part of the community.routeros collection, I had a short look into the source there and found that it declares support for check mode:
module = AnsibleModule(argument_spec=argument_spec,
                       supports_check_mode=True)
So you will need to follow up by defining for yourself what counts as "changed", for example with changed_when.
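A minimal sketch of what that could look like, assuming the same inventory and connection settings as in the question. The read-only `print` task is an addition for illustration only. `ansible_check_mode` is the standard magic variable that is true when the run was started with `--check`.

```yaml
---
- hosts: mikrotiks
  connection: network_cli
  gather_facts: no
  vars:
    ansible_network_os: routeros
    ansible_user: admin
  tasks:
    - name: Show current NAT rules (illustrative read-only task)
      routeros_command:
        commands:
          - /ip firewall nat print
      # A print command does not modify the router, so never report a change.
      changed_when: false

    - name: Add Basic FW Rules
      routeros_command:
        commands:
          - /ip firewall nat add chain=srcnat out-interface=ether1 action=masquerade
      # Skip the write entirely when the playbook runs with --check.
      when: not ansible_check_mode
```

With this guard the second task is reported as skipped under `--check`, so a dry run cannot add the rule.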
I have a lab that consists of an Ansible Tower system and an Ubuntu Desktop client. I've successfully created and executed some playbooks to update and install packages, and everything was OK. Now I want to fetch /var/log/syslog from the remote Ubuntu desktop to my Ansible Tower system. My playbook is:
---
- hosts: Ubuntu_18.04_Desktops
  tasks:
    - name: Get /var/log/syslog
      fetch:
        src: /var/log/syslog
        dest: /tmp
Running this playbook shows the result:
PLAY [Ubuntu_18.04_Desktops] ***************************************************
TASK [Gathering Facts] *********************************************************
ok: [192.168.1.165]
TASK [Get /var/log/syslog] *****************************************************
changed: [192.168.1.165]
PLAY RECAP *********************************************************************
192.168.1.165 : ok=2 changed=1 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
But no file is present in the /tmp directory of the Tower server.
I've tried using the 'flat' directive and saving the file to my home folder, but without success.
I found the problem - Ansible Tower (AWX in my case) stores fetched files in ansible/awx_task container's filesystem.
Ansible Tower's Job Isolation system hides certain paths from you and redirects them to a safe location.
If you do want to use the system's /tmp, you can open Tower Settings -> Jobs -> add /tmp to paths to expose to isolated jobs.
Note that if you need the security to not expose /tmp to all Tower jobs, you should not do this.
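Separately from where Tower puts the files, it helps to know how fetch lays out its destination. Without `flat`, the file ends up under `<dest>/<hostname>/<src path>`, so with `dest: /tmp` you would look for `/tmp/192.168.1.165/var/log/syslog` rather than `/tmp/syslog`. The sketch below uses `flat: yes`; the `/tmp/fetched` directory is a made-up path for illustration.

```yaml
---
- hosts: Ubuntu_18.04_Desktops
  tasks:
    - name: Get /var/log/syslog into one directory per host
      fetch:
        src: /var/log/syslog
        # Hypothetical destination. With flat: yes and a trailing slash, the
        # file is stored in this directory under its base name (syslog).
        dest: "/tmp/fetched/{{ inventory_hostname }}/"
        flat: yes
```

Keeping `inventory_hostname` in the path avoids files from different hosts overwriting each other once `flat` is enabled.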
I'm trying to run an ansible playbook against multiple hosts that are running containers using the same name. There are 3 hosts each running a container called "web". I'm trying to use the docker connection.
I'm using the typical pattern of a hosts file which works fine for running ansible modules on the host.
- name: Ping
  ping:
- name: Add web container to inventory
  add_host:
    name: web
    ansible_connection: docker
    ansible_docker_extra_args: "-H=tcp://{{ ansible_host }}:2375"
    ansible_user: root
  changed_when: false
- name: Remove old logging directory
  delegate_to: web
  file:
    state: absent
    path: /var/log/old_logs
It only works against the first host in the hosts file
PLAY [all]
TASK [Gathering Facts]
ok: [web1]
ok: [web2]
ok: [web3]
TASK [web-playbook : Ping]
ok: [web1]
ok: [web2]
ok: [web3]
TASK [web-playbook : Add sensor container to inventory]
ok: [web1]
PLAY RECAP
web1 : ok=3 changed=0 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
web2 : ok=3 changed=0 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
web3 : ok=4 changed=0 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
I've tried setting name to web_{{ ansible_host }} to make it unique between hosts but it then tries to connect to web_web1. I've been running the commands using sudo docker exec web rm -rf /var/log/old_logs which of course works, but I'd like to be able to use the ansible modules directly in the docker containers.
The result you get is absolutely expected. Quoting the add_host documentation
This module bypasses the play host loop and only runs once for all the hosts in the play, if you need it to iterate use a with-loop construct.
i.e. you cannot rely on the hosts loop for the add_host and need to make a loop yourself.
Moreover, you definitely need to have different names (i.e. inventory_hostname) for your dynamically created hosts but since all your docker containers have the same name, their ansible_host should be the same.
Assuming all your docker host machines are in the group dockerhosts, the following playbook should do the job. I'm currently not in a situation where I can test this myself so you may have to adjust a bit. Let me know if it helped you and if I have to edit my answer.
Note that even though the add_host task will not naturally loop, I kept the hosts on your original group in the first play so that facts are gathered correctly and correctly populated in the hostvars magic variable
---
- name: Create dynamic inventory for docker containers
  hosts: dockerhosts
  tasks:
    - name: Add web container to inventory
      add_host:
        name: "web_{{ item }}"
        groups:
          - dockercontainers
        ansible_connection: docker
        ansible_host: web
        ansible_docker_extra_args: "-H=tcp://{{ hostvars[item].ansible_host }}:2375"
        ansible_user: root
      loop: "{{ groups['dockerhosts'] }}"
- name: Play needed commands on containers
  hosts: dockercontainers
  tasks:
    - name: Remove old logging directory
      file:
        state: absent
        path: /var/log/old_logs
- hosts: Ebonding
  become: yes
  become_method: sudo
  tasks:
    - name: Clearing cache of Server4
      file: path=/weblogic/bea/user_projects/domains/tmp state=absent
      become: yes
      become_user: wls10
Ansible version 2.0.0.0 runs the above playbook successfully:
PLAY ***************************************************************************
TASK [setup] *******************************************************************
ok: [ggnqinfa2]
TASK [Clearing cache of Server4] ***********************************************
ok: [ggnqinfa2]
PLAY RECAP *********************************************************************
ggnqinfa2 : ok=2 changed=0 unreachable=0 failed=0
But the latest version of Ansible, 2.5.0rc2, fails with the error below:
PLAY [Ebonding] *****************************************************************************************************************************************************
TASK [Gathering Facts] **********************************************************************************************************************************************
ok: [ggnqinfa2]
TASK [Clearing cache of Server4] ************************************************************************************************************************************
fatal: [ggnqinfa2]: FAILED! => {"msg": "Failed to set permissions on the temporary files Ansible needs to create when becoming an unprivileged user (rc: 2, err: chown: /var/tmp/ansible-tmp-1520704924.34-191458796685785/: Not owner\nchown: /var/tmp/ansible-tmp-1520704924.34-191458796685785/file.py: Not owner\n}). For information on working around this, see https://docs.ansible.com/ansible/become.html#becoming-an-unprivileged-user"}
PLAY RECAP **********************************************************************************************************************************************************
ggnqinfa2 : ok=1 changed=0 unreachable=0 failed=1
How can I run this playbook successfully with the latest version of Ansible?
Chances are the user you're using (wls10) does not have write access to the remote temporary directory /var/tmp.
This can be overridden using ansible.cfg and set via remote_tmp to a directory you have write-access to -- or, a "normal temp directory" (like /tmp) that has the sticky bit set.
For more info, see
http://docs.ansible.com/ansible/latest/intro_configuration.html#remote-tmp
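As a rough sketch of that suggestion, the temporary directory can also be set per play through the `ansible_remote_tmp` variable, on releases where the shell plugin honours it (an assumption worth verifying for your version). The `ansible.cfg` equivalent is the `remote_tmp` setting in `[defaults]`. The choice of `/tmp` assumes it has the sticky bit set on the target.

```yaml
---
- hosts: Ebonding
  become: yes
  become_method: sudo
  vars:
    # Assumption: /tmp exists on the target and has the sticky bit set.
    ansible_remote_tmp: /tmp
  tasks:
    - name: Clearing cache of Server4
      file:
        path: /weblogic/bea/user_projects/domains/tmp
        state: absent
      become: yes
      become_user: wls10
```

If the error persists, the page linked in the error message lists the other documented workarounds for becoming an unprivileged user.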
Scenario:
1. I need to run two plays in a single playbook.
2. The second play should run after the first play.
3. The first play creates a few instances and updates the inventory file by adding a new group.
4. The second play uses the updated group and installs a few packages.
Problem: If I run the two plays separately, both succeed.
But I need them in the same script.
I think the problem is that both plays are executing in parallel.
Thanks in advance.
---
- name: ec2
  hosts: localhost
  connection: local
  roles:
    - launchEc2
- hosts: ansible
  gather_facts: False
  become: yes
  roles:
    - python
OUTPUT:
PLAY [ec2] *********************************************************************
TASK [setup] *******************************************************************
ok: [127.0.0.1]
TASK [launchEc2 : include_vars] ************************************************
ok: [127.0.0.1]
TASK [launchEc2 : Launch new ec2 instance] *************************************
changed: [127.0.0.1]
TASK [launchEc2 : Add ec2 ip to the hostgroup] *********************************
changed: [127.0.0.1] => (item={.....})
TASK [launchEc2 : wait for SSh to come up] *************************************
ok: [127.0.0.1] => (item={.....})
PLAY [ansible] *****************************************************************
TASK [python : install python] *************************************************
skipping: [34.203.228.19]
PLAY RECAP *********************************************************************
127.0.0.1 : ok=5 changed=2 unreachable=0 failed=0
34.203.228.19 : ok=0 changed=0 unreachable=0 failed=0
Ansible loads the inventory before it starts processing the playbook.
In your case the second play sees the inventory as it was before the first play modified it.
Generally, when you provision cloud hosts, you may want to use add_host to dynamically add the new hosts to the in-memory inventory, so they are available to subsequent plays.
You may also try calling meta: refresh_inventory after your inventory modification, but I'm not sure whether it works when updating a static inventory.
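A minimal sketch of the add_host approach, under stated assumptions: `new_instance_ips` is a hypothetical variable standing in for whatever the launchEc2 role registers, and the address in it is a documentation placeholder. The group name `ansible` and the `python` role are taken from the question.

```yaml
---
- name: ec2
  hosts: localhost
  connection: local
  gather_facts: False
  vars:
    # Hypothetical stand-in for the addresses returned by the launch task.
    new_instance_ips:
      - 192.0.2.10
  tasks:
    - name: Add each new instance to the in-memory inventory
      add_host:
        name: "{{ item }}"
        groups: ansible
      with_items: "{{ new_instance_ips }}"

- name: Configure the new instances
  hosts: ansible
  gather_facts: False
  become: yes
  roles:
    - python
```

Because add_host changes the inventory held in memory, the second play can target the `ansible` group in the same run without the inventory file being re-read.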