Iterate over inventory facts [duplicate] - ansible

I'm sitting in front of a fairly complex Ansible project that we're using to set up our local development environments (multiple VMs), and there's one role that uses the facts gathered by Ansible to set up the /etc/hosts file on every VM. Unfortunately, when you want to run the playbook for one host only (using the --limit parameter), the facts from the other hosts are (obviously) missing.
Is there a way to force Ansible to gather facts on all hosts, even if you limit the playbook to one specific host?
We tried adding a play to the playbook to gather facts from all hosts, but of course that play also gets limited to the one host given by the --limit parameter. If there were a way to force this play to run on all hosts before the other plays, that would be perfect.
I've googled a bit and found the solution with fact caching via redis, but since our playbook is only used locally, I wanted to avoid the need for additional software. I know it's not a big deal, but I was looking for a "cleaner", Ansible-only solution and was wondering whether one exists.

Ansible version 2 introduced a clean, official way to do this using delegated facts (see: http://docs.ansible.com/ansible/latest/playbooks_delegation.html#delegated-facts).
The condition when: hostvars[item]['ansible_default_ipv4'] is not defined skips hosts whose facts have already been gathered, so facts are only collected where they are missing.
---
# This play will still work as intended if called with --limit "<host>" or --tags "some_tag"
- name: Hostfile generation
  hosts: all
  become: true
  pre_tasks:
    - name: Gather facts from ALL hosts (regardless of limit or tags)
      setup:
      delegate_to: "{{ item }}"
      delegate_facts: True
      when: hostvars[item]['ansible_default_ipv4'] is not defined
      with_items: "{{ groups['all'] }}"
  tasks:
    - template:
        src: "templates/hosts.j2"
        dest: "/etc/hosts"
      tags:
        - hostfile
...
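With this in place, a run like ansible-playbook hostfile.yml --limit app01 (playbook and host names are placeholders) still renders every host into /etc/hosts, because the pre_task has gathered and stored facts for all inventory hosts before the template task runs.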

In general the way to get facts for all hosts even when you don't want to run tasks on all hosts is to do something like this:
- hosts: all
  tasks: [ ]
But as you mentioned, the --limit parameter will limit what hosts this would be applied to.
I don't think there's a way to simply tell Ansible to ignore the --limit parameter on any play. However, there may be another way to do what you want entirely within Ansible.
I haven't used it personally, but as of Ansible 1.8 fact caching is available. In a nutshell, with fact caching enabled, Ansible will use a redis server to cache all the facts about hosts it encounters, and you'll be able to reference them in subsequent playbooks:
With fact caching enabled, it is possible for machines in one group to reference variables about machines in the other group, despite the fact that they have not been communicated with in the current execution of /usr/bin/ansible-playbook.
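For reference, enabling that backend is only a few lines in ansible.cfg; a minimal sketch (the timeout value is arbitrary, and a redis server on the default localhost port is assumed):
[defaults]
# prefer cached facts over re-gathering on every run
gathering = smart
fact_caching = redis
fact_caching_timeout = 86400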

This still seems to be an issue without a clean solution here in 2016, but newer versions of Ansible offer a "jsonfile" fact caching backend, which seems to be a decent compromise compared to installing Redis locally just to address this need. Now I just fire off ansible all -m setup before running a playbook with the --limit option. Good enough for jazz!
http://docs.ansible.com/ansible/playbooks_variables.html#fact-caching
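Spelled out, that workflow looks roughly like this (the cache path and playbook name are placeholders):
# ansible.cfg
[defaults]
gathering = smart
fact_caching = jsonfile
fact_caching_connection = /tmp/ansible_facts
fact_caching_timeout = 86400

# warm the cache for all hosts, then run the limited play
ansible all -m setup
ansible-playbook site.yml --limit myhost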

You could modify your playbook to:
...
- hosts: "{{ limit_hosts | default('default_group') }}"
  tasks:
    ...
...
And when you run it, if limit_hosts is not defined (the normal state), it will run on the default_group inventory group, BUT if you run it as:
ansible-playbook --extra-vars "limit_hosts=myHost" myplaybook.yml
Then it will only run on myHost, but you could still have other plays with different hosts: ... declarations, for fact gathering or anything else.
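For instance, combining this with an initial fact-gathering play (a sketch; group and file names are illustrative):
---
# runs on every host, because the playbook itself is not invoked with --limit
- hosts: all
  tasks: [ ]

# runs only where limit_hosts points, but all facts are already available
- hosts: "{{ limit_hosts | default('default_group') }}"
  tasks:
    - template:
        src: "templates/hosts.j2"
        dest: "/etc/hosts"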

Related

How to run ansible in serial keeping facts between batches?

I have been trying to run a play in serial, 1 host at a time, to manage a rolling upgrade.
The problem I kept running into was that when running in serial, the facts were not kept between batches, making variables such as ansible_default_ipv4.address fail when asking for that variable for the entire group the play was running on.
I have not been able to find anything specific regarding this, but rather stumbled upon similar problems using --limit when running a playbook, so the workaround I ended up trying after a bit of tinkering was fact caching.
More specifically, I let the playbook run two sets of tasks.
The first one runs specifically on all hosts in the group, to gather all facts, which are then cached to file, for the inventory as well.
For some reason though, when moving on to the play that runs in serial, these facts do not seem to be kept.
Does anyone have any ideas on how to solve or do a work around for this?
I believe it would also help with another related issue: run_once runs once per batch, whereas in this case it would need to run once per ... run(?).
Not using serial made the play work as expected.
ansible.cfg
[defaults]
fact_caching = jsonfile
fact_caching_connection = /tmp/ansiblecache
fact_caching_timeout = 3600
[inventory]
cache=True
cache_plugin = jsonfile
playbook.yml
---
- name: installation & configuration | gather facts
  hosts: cluster
  become: yes
  gather_facts: true

- name: installation & configuration
  hosts: cluster
  become: yes
  serial: "{{ '1' if upgrade else '100%' }}"
  gather_facts: false
  roles:
    - { role: myrole }
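One variation that might be worth testing (a sketch only, untested here, borrowing the delegated-facts trick from the first answer above) is to re-populate facts for the whole group inside the serial play, so every batch sees them:
- name: installation & configuration
  hosts: cluster
  become: yes
  serial: "{{ '1' if upgrade else '100%' }}"
  gather_facts: false
  pre_tasks:
    # runs at the start of each batch and stores facts for all cluster hosts
    - name: gather facts from ALL cluster hosts
      setup:
      delegate_to: "{{ item }}"
      delegate_facts: true
      with_items: "{{ groups['cluster'] }}"
  roles:
    - { role: myrole }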

how to set different python interpreters for local and remote hosts

Use-Case:
Playbook 1
when we first connect to the remote host(s), they will already have some Python version installed - the interpreter auto-discovery feature will find it
now we install ansible-docker on the remote host
from this point on, the ansible-docker docs suggest using ansible_python_interpreter=/usr/bin/env python-docker
Playbook 2
We connect to the same host(s) again, but now we must use the /usr/bin/env python-docker interpreter
What is the best way to do this?
Currently we set ansible_python_interpreter on the playbook level of Playbook 2:
---
- name: DaqMon app
  vars:
    - ansible_python_interpreter: "{{ '/usr/bin/env python-docker' }}"
This works, but it also changes the Python interpreter of the local actions, and thus the local actions fail, because python-docker does not exist locally.
The current workaround is to explicitly specify the ansible_python_interpreter on every local action, which is tedious and error-prone.
Questions:
the ideal solution would be to add '/usr/bin/env python-docker' as a fallback to interpreter_python_fallback - but I think this is not possible
is there a way to set the python interpreter only for the remote hosts - and keep the default for the localhost?
or is it possible to explicitly override the python interpreter for the local host?
You should set the ansible_python_interpreter on the host level.
So yes, it's possible to explicitly set the interpreter for localhost in your inventory.
localhost ansible_connection=local ansible_python_interpreter=/usr/bin/python
And I assume that you could also use set_fact on hostvars[<host>].ansible_python_interpreter on your localhost or docker host.
There is a brilliant article about set_fact on hostvars! ;-P
Thanks to the other useful answers I found an easy solution:
on the playbook level we set the python interpreter to /usr/bin/env python-docker
then we use a set_fact task to override the interpreter for localhost only
we must also delegate the facts
we can use the magic ansible_playbook_python variable, which refers to the python interpreter that was used on the (local) Ansible host to start the playbook: see Ansible docs
Here are the important parts at the start of Playbook 2:
---
- name: Playbook 2
  vars:
    - ansible_python_interpreter: "{{ '/usr/bin/env python-docker' }}"
  ...
  tasks:
    - set_fact:
        ansible_python_interpreter: '{{ ansible_playbook_python }}'
      delegate_to: localhost
      delegate_facts: true
Try to use set_fact for ansible_python_interpreter at host level in the first playbook.
Globally, use the interpreter_python key in the [defaults] section of the ansible.cfg file:
[defaults]
interpreter_python = auto_silent

Ansible - vars are not correctly propagated to handlers when role run in loop

I am asking for help with a problem of deploying multiple versions (with different variables) of an app using the same role, run from one playbook.
We have an app with multiple product families, which are different code versions. Each version has a separate uWSGI vassal config and Nginx virtual host config (/api/v2, /api/v3, ...).
The desired state would be to run playbook and configure the server with all versions specified.
Sadly, Ansible's import_role/import_tasks can't be used with with_items, so include_role/include_tasks must be used (a pity, because they do not honor role tags).
The include_role method would not be the biggest problem, but we use handlers to notify a uWSGI touch-reload (on a code change, link change, virtualenv change, app config change, ...).
But when using a loop (with_items), the variables passed from the loop do not propagate correctly to handlers.
I tried this scenarios
playbook.yml - with_items loop inside the playbook
PROBLEM: Handler is run only for the first iteration of the loop.
#!/usr/bin/env ansible-playbook
# Handler is run only once, from the first notifier
- hosts: localhost
  gather_facts: no
  vars:
    app_root: "/tmp/test_ansible"
    app_versions:
      - app_product_family: 1
        app_release: "v1.0.2"
      - app_product_family: 3
        app_release: "v4.0.7"
  tasks:
    - name: Deploy multiple versions of app
      include_role:
        name: app
      with_items: "{{ app_versions }}"
      loop_control:
        loop_var: app_version
      vars:
        app_product_family: "{{ app_version.app_product_family }}"
        app_release: "{{ app_version.app_release }}"
      tags:
        - app
        - app_debug
playbook_v2.yml - with_items loop inside role task
PROBLEM: Handler is run with the default value from "Defaults"
#!/usr/bin/env ansible-playbook
- hosts: localhost
  gather_facts: no
  roles:
    - app_v2
  vars:
    app_v2_root: "/tmp/test_ansible_v2"
    app_v2_versions:
      - app_v2_product_family: 1
        app_v2_release: "v1.0.2"
      - app_v2_product_family: 3
        app_v2_release: "v4.0.7"
Tasks file roles/app_v2/tasks/main.yml
---
# Workaround because import_tasks can't be run with_items
- include_tasks: deploy.yml
  when: app_v2_versions
  with_items: "{{ app_v2_versions }}"
  loop_control:
    loop_var: app_v2
  vars:
    app_v2_product_family: "{{ app_v2.app_v2_product_family }}"
    app_v2_release: "{{ app_v2.app_v2_release }}"
  tags:
    - app_v2
    - app_v2_deploy
...
One idea was to write a separate role for each product family, but they share nginx and uWSGI, so it would mean lots of copy-pasting and shared tasks (and tags would not work properly).
For now, I solved it with a shell script wrapper, but this is not an ideal solution and does not work from Ansible Tower.
Sample repo with tasks to reproduce problem (tested with ansible 2.4, 2.5, 2.6)
Any ideas & recommendations are very welcome.
The order of overrides for variables is broken for includes in Ansible. For example, even a set_fact in the included role will be shadowed by role defaults.
See this bug: https://github.com/ansible/ansible/issues/22025
It's closed but not fixed. My advice: use includes and variables really carefully.
In practice I never use role includes with a loop. If you need a loop, include a task list in the loop (and that task list, in turn, may import_role).
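A sketch of that pattern (file and variable names are made up for illustration):
# playbook.yml
- hosts: localhost
  tasks:
    - include_tasks: deploy_one_version.yml
      with_items: "{{ app_versions }}"
      loop_control:
        loop_var: app_version

# deploy_one_version.yml
- import_role:
    name: app
  vars:
    app_product_family: "{{ app_version.app_product_family }}"
    app_release: "{{ app_version.app_release }}"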
OK, it is a bug, as George Shuklin posted.
I will use my shell wrapper, which reads the group_vars YAML and then runs the playbook multiple times, according to the length of the variable list.
Sadly I hit multiple annoying bugs in Ansible in the last few weeks, kinda losing my trust in it ):
And probably everybody is using microservices and kubernetes, so need to speed up our migration (:

Ansible group variable evaluation with local actions

I have an Ansible playbook that includes a role for creating some Azure cloud resources. Group variables are used to set parameters for the creation of those resources. An inventory file contains multiple groups which reference that play as a descendant node.
The problem is that since the target is localhost for running the cloud actions, all the group variables are picked up at once. Here is the inventory:
[cloud:children]
cloud_instance_a
cloud_instance_b
[cloud_instance_a:children]
azure_infrastructure
[cloud_instance_b:children]
azure_infrastructure
[azure_infrastructure]
127.0.0.1 ansible_connection=local ansible_python_interpreter=python
The playbook contains an azure_infrastructure play that references the actual role to be run.
What happens is that this role is run twice against localhost, but each time the group variables from cloud_instance_a and cloud_instance_b have both been loaded. I want it to run twice, but with cloud_instance_a variables loaded the first time, and cloud_instance_b variables loaded the second.
Is there any way to do this? In essence, I'm looking for a pseudo-host for localhost that makes Ansible think these are different targets. The only way I've been able to work around this is to create two different inventories.
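One way to get such a pseudo-host without maintaining two inventories might be inventory aliases, where each group gets its own entry that still resolves to localhost, so each alias picks up only its own group's variables (a sketch, untested against this setup):
[cloud_instance_a]
azure_a ansible_host=127.0.0.1 ansible_connection=local

[cloud_instance_b]
azure_b ansible_host=127.0.0.1 ansible_connection=local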
It's a bit hard to guess what your playbook looks like, but anyway...
Keep in mind that inventory host/group variables are host-bound, so any host always has only one set of inventory variables (variables defined in different groups overwrite each other).
If you want to execute some tasks or plays on your control machine, you can use connection: local for plays or local_action: for tasks.
For example, for this hosts file:
[group1]
server1
[group2]
server2
[group1:vars]
testvar=aaa
[group2:vars]
testvar=zzz
You can do this:
- hosts: group1:group2
  connection: local
  tasks:
    - name: provision
      azure: ...

- hosts: group1:group2
  tasks:
    - name: install things
      apk: ...
Or this:
- hosts: group1:group2
  gather_facts: no
  tasks:
    - name: provision
      local_action: azure ...

    - name: gather facts
      setup:

    - name: install things
      apk:
In these examples, testvar=aaa for server1 and testvar=zzz for server2.
Still, the azure action is executed from the control host.
In the second example you should turn off fact gathering and call setup manually, to prevent Ansible from connecting to possibly unprovisioned servers.

Can I force current hosts group to be identified as another in a playbook include?

The current case is this:
I have a playbook which provisions a bunch of servers and installs apps to these servers.
One of these apps already has its own Ansible playbook which I wanted to use. Now my problem arises from this playbook, as it's limited to hosts: [prod], and the host groups I have in the upper-level playbook are different.
I know I could just use add_host to add the needed hosts to a prod group, but that is a solution which I don't like.
So my question is: Is there a way to add the current hosts to a new host group in the include statement?
Something like - include: foo.yml prod={{ ansible_host_group }}
Or can I somehow include only the tasks from a playbook?
No, there's no direct way to do this.
Now my problem arises from this playbook, as it's limited to
hosts: [prod]
You can set up hosts more flexibly via extra vars:
- name: add role fail2ban
  hosts: '{{ target }}'
  remote_user: root
  roles:
    - fail2ban
Run it:
ansible-playbook testplaybook.yml --extra-vars "target=10.0.190.123"
ansible-playbook testplaybook.yml --extra-vars "target=webservers"
Is this workaround suitable for you?
