Background: I want to write an Ansible playbook that works across different operating systems. One of the basic things it does is install packages using the package module (newly introduced in Ansible 2.0).
The problem is that the list of packages differs per OS. I would like to specify the common packages in one place, and keep the Ubuntu-specific and CentOS-specific ones separate. Within CentOS, CentOS 6 and CentOS 7 again share some packages and differ on others, so I would also like the CentOS-common ones in one place and the CentOS 6 and CentOS 7 specific ones somewhere else.
The immediate approach that comes to mind is group_vars; however, the way I know it wouldn't work. Suppose under group_vars I have the files all, ubuntu, centos, centos6 and centos7 to define variables; the precedence order between them is nondeterministic. A CentOS 6 host is in the groups all, centos and centos6, and there is no way to specify that centos6 should take precedence over centos, and centos over all.
Another way I found was to set hash_behaviour = merge in ansible.cfg; however, that modifies the behavior globally, which is overkill. There may be cases where the default behavior is wanted, so modifying it globally is not a nice approach.
How should I organize everything in a clean way, and share and re-use common things as much as possible?
OS condition inside the role
One common way that I use is to define OS-dedicated default vars files inside the vars directory of the role and then include them at the beginning of the main.yml tasks file. Ansible identifies the current distribution automatically via {{ ansible_distribution }} and {{ ansible_distribution_version }}, so I just have to name the OS-dedicated yml files accordingly.
Role dir tree:
my_role/
├── handlers
│   ├── main.yml
│   └── Ubuntu14.04.yml
├── tasks
│   ├── main.yml
│   └── Ubuntu14.04.yml
├── templates
│   └── Ubuntu14_04.config.j2
└── vars
    └── Ubuntu14.04.yml
In main.yml you first include the default OS-specific variables and then run the OS-specific tasks:
tasks/main.yml:
---
- name: include distribution specific vars
  include_vars: "{{ ansible_distribution }}{{ ansible_distribution_version }}.yml"

- name: include distribution specific install
  include: "{{ ansible_distribution }}{{ ansible_distribution_version }}.yml"

- name: Do general stuff
  ...
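If not every distribution has a dedicated file, the same include can be written with a fallback. This is a sketch using the first_found lookup, where default.yml is a hypothetical catch-all vars file:

- name: include distribution specific vars, with a fallback
  include_vars: "{{ lookup('first_found', params) }}"
  vars:
    params:
      files:
        - "{{ ansible_distribution }}{{ ansible_distribution_version }}.yml"
        - default.yml
      paths:
        - vars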
Now you can easily define your packages inside the vars file for each distribution, e.g., Ubuntu14.04:
vars/Ubuntu14.04.yml:
---
my_role_packages:
  - a
  - b
And finally install them in your distribution-specific setup task.
tasks/Ubuntu14.04.yml:
---
- name: Install packages for Ubuntu14.04
  apt:
    name: "{{ item }}"
    state: present
    update_cache: yes
  with_items: "{{ my_role_packages }}"
You can now also easily define OS-specific handlers and templates, e.g.:
handlers/main.yml:
---
- name: distribution specific handler
  include: "{{ ansible_distribution }}{{ ansible_distribution_version }}.yml"
Hash Merging
About globally setting hash_behaviour=merge, here is a quote from the official Ansible documentation:
Some users prefer that variables that are hashes (aka ‘dictionaries’ in Python terms) are merged. This setting is called ‘merge’. We generally recommend not using this setting unless you think you have an absolute need for it, and playbooks in the official examples repos do not use this setting.
I originally come from the SaltStack world, where I was used to merging hashes from my defaults map.jinja with the dedicated pillars. In Ansible I came to rely on variable prefixes instead, so rather than
nginx:
  pkg: "nginx"
  repo: "deb http://nginx.org/packages/ubuntu/ trusty nginx"
I'd write
nginx_pkg: "nginx"
nginx_repo: "deb http://nginx.org/packages/ubuntu/ trusty nginx"
in order to avoid accidentally overwriting hashes when going up the variable hierarchy. If in some cases you still prefer merging, you can use the Jinja2 combine filter in your variable files: dict_a: {{ dict_b|combine(dict_c) }}.
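For example, a minimal sketch of such a merge in a vars file (nginx_defaults and nginx_overrides are illustrative names):

nginx_defaults:
  pkg: "nginx"
  repo: "deb http://nginx.org/packages/ubuntu/ trusty nginx"
nginx_overrides:
  pkg: "nginx-extras"
# pkg comes from the overrides, repo is kept from the defaults
nginx_final: "{{ nginx_defaults | combine(nginx_overrides) }}"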
Variable grouping
The Ansible documentation puts a lot of emphasis on using group_vars heavily, and I found that to be good advice. A general approach here is to define my groups in my inventory files, like:
hosts/test:
[app]
app-test-a
app-test-b
[app_test:children]
app
hosts/live:
[app]
app-live-a
app-live-b
[app_live:children]
app
Now I can easily use group_vars to include variables based on the defined groups:
group_vars/
├── all.yml
├── app_test.yml
├── app_live.yml
└── app.yml
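As a sketch of how variables might be split across these files (the variable names are illustrative), the environment-specific files carry what differs per environment, and group vars take precedence over the all group:

group_vars/all.yml:
app_log_level: warning

group_vars/app.yml:
app_port: 8080

group_vars/app_test.yml:
app_db_host: db.test.example.com

group_vars/app_live.yml:
app_db_host: db.live.example.com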
Ansible Galaxy and DebObs
I also recommend checking out Ansible Galaxy roles. They are always a good starting point to get some ideas (including this one). Another good source of inspiration is the DebObs repository.
Related
I have the following directory structure:
├── ansible.cfg
├── hosts.yml
├── playbook.yml
├── group_vars
│   ├── all.yml
│   └── vm_dns.yml
└── roles
    └── pihole
        ├── handlers
        │   └── main.yml
        └── tasks
            └── main.yml
In ansible.cfg I simply have:
[defaults]
inventory = ./hosts.yml
In group_vars/all.yml I have some generic settings:
---
aptcachetime: 3600
locale: "en_GB.UTF-8"
timezone: "Europe/Paris"
And in hosts.yml I setup my PiHole VMs:
---
all:
  vars:
    ansible_python_interpreter: /usr/bin/python3

vm_dns:
  vars:
    dns_server: true
  hosts:
    vmb-dns:
      pihole:
        dns:
          - "185.228.168.10"
          - "185.228.169.11"
        network:
          ipv4: "192.168.2.4/24"
          interface: eth0
    vmk-dns:
      pihole:
        dns:
          - "185.228.168.10"
          - "185.228.169.11"
        network:
          ipv4: "192.168.3.4/24"
          interface: eth0
At this point, I've not attempted to move any vars to group_vars, and everything works.
Now, I felt I could make the hosts file more readable by breaking out the settings that are the same for all vm_dns hosts into a group_vars file. So I removed all the dns and interface lines from hosts.yml and put them in a group_vars/vm_dns.yml file, like this:
---
pihole:
  dns:
    - "185.228.168.10"
    - "185.228.169.11"
  network:
    interface: eth0
At this point, hosts.yml contains:
---
all:
  vars:
    ansible_python_interpreter: /usr/bin/python3

vm_dns:
  vars:
    dns_server: true
  hosts:
    vmb-dns:
      pihole:
        network:
          ipv4: "192.168.2.4/24"
    vmk-dns:
      pihole:
        network:
          ipv4: "192.168.3.4/24"
But when I now run the playbook, as soon as it hits a task that uses one of the vars that were moved from hosts.yml to group_vars/vm_dns.yml, Ansible fails with AnsibleUndefinedVariable: dict object has no attribute ....
I'm not really sure whether I'm simply misunderstanding the "Ansible way", or whether what I'm trying to do (essentially splitting parts of the same variable across hosts and group_vars, I suppose) is just not doable. I thought the "flattening" that Ansible does was supposed to handle this, but it seems Ansible is not incorporating the vars defined in group_vars/vm_dns.yml at all.
I've read the docs on the subject and found some almost-related posts, but none demonstrating YAML-formatted lists used across hosts and group_vars in this manner.
Edit: other SO or GitHub issues that are actually related to this question:
In Ansible, how to combine variables from separate files into one array?
https://github.com/ansible/ansible/issues/58120
https://docs.ansible.com/ansible/latest/reference_appendices/config.html#default-hash-behaviour
Since you keep a definition for the pihole var in your inventory at host level, that one wins by default and replaces the previous definition at group level (see the variable precedence documentation). So when you later try to access e.g. pihole.dns or pihole.network.interface, those mappings no longer exist and Ansible raises the above error.
This is the default behavior in Ansible: the definition with the highest precedence replaces any earlier one. You can change this behavior for dicts by setting hash_behaviour=merge in ansible.cfg.
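For completeness, this is what the setting looks like in ansible.cfg (shown only for illustration; see the caveats that follow before using it):

[defaults]
hash_behaviour = merge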
My personal experimentation with this setting was not really satisfactory: it behaved correctly with my own playbooks/roles that were written specifically for it, but started to produce hard-to-trace bugs when including third-party contributions (playbook snippets, roles, custom modules...). So I definitely don't recommend it. Moreover, this option was deprecated in Ansible 2.10 and will therefore be removed in Ansible 2.14. If you still want to use it, limit its scope as narrowly as possible and certainly don't set it globally (i.e. surely not in /etc/ansible/ansible.cfg).
What I generally use nowadays to solve this kind of problem:
Define a variable for each host/group/whatever containing only the specific information. In your case, for your host:
---
pihole_host:
  network:
    ipv4: "192.168.2.4/24"
Define the defaults for those settings somewhere. In your case, for your group:
---
pihole_defaults:
  dns:
    - "185.228.168.10"
    - "185.228.169.11"
  network:
    interface: eth0
(Note that you can define those defaults at different levels, taking advantage of the above order of precedence for vars.)
At a global level (I generally put this in group_vars/all.yml), define the var that will be the combination of defaults and specifics, making sure each part defaults to empty:
---
# Calculate pihole from group defaults and host specifics
pihole: >-
  {{
    (pihole_defaults | default({}))
    | combine((pihole_host | default({})), recursive=true)
  }}
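With this in place, hosts and groups only ever define pihole_host and pihole_defaults, while tasks keep reading the combined pihole variable as before, e.g. a quick check:

- name: Show the merged pihole settings
  debug:
    var: pihole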
I know you can switch between different inventory files using the -i flag, which lets you target different hosts.
In my case, the hosts to be acted upon change between deployments, so I take the hosts in as --extra-vars and use delegate_to to deploy to that host (see below for more details).
I was hoping for a way to switch between files containing environment variables in a similar fashion. For example, let's say I have the following simplified directory structure:
/etc/ansible/
├── ansible.cfg
├── hosts
└── project/
    └── environments/
        ├── dev/
        │   └── vars.yml
        └── prd/
            └── vars.yml
The structure of vars.yml in both environments would be exactly the same, just with the variables having different values due to the differences between environments.
I've found a few places that talk about doing something similar such as these:
https://rock-it.pl/managing-multiple-environments-with-ansible-best-practices/
http://rosstuck.com/multistage-environments-with-ansible
http://www.geedew.com/setting-up-ansible-for-multiple-environment-deployments/
In those guides, they act against statically declared hosts. One thing that seems like it would help me is the directories called group_vars: it looks like each file in there maps to the inventory group with the same name, and those variables are presumably applied when the hosts: directive of a play contains host(s) from that group.
However, since I read the target servers in dynamically via the CLI flag --extra-vars, I can't take that approach, because my plays always look something like this:
...
hosts: localhost
tasks:
  ...
  - name: do something
    ...
    delegate_to: "{{ item }}"
    with_items: "{{ given_hosts }}"
Or I run a task first that takes the servers and adds them to a new group, like this:
- name: Extract Hosts
  hosts: localhost
  tasks:
    - name: Adding given hosts to new group...
      add_host:
        name: "{{ item }}"
        groups: some_group
      with_items:
        - "{{ list_of_hosts | default([]) }}"
and then use the dynamically created group:
- name: Restart Tomcat for Changes to Take Effect
  hosts: some_group
  tasks:
    - name: Restarting Tomcat...
      service:
        name: tomcat
        state: restarted
So I need a way to specify which vars.yml to use. Because I kick off the Ansible playbook from Jenkins via CLI over SSH, I was hoping for something like the following:
ansible-playbook /path/to/some/playbook.yml --include-vars /etc/ansible/project/dev/vars.yml
At the least, how would I explicitly include a vars.yml file in a playbook to use the variables defined within?
You can use:
extra vars with @: --extra-vars @/etc/ansible/project/dev/vars.yml
or
include_vars:
- include_vars: "/etc/ansible/project/{{ some_env }}/vars.yml"
to load different variables depending on your environment.
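As a minimal sketch of the include_vars route, assuming some_env is supplied on the command line (e.g. ansible-playbook playbook.yml -e some_env=dev) and that the vars file defines a variable called app_port (a hypothetical name):

- hosts: localhost
  tasks:
    - name: Load environment specific variables
      include_vars: "/etc/ansible/project/{{ some_env }}/vars.yml"

    - name: Use one of the loaded variables
      debug:
        var: app_port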
I have a playbook organized as follows (simplified for the sake of this question):
├── deploy.yml
├── hosts
├── requirements.yml
├── roles
│   └── web
│       ├── meta
│       │   └── main.yml
│       └── tasks
│           └── main.yml
└── site.retry
My simplified deploy.yml is:
---
- name: Everything I need
  hosts: somewhere
  roles:
    - web
And my simplified roles/web/tasks/main.yml is
---
- name: Various things that work
  become: yes
  [whatever]

- name: the thing that I have a problem with
  become: yes
  davidedelvento.nbextension: name=foo state=present
This fails with:
ERROR! no action detected in task. This often indicates a misspelled module name, or incorrect module path.
So I tried to change roles/web/tasks/main.yml to
---
- name: Various things that work
  become: yes
  [whatever]

- name: the thing that I have a problem with
  become: yes
  roles:
    - { role: davidedelvento.nbextension, name: foo, state: present }
which fails in the same way. I understand the failure (I'm calling a role from within a task, which isn't possible -- though the error could be clearer...).
However, I'm not clear on how to accomplish what I'd like, namely running whatever nbextension does at that point in the play. I could move that role from roles/web/tasks/main.yml to roles/web/meta/main.yml, and that works, but then it is executed before the Various things that work, and I need it executed after. How can I accomplish that?
Note that I wrote nbextension, but the same problem happens with other similar roles from the Galaxy.
EDIT: Note also that the extension is correctly installed and can be used from a standalone, single-file playbook such as
---
- name: Example
  hosts: all
  become: yes
  roles:
    - { role: davidedelvento.nbextension, name: foo, state: present }
however I need it to integrate into the larger project described above for the web role (I have more roles that I'm not showing).
EDIT2: note that the Galaxy role used for this question has since been renamed to jupyterextension, but as I said, the issue (and solution) is the same for any role.
Ok, so I've found two ways to deal with this issue.
Split the role into two (or more) parts and make the Galaxy role a dependency of the part that has to run after it. In general I like this idea, but in my particular use case I don't, since I would need to create three roles for something that is really one.
Use the include_role module, with the caveat that at the moment it is "flagged as preview", i.e. it is not guaranteed to keep a backwards-compatible interface. However, it works quite well for my current setup:
- name: the thing that I have not a problem with anymore
  become: yes
  include_role:
    name: davidedelvento.nbextension
  with_items:
    - foo
    - bar
  loop_control:
    loop_var: name
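Note the loop_control/loop_var pair: it renames the loop variable from the default item to name, which is how each list entry ends up as the name parameter the role expects. On Ansible 2.5 and later, the same sketch can use loop: instead of with_items:.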
I'm designing a kind of playbook library with individual tasks, so in the usual roles repo I have something like:
roles
├── common
│   └── tasks
│       ├── A.yml
│       ├── B.yml
│       ├── C.yml
│       ├── D.yml
│       ├── login.yml
│       ├── logout.yml
│       └── save.yml
├── custom_stuff_workflow
│   └── tasks
│       └── main.yml
└── other_stuff_workflow
    └── tasks
        └── main.yml
My main.yml in custom_stuff_workflow then contains something like:
---
- include: login.yml
- include: A.yml
- include: C.yml
- include: save.yml
- include: logout.yml
and this one in the other workflow:
---
- include: login.yml
- include: B.yml
- include: A.yml
- include: D.yml
- include: save.yml
- include: logout.yml
I can't find a natural way to do this:
One way that worked was to keep all the tasks in a single role and tag the relevant ones, selecting tags when including custom_stuff_workflow.
The problem with that is that tags cannot be set in the calling playbook: they can only be set on the command line.
Since I'm distributing this Ansible repo to many people in the company, I can't rely on command-line invocations (it would be nice to have a #! header in yml to be processed by the ansible-playbook command).
I could also copy the relevant tasks (inside common in the above tree) into each workflow, but I don't want to repeat them.
Can someone see a solution to achieve what I'd like without repeating the tasks over different roles?
I guess the cornerstone of my problem is that I define tasks as individual components, and that looks unnatural in Ansible...
Thanks a lot
PS: note that the tasks in a workflow have to run in a specific order, and the only natural steps to abstract would be login and save/logout.
PPS: I've seen this question: How do I call a role from within another role in Ansible? But it does not solve my problem, as it invokes a full role rather than a subset of the tasks in a role.
Just in case someone else bumps into this, version 2.2 of Ansible now has include_role. You can now do something like this:
---
- name: do something
  include_role:
    name: common
    tasks_from: login
Check out the documentation here.
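Applied to the workflow from the question, custom_stuff_workflow/tasks/main.yml could then reuse the task files from the common role directly (a sketch; the inline flow-mapping form is just to keep it short):

---
- include_role: { name: common, tasks_from: login }
- include_role: { name: common, tasks_from: A }
- include_role: { name: common, tasks_from: C }
- include_role: { name: common, tasks_from: save }
- include_role: { name: common, tasks_from: logout }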
Yes, Ansible doesn't really like tasks as individual components. I think it wants you to use roles, but I can see why you wouldn't want to use roles for simple, reusable tasks.
I currently see two possible solutions:
1. Make those task-files into roles and use dependencies
Then you could do something like this in e.g. custom_stuff_workflow (in its meta/main.yml):
dependencies:
  - { role: login }
See: https://docs.ansible.com/playbooks_roles.html#role-dependencies
2. Use include with "hardcoded" paths to the task files
- include: ../../common/tasks/login.yml
That worked pretty well in a short test playbook I just did. Keep in mind, you can also use parameters etc. in those includes.
See: http://docs.ansible.com/ansible/latest/playbooks_reuse.html
I hope I understood that question correctly and this helps.
Using include_role: with the tasks_from option is a good idea. However, this still includes parts of the role: for example, it loads role vars and meta dependencies. If apply is used to apply tags to an included file, the same tags are applied to the meta dependencies. Also, Ansible lists the included role's name in its output, which is confusing.
It is possible to dynamically locate a role and include a file from it using the first_found lookup. One can find the role path by searching DEFAULT_ROLES_PATH and load a file from its tasks folder. Ansible uses the same variable when searching for a role, so as long as the role is in a path Ansible can find, the file will be loaded.
This method is as dynamic as using include_role with the tasks_from option.
Example:
- name: Include my_tasks.yml from my_ansible_role
  include_tasks: "{{ lookup('first_found', params) }}"
  vars:
    params:
      files: my_ansible_role/tasks/my_tasks.yml
      paths: "{{ lookup('config', 'DEFAULT_ROLES_PATH') }}"
You can use the built-in variable playbook_dir to include tasks from other roles.
- name: "Create SSL certificates."
include_tasks: "{{ playbook_dir }}/roles/common/tasks/ssl.yml"
with_items: "{{ domains }}"
The ssl.yml lives in the common role. It can contain any number of tasks, and passing variables as in the example is possible.
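For illustration, a sketch of what roles/common/tasks/ssl.yml might contain; the certbot command is an assumption, the point being that each domain from the loop arrives in the file as item:

- name: Generate a certificate for {{ item }}
  command: certbot certonly --standalone -d {{ item }}
  args:
    creates: /etc/letsencrypt/live/{{ item }}/fullchain.pem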
I've been playing around with Ansible (I'm very new to configuration management in general; it's not what I do for my day job), and I'm trying to figure out the best way to pattern my conditional tasks.
I've structured my setup as a git repository with the following layout:
├── dotfiles (...)
├── bin (...)
└── playbooks
    ├── bootstrap.yml
    ├── firewall.yml
    ├── ids.yml
    ├── logserver.yml
    ├── pvr.yml
    ├── site.yml
    ├── log
    └── roles
        ├── bootstrap
        │   ├── defaults
        │   │   └── main.yml
        │   └── tasks
        │       ├── apt-update.yml
        │       ├── apt-upgrade.yml
        │       ├── homebrew-user.yml
        │       ├── homebrew-install.yml
        │       ├── homebrew-update.yml
        │       ├── homebrew-upgrade.yml
        │       ├── main.yml
        │       ├── tmux-install.yml
        │       └── vim-install.yml
        ├── elk (...)
        ├── snort (...)
        ├── tor (...)
        └── zentyal (...)
For most scenarios, I've got a playbook for each system that I'm using to build out my roles. I run the playbooks locally on each host (on my firewall system, I log in, update the git repository, and run ansible-playbook bootstrap.yml and ansible-playbook firewall.yml). My intent is to set up a real inventory in the future so that I can run specific roles against specific hosts, but for now, while I'm learning, this works well.
Most of the roles are platform specific and require sudo, and that's fine and works well.
However, my bootstrap.yml playbook needs to run in a variety of environments:
sometimes install homebrew on an osx system using sudo (multi-user environment)
sometimes install homebrew on an osx system without sudo (single-user environment)
sometimes install additional homebrew applications in a specific location (single-user desktop system)
sometimes install additional homebrew applications in a specific location (multiuser desktop system)
sometimes install specific applications system-wide using package control on Ubuntu/BSD systems
sometimes install specific applications using local build scripts on Ubuntu/BSD systems in $HOME
Depending on the environment, there are up to 100 or so applications I'd like to bootstrap - all the various utilities that I use for daily things: copying over my dotfiles, installing/updating bash/zsh, git, git-flow, hub, vim, tmux, mosh, weechat, base16, source-highlight, netcat, nmap, setting up pip, virtualenv, virtualenvwrapper, and about 100 other things.
As a single example, here is a task for installing tmux in multiple environments (in Python-like pseudocode):
if ansible_distribution == 'MacOSX':
    if sudo_user is not None:
        homebrew: name=tmux state=present sudo_user='homebrew'
    else:
        homebrew: name=tmux state=present
elif os == 'Ubuntu':
    if sudo_user is not None:
        apt_repository: name='ppa:pi-rho' state=present
        apt: name=tmux state=present sudo_user='root'
    else:
        shell: tmux-ubuntu-build-local.sh
elif os == 'BSD':
    shell: tmux-bsd-build-local.sh
But Ansible doesn't seem to have basic if/else constructs available - just the when construct. So I started using when to write conditional tasks (included by ./playbooks/roles/bootstrap/main.yml):
---
# ./playbooks/roles/bootstrap/tmux.yml

# this always installs without sudo, which works on my laptop
# but fails on shared systems that use a 'homebrew' user
- name: install
  when: ansible_distribution == 'MacOSX'
  sudo: no
  homebrew: name=tmux

# I only want to run this when I *want* to install tmux globally
# on some systems I just want to run a build script
# if I don't have root and visudo isn't properly configured, this fails
- name: add pi-rho/dev ppa
  when: ansible_distribution == 'Ubuntu'
  sudo: yes
  apt_repository:
    repo: 'ppa:pi-rho/dev'
    state: present

# again, I don't always want to do this
- name: update cache
  when: ansible_distribution == 'Ubuntu'
  sudo: yes
  apt: update_cache=yes

# this always attempts to install using sudo, which I don't want.
# on some systems I want to run a build script that installs tmux
# to the user's home directory instead
- name: install
  when: ansible_distribution == 'Ubuntu'
  sudo: yes
  apt: name=tmux

# this installs a set of tmux plugins without sudo to my $HOME
# directory, I do always want this
- name: install plugins
  sudo: no
  git:
    repo: '{{ item.repo }}'
    dest: '{{ tmux_plugins_path }}/plugins/{{ item.name }}'
  with_items: "{{ tmux_plugins }}"
This approach works on a couple of my systems, but fails (as the comments note) in all but the most basic cases. It also scales very poorly, as I don't see a way to "reuse" any of the tasks or to use multiple/chained when expressions with a single task.
I've started playing with several other solutions I've read about:
Pass in a flag using -e, for example use_sudo; but again, without "task reuse" I'm going to have to write individual tasks for each application/library to account for the flag. And if I also did this with the operating system (-e os=macosx), it scales even worse.
Tag all my tasks at the role level - this required me to split tasks into separate roles by OS rather than by "topic", which was counter-intuitive and difficult to work with.
Tag all my tasks at the task level - again, without task reuse, this seems to scale very badly.
I've considered looking at plugins/modules to see if I can write something in Python to enable if/else constructs in a single task, or "nested" tasks; if I can't figure out something simpler, that's probably what I'll look into next.
So, am I doing it wrong? Any suggestions?
You can use Conditional Imports or "Applying 'when' to roles and includes", both described in http://docs.ansible.com/ansible/latest/playbooks_conditionals.html.
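As a minimal sketch combining both ideas with the use_sudo flag the question already passes via -e (the per-distribution vars file layout is an assumption):

- hosts: all
  # Conditional import: the vars file is picked per distribution
  vars_files:
    - "vars/{{ ansible_distribution }}.yml"
  tasks:
    - name: Install tmux system-wide when sudo is available
      apt:
        name: tmux
        state: present
      become: yes
      when: ansible_distribution == 'Ubuntu' and (use_sudo | default(false) | bool)

    - name: Build tmux in $HOME when sudo is not available
      shell: ./tmux-ubuntu-build-local.sh
      when: ansible_distribution == 'Ubuntu' and not (use_sudo | default(false) | bool)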