Prevent duplicate key warnings in Ansible 2 - ansible

I use a lot of YAML anchors and references in my roles to keep the logic in a single spot instead of repeating myself in multiple tasks. The following is a very basic example.
- &sometask
  name: "Some Task"
  some_module: with a lot of parameters
  with_items: list_A
- <<: *sometask
  name: "Some OTHER Task"
  with_items: list_B
This example might not show how this is actually useful, but it is. Imagine you loop over a list of dicts, passing various keys from each dict to the module, maybe having quite complex "when", "failed_when" and "changed_when" conditions. You simply want to DRY.
So instead of defining the whole task twice, I use an anchor to the first one and merge all its content into a new task, then override the differing pieces. That works fine.
Just to be clear, this is basic YAML functionality and has nothing to do with Ansible itself.
The result of the above definition (and what Ansible sees once it has parsed the YAML file) would be:
- name: "Some Task"
  some_module: with a lot of parameters
  with_items: list_A
- name: "Some Task"
  some_module: with a lot of parameters
  with_items: list_A
  name: "Some OTHER Task"
  with_items: list_B
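As a rough illustration of the merge semantics described above, here is a minimal Python sketch. It is only a model, not how a YAML parser or Ansible is implemented: the function name apply_merge is made up for this example, and it assumes (as the YAML merge key definition states) that keys written explicitly in a mapping take priority over keys pulled in through <<.

```python
def apply_merge(anchor, overrides):
    """Model of the YAML merge key: explicit keys win over merged-in keys."""
    merged = dict(anchor)      # keys pulled in through '<<: *sometask'
    merged.update(overrides)   # keys written explicitly in the new task
    return merged


# The anchored task from the example above.
sometask = {
    "name": "Some Task",
    "some_module": "with a lot of parameters",
    "with_items": "list_A",
}

# The second task: everything from the anchor, with two keys overridden.
other_task = apply_merge(
    sometask,
    {"name": "Some OTHER Task", "with_items": "list_B"},
)
print(other_task)
```

The second task keeps some_module from the anchor and carries the overridden name and with_items; the anchored task itself is left untouched.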
Ansible 2 now has a feature to complain when keys have been defined multiple times in a task. It still works, but creates unwanted noise when running the playbook:
TASK [Some OTHER Task] *******************************************************
[WARNING]: While constructing a mapping from /some/file.yml, line 42, column 3, found a duplicate dict key (name). Using last defined value only.
[WARNING]: While constructing a mapping from /some/file.yml, line 42, column 3, found a duplicate dict key (with_items). Using last defined value only.
The Ansible configuration allows suppressing deprecation_warnings and command_warnings. Is there a way to also suppress this kind of warning?

Coming in late here, I'm going to disagree with the other answers and endorse YAML merge. Playbook layout is highly subjective, and what's best for you depends on the config you need to describe.
Yes, ansible has merge-like functionality with includes or with_items / with_dict loops.
The use case I've found for YAML merge is where tasks have only a few outliers, therefore a default value which can be overridden is the most compact and readable representation. Having ansible complain about perfectly valid syntax is frustrating.
The comment in the relevant ansible code suggests The Devs Know Better than the users.
Most of this is from yaml.constructor.SafeConstructor. We replicate it here so that we can warn users when they have duplicate dict keys (pyyaml silently allows overwriting keys)
PyYAML silently allows "overwriting" keys because key precedence is explicitly dealt with in the YAML standard.

As of Ansible 2.9.0, this can be achieved by setting the ANSIBLE_DUPLICATE_YAML_DICT_KEY environment variable to ignore. Other possible values for this variable are warn, which is the default and preserves the original behaviour, and error, which makes the playbook execution fail.
See this pull request for details about the implementation.
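If playbooks are launched from a wrapper script, the variable can be set only for that run. The following is a minimal sketch under that assumption: the helper name ansible_env and the playbook name site.yml are hypothetical, and the three accepted values are the ones listed above.

```python
import os

# Values named in the answer above: ignore, warn (default), error.
VALID_MODES = ("ignore", "warn", "error")


def ansible_env(mode="ignore", base_env=None):
    """Return a copy of the environment with the duplicate-key mode set."""
    if mode not in VALID_MODES:
        raise ValueError(f"mode must be one of {VALID_MODES}, got {mode!r}")
    env = dict(os.environ if base_env is None else base_env)
    env["ANSIBLE_DUPLICATE_YAML_DICT_KEY"] = mode
    return env


env = ansible_env("ignore")

# Hypothetical invocation, not executed here (needs 'import subprocess'
# and an installed ansible-playbook):
# subprocess.run(["ansible-playbook", "site.yml"], env=env, check=True)
```

Copying the environment instead of mutating os.environ keeps the setting scoped to the child process.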

There is also a system_warnings configuration option, but none of these will silence that output you're seeing.
Here is the code generating that message, from ansible/lib/ansible/parsing/yaml/constructor.py:
if key in mapping:
    display.warning('While constructing a mapping from {1}, line {2}, column {3}, found a duplicate dict key ({0}). Using last defined value only.'.format(key, *mapping.ansible_pos))
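To see the check in isolation, here is a self-contained sketch of the same idea. It is not Ansible's constructor: construct_mapping is a simplified stand-in that takes key/value pairs directly, and Python's warnings module stands in for Ansible's display object.

```python
import warnings


def construct_mapping(pairs, source="<unknown>"):
    """Build a dict from (key, value) pairs, warning on repeated keys."""
    mapping = {}
    for key, value in pairs:
        if key in mapping:
            warnings.warn(
                f"While constructing a mapping from {source}, "
                f"found a duplicate dict key ({key}). "
                "Using last defined value only."
            )
        mapping[key] = value  # the last definition wins
    return mapping


# The key/value pairs of the second task after the merge is expanded.
task = construct_mapping(
    [
        ("name", "Some Task"),
        ("with_items", "list_A"),
        ("name", "Some OTHER Task"),
        ("with_items", "list_B"),
    ],
    source="/some/file.yml",
)
```

The result still contains the overriding values; the warning is the only side effect of the duplicate keys.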
While your use of YAML references is quite clever, I doubt this is going to change anytime soon, as a core tenet of Ansible is the human readability of playbooks and tasks. Blocks will help with the repetition of conditionals on tasks, although they seem to be limited to tasks within a playbook at this time.
You could always submit a pull request adding an option for disabling these warnings and see where it goes.

To create reusable functionality at the task level in Ansible, you should look into task includes. Task includes will allow you more freedom to do things like iterate using with_items, etc. At my employer we use anchors/references liberally, but only for variables. Given the several existing ways of creating reusable tasks in Ansible, such as task includes, playbook includes, and roles, we have no need to use anchors/references for tasks the way you have described.
If you just want the module args to be copied between tasks you could go the templating route:
args_for_case_x: arg1='some value' arg2=42 arg3='value3'
- name: a task of case x for a particular scenario
  the_module: "{{ args_for_case_x }}"
  when: condition_a
- name: a different use of case x
  the_module: "{{ args_for_case_x }}"
  when: condition_b
As you can see though, this doesn't easily support varying the args based on loop iteration which you could get if you used one of the aforementioned reuse features.

Related

Alternative ansible syntax

I've seen a playbook which looks something like this:
- hosts:
    - foo
    - bar
  roles:
    - role: whatever
It works, but from the documentation I would have expected that:
a. Hosts would be given as a single space separated line e.g.:
- hosts: foo bar
rather than a list.
b. The value for the "roles" key in the play would be a list, e.g.:
roles:
- whatever
rather than a key:value pair.
Can someone explain what I'm missing either in yaml which makes these alternatives equivalent once parsed, or where in the ansible docs it explains these alternative definitions?
TL;DR
For hosts use the syntax that you and the other people working with this are most comfortable with.
For roles, you need the role: <name> syntax only in cases where you want to also set other attributes for the role.
Longer answer
I have wondered about this occasionally as well.
In the docs section Intro to Playbooks, Basics, it says:
The hosts line is a list of one or more groups or host patterns, separated by colons, as described in the Working with Patterns documentation.
It does not, however, explicitly mention that this list could also be a space-separated string.
As far as the roles attribute of a play is concerned, I think the alternate syntax variant is straightforward. If you just pass a name (a single string), then this is obviously the name of the role.
If you want to pass additional arguments, like variables, then you need to create a dictionary. See an example of the two syntaxes used together here in the docs (search for "Roles can accept other keywords").
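The equivalence of the two role syntaxes can be pictured with a small sketch. This is a simplified model, not the code from role/definition.py: normalize_role and the variable some_var are hypothetical, and only the bare-string and role: forms discussed here are handled.

```python
def normalize_role(entry):
    """Turn a 'roles' entry into a dict that has at least a 'role' key."""
    if isinstance(entry, str):
        # Short form: '- whatever'
        return {"role": entry}
    if isinstance(entry, dict) and "role" in entry:
        # Long form: '- role: whatever', optionally with more attributes
        return dict(entry)
    raise ValueError(f"unsupported role entry: {entry!r}")


roles = [
    "whatever",
    {"role": "whatever", "vars": {"some_var": 1}},
]
normalized = [normalize_role(entry) for entry in roles]
print(normalized)
```

Both entries end up naming the same role; the long form simply leaves room for the extra keys.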
The definitive answer to both questions is in the source code:
Here is the part that parses the hosts list in a play:
https://github.com/ansible/ansible/blob/devel/lib/ansible/playbook/play.py#L104-L116
Here is the part that does it for a role in roles:
https://github.com/ansible/ansible/blob/devel/lib/ansible/playbook/role/definition.py#L68-L135
There is another hint in the playbook/base.py#preprocess_data:
infrequently used method to do some pre-processing of legacy terms
The Play class for example inherits / overrides this method, directly below the snippet I linked to above.

Remove Config Lines on ASA with Ansible

I have an ansible playbook that creates a network object and sets ACL policies. It's working well, but I would like to create the complementary playbook to remove the object and its associated config but I don't know the correct way to approach the task.
I could just use asa_command to issue the 'no' prefix for the appropriate lines, however, that doesn't feel like the "Ansible Way" since it would try to execute the commands even if they were already absent in the config.
I have seen that some modules have a state: absent operator. However, the asa_ modules don't indicate that as an option.
Any suggestions would be much appreciated.
I think having a state: absent option makes a lot of sense, as I don't think there is a simple way of doing this more efficiently with the current asa_ modules. The Ansible team is extremely responsive to issues and PRs, so I would submit one for this feature.
It looks like there isn't a clean way to do this as of Ansible 2.4. I have a working playbook; however, I had to settle for issuing the no commands using asa_config and putting ignore_errors: yes in for each play. It's inelegant to say the least, and in some cases it can break down. I think there may be a way to use error handling along with check_mode: yes. My initial attempt at this failed because, when registering the result of a play to a variable, I cannot use that variable to determine which of the affected hosts actually required a change; it's just a generic yes/no for the entire play.
What I'm doing currently:
- name: Remove Network Object
  asa_config:
    commands:
      - no object network {{ object_name }}
    provider: "{{ cli }}"
  ignore_errors: yes
  register: dno
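The idempotency that a state: absent option would provide boils down to issuing a no command only for lines that are still present. A minimal sketch of that check follows, assuming the running config has already been fetched as text; removal_commands, the object names and the addresses are all hypothetical, and this is not part of any asa_ module.

```python
def removal_commands(running_config, lines_to_remove):
    """Return 'no ...' commands only for lines present in the config."""
    present = {line.strip() for line in running_config.splitlines()}
    return [
        f"no {line.strip()}"
        for line in lines_to_remove
        if line.strip() in present
    ]


# Hypothetical excerpt of a device's running configuration.
running_config = """\
object network WEB01
 host 192.0.2.10
object network DB01
 host 192.0.2.20
"""

commands = removal_commands(
    running_config,
    ["object network WEB01", "object network OLD01"],
)
print(commands)
```

An empty result means there is nothing to remove, so no command needs to be sent and no error has to be ignored.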

Ansible: using nested groups with vars

I have a situation where we have 3 tiers of boxes, in each tier we apply different variables settings (like where the cache dir is), but there are a bunch of defaults. I also need to override on a per node basis, which is usually done via inventory vars on the host itself. I am not sure what is the best way to organize the hosts so that the precedence works in my favor.
Here are the different things I have tried. In each case I have entries in the inventory file like this:
[bots-fancy]
fancy-1
[bots-super-fancy]
super-1
[bots-magical]
magic-1
magic-2 provider=aws
At first, I had each of them with a long string of variable definitions. I also had different group_vars/bots/[bots-magical | bots-super-fancy | bots-fancy].yaml files. This quickly became untenable.
attempt 1: with playbook variables
In the playbook I had something like this:
---
hosts:
  - bots
vars_files:
  - "group_vars/bots/defaults.yml"
  - "group_vars/bots/{{ group_names[0] }}.yml"
roles:
  - something
This worked (though, yes, it was brittle), but it wouldn't let me override on a per-host basis. I occasionally had to set things differently on individual nodes, but not on the whole group.
attempt 2: using group_vars for each
I added
[bots:children]
bots-fancy
bots-super-fancy
bots-magical
to the hosts file. Removed any vars_files from the playbook and created group_vars for each group. I added the default/shared settings to group_vars/bots.yaml. When I'd run the playbook, it would only load the bots group_vars it seemed. Ideally, I want it to load the bots and then override it with the bots-fancy. And then finally the values from the hosts file.
I am not sure the best way to structure these groups, so any input would be very helpful!
I'm not sure what your problem is. You should be fine with:
hosts:
[bots-a]
bot1
[bots-b]
bot2
[bots:children]
bots-a
bots-b
dirs:
./group_vars/bots.yml
./group_vars/bots-a.yml
./group_vars/bots-b.yml
There is a concept of group depth in Ansible (at least in recent versions). In this example, group variables for host bot2 will be populated in the following order:
depth 0: group all, all.yml (missing here, ignoring)
depth 1: group bots, bots.yml
depth 2: group bots-b, bots-b.yml
You can see details and processing order here in the source code.
So if you define defaults in bots.yml and specific values in bots-b.yml, you should achieve what you expect.
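The depth-based ordering can be modelled in a few lines. This is only a sketch of the behaviour described above, not Ansible's inventory code: merge_group_vars is a made-up helper, and the variables cache_dir and provider with their values are hypothetical.

```python
def merge_group_vars(groups):
    """Merge group vars so that deeper (more specific) groups win.

    'groups' is a list of (depth, group_name, vars_dict) tuples.
    """
    merged = {}
    for _depth, _name, group_vars in sorted(groups, key=lambda g: g[0]):
        merged.update(group_vars)  # later (deeper) groups override
    return merged


# Groups that host bot2 belongs to, deliberately listed out of order.
groups_for_bot2 = [
    (2, "bots-b", {"cache_dir": "/var/cache/bots-b"}),
    (0, "all", {}),
    (1, "bots", {"cache_dir": "/var/cache/bots", "provider": "local"}),
]

bot2_vars = merge_group_vars(groups_for_bot2)
print(bot2_vars)
```

Here bot2 gets cache_dir from bots-b and inherits provider from bots; host variables from the inventory would then be applied on top.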

How to loop over playbook include?

(I'm currently running Ansible 2.1)
I have a playbook that gathers a list of elements and I have another playbook (that calls different hosts and whatnot) using said element as the basis for most operations. Therefore, whenever I use with_items over the playbook, it causes an error.
The loop control section of the docs say that "In 2.0 you are again able to use with_ loops and task includes (but not playbook includes) ". Is there a workaround? I really need to be able to call multiple hosts in an included playbook that runs over a set of entries. Any workarounds, ideas for such or anything are greatly appreciated!
P.S. I could technically use command: ansible-playbook, but I don't want to go down that rabbit hole unless necessary.
I think I faced the same issues and, by the way, migrating to shows more than in 'item' already in use.
Referring to http://docs.ansible.com/ansible/playbooks_best_practices.html, you should have an inventory (that contains all your hosts) and a master playbook (even if only theoretical).
A good way, instead of including playbooks, is to design roles, even if empty. Try to find a "common" role for everything that could be applied to most of your hosts. Then include additional roles depending on usage; this will let you target the correct hosts.
You can also have roles that do nothing (meaning, nothing in 'tasks') but that contain a set of variables common to two roles, so that you avoid duplicate entries.

What's the difference between defaults and vars in an Ansible role?

When creating a new Ansible role, the template creates both a vars and a defaults directory with an empty main.yml file. When defining my role, I can place variable definitions in either of these, and they will be available in my tasks.
What's the difference between putting the definitions into defaults and vars? What should go into defaults, and what should go into vars? Does it make sense to use both for the same data?
I know that there's a difference in precedence/priority between the two, but I would like to understand what should go where.
Let's say that my role would create a list of directories on the target system. I would like to provide a list of default directories to be created, but would like to allow the user to override them when using the role.
Here's what this would look like:
---
directories:
  - foo
  - bar
  - baz
I could place this either into the defaults/main.yml or in the vars/main.yml, from an execution perspective, it wouldn't make any difference - but where should it go?
The Ansible documentation on variable precedence summarizes this nicely:
If multiple variables of the same name are defined in different places, they win in a certain order, which is:
extra vars (-e in the command line) always win
then comes connection variables defined in inventory (ansible_ssh_user, etc)
then comes "most everything else" (command line switches, vars in play, included vars, role vars, etc)
then comes the rest of the variables defined in inventory
then comes facts discovered about a system
then "role defaults", which are the most "defaulty" and lose in priority to everything.
So suppose you have a "tomcat" role that you use to install Tomcat on a bunch of webhosts, but you need different versions of tomcat on a couple hosts, need it to run as different users in other cases, etc. The defaults/main.yml file might look something like this:
tomcat_version: 7.0.56
tomcat_user: tomcat
Since those are just default values it means they'll be used if those variables aren't defined anywhere else for the host in question. You could override these via extra-vars, via facts in your inventory file, etc. to specify different values for these variables.
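The "defaults lose to everything" rule can be pictured as layered lookups. The sketch below uses Python's collections.ChainMap, where the first mapping that contains a key wins; it models only three of the layers from the list above, and the inventory and extra-vars values are hypothetical.

```python
from collections import ChainMap

# defaults/main.yml of the tomcat role, as in the example above.
role_defaults = {"tomcat_version": "7.0.56", "tomcat_user": "tomcat"}

# Hypothetical overrides from higher-precedence sources.
inventory_vars = {"tomcat_user": "webapp"}
extra_vars = {"tomcat_version": "8.0.30"}  # e.g. passed with -e

# Highest precedence first: lookups stop at the first layer with the key.
effective = ChainMap(extra_vars, inventory_vars, role_defaults)

print(effective["tomcat_version"])
print(effective["tomcat_user"])
```

Removing a key from the higher layers makes the lookup fall through to the role default again, which is exactly what makes defaults a safe place for overridable values.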
Edit: Note that the above list is for Ansible 1.x. In Ansible 2.x the list has been expanded on. As always, the Ansible Documentation provides a detailed description of variable precedence for 2.x.
Role variables defined in vars have a very high precedence: they can only be overridden by passing them on the command line, in the specific task, or in a block. Therefore, almost all your variables should be defined in defaults.
In the article "Variable Precedence - Where To Put Your Role Vars" the author gives one example of what to put in vars: System-specific constants that don't change much. So you can have vars/debian.yml and vars/centos.yml with the same variable names but different values and include them conditionally.
IMHO it is impractical and not sensible that Ansible places such high priority on configuration in vars of roles. Configuration in vars/main.yml and defaults/main.yml should be low and probably the same priority.
Are there any real life examples of cases where we want this type of behavior?
There are examples where we don't want this.
The point to make here is that configuration in defaults/main.yml cannot be dynamic. Configuration in vars/main.yml can. So for example you can include configuration for specific OS and version dynamically as shown in geerlingguy.postgresql
But because precedence is so strange and impractical in Ansible, geerlingguy needs to introduce pseudo variables, as can be seen in variables.yml:
- name: Define postgresql_packages.
  set_fact:
    postgresql_packages: "{{ __postgresql_packages | list }}"
  when: postgresql_packages is not defined
This is a concrete real life example that demonstrates that the precedence is impractical.
Another point to make here is that we want roles to be configurable. Roles can be external, managed by someone else. As a general rule you don't want configuration in roles to have high priority.
Basically, anything that goes into “role defaults” (the defaults folder inside the role) is the most malleable and easily overridden. Anything in the vars directory of the role overrides previous versions of that variable in namespace. The idea here to follow is that the more explicit you get in scope, the more precedence it takes with command line -e extra vars always winning. Host and/or inventory variables can win over role defaults, but not explicit includes like the vars directory or an include_vars task.
Variables and defaults walk hand in hand. Here's an example:
- name: install package
  yum: name=xyz{{package_version}} state=present
In your defaults file you would have something like:
package_version: 123
What Ansible will do is take the value of package_version and put it next to the package name, so the task effectively reads as:
- name: install package
  yum: name=xyz123 state=present
This way it will install xyz123 and not xyz123.4 or whatever else is in the great repository of xyz's.
In the end it will do yum install -y xyz123.
So basically the defaults are the values used if you do not set a specific value for the variables, because that space can't stay empty.

Resources