I've just started learning Ansible and have a question about what defines a "play". If I'm looking at a template, my understanding is that a play typically starts with the following:
- name: xxx
  hosts: yyy
I've read that the name: keyword is not required (but highly recommended). That being the case, how can I tell where one play ends and the next one begins? What keywords delimit a single play when there are multiple plays in a playbook?
Thanks,
Andy
Plays:
A playbook is a list of plays. A play is minimally a mapping between a
set of hosts selected by a host specifier (usually chosen by groups
but sometimes by hostname globs) and the tasks which run on those
hosts to define the role that those systems will perform. There can be
one or many plays in a playbook.
A play is the set (one or more) of actions (tasks) you want to execute on a set (one or more) of hosts.
As for the syntax of a play, you can find examples of single-play and multi-play playbooks here.
Thanks for that link franklinsjo. That cleared things up. In summary, here's the takeaway:
A play consists of:
- name: (optional, but recommended)
  hosts:
  tasks:
So, as you're looking through a playbook, any time you come across the keywords hosts: and tasks: (along with the optional name: keyword), that indicates the start of a play.
Thanks,
Andy
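As a minimal sketch of that takeaway (the group names, play names and tasks below are made up for illustration), a playbook with two plays might look like this. Each top-level list item, i.e. each dash at the left margin, starts a new play; the second play leaves out name: to show that it is optional.

```yaml
# Hypothetical playbook: the groups "webservers" and "dbservers" are assumed
# to exist in the inventory.
- name: First play
  hosts: webservers
  tasks:
    - name: Check connectivity to the web servers
      ansible.builtin.ping:

# Second play: a new top-level list item; name: is omitted here
- hosts: dbservers
  tasks:
    - name: Check connectivity to the database servers
      ansible.builtin.ping:
```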
What is the difference between the vars_files directive and the include_vars module?
Which should be used when? Is either of them deprecated or discouraged?
Both vars_files and include_vars are labeled as stable interfaces, so neither of them is deprecated. They have some things in common, but they serve different purposes.
vars_files:
The vars_files directive can only be used when defining a play, to specify variable files. The variables from those files are made available to that play. Since it is used at the start of the play, it most likely implies that either some earlier play created those vars files or they were created statically before the run; in other words, they act as configuration variables for the play.
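A minimal sketch of a play that uses vars_files, assuming a group called appservers and a file vars/app_config.yml that defines app_port (all three names are hypothetical):

```yaml
# Hypothetical play: the group "appservers" and the file vars/app_config.yml
# are assumptions; the file is expected to define app_port.
- name: Configure the application
  hosts: appservers
  vars_files:
    - vars/app_config.yml
  tasks:
    - name: Show a variable loaded from the vars file
      ansible.builtin.debug:
        var: app_port
```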
include_vars:
vars_files serves the single purpose of including the vars from a set of files, but it cannot handle the following cases:
The vars files are created dynamically and you want to include them in the play.
You want to include vars in a limited scope.
You have multiple vars files and want to include them based on certain criteria, e.g. if the local database exists, include the configuration of the local database; otherwise include the configuration of a remotely hosted database.
include_vars has higher priority than vars_files, so it can be used to override default configuration (vars).
include_vars is evaluated lazily (at the time it is used).
You want to include vars dynamically using a loop.
You want to read a file and put all of its variables in a named dictionary instead of reading them into the global variable namespace.
You want to include all files in a directory, or some subset of them (based on a prefix or an exclusion list), without knowing the exact names of the vars file(s).
These are some of the cases I can think of; if any of them applies to you, you need include_vars.
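A few of these cases could be sketched as follows. This is only an illustration: every path, file name and variable name is an assumption, and the tasks are meant to sit inside the tasks: section of a play.

```yaml
# Hypothetical tasks: every path, file name and variable name below is an
# assumption chosen for illustration.
- name: Check whether a local database data directory exists
  ansible.builtin.stat:
    path: /var/lib/postgresql
  register: local_db

- name: Load the local database configuration if the directory exists
  ansible.builtin.include_vars:
    file: vars/db_local.yml
  when: local_db.stat.exists

- name: Otherwise load the remote database configuration
  ansible.builtin.include_vars:
    file: vars/db_remote.yml
  when: not local_db.stat.exists

- name: Read one file into a named dictionary instead of the global namespace
  ansible.builtin.include_vars:
    file: vars/app_settings.yml
    name: app_settings

- name: Load every matching file from a directory without naming each one
  ansible.builtin.include_vars:
    dir: vars/extra
    files_matching: '^app_.*\.yml$'
    ignore_files:
      - app_secrets.yml
```

After the fourth task, the values from the file would be reachable under the app_settings dictionary rather than as top-level variables.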
vars_files are read when the play starts. include_vars are read when the play reaches the task. You might also be interested in Variable precedence: Where should I put a variable?
Some updated information...
import_* is considered static reuse; vars_files is a type of import.
include_* is considered dynamic reuse.
As Vladimir mentioned,
vars_files are read when the play starts. include_vars are read when the play reaches the task
Like all static items, vars_files are read before the play starts, unlike include_vars, which are "included" when the play reaches the task.
One of the biggest differences between static reuse and dynamic reuse is how the variables or tasks within them are processed. All static reuse items are processed with the linear strategy by default, so all hosts stay in lockstep with each other. Each task has to complete on ALL hosts before the next task can begin. Hosts that are skipped actually get a noop task to process.
Dynamic reuse does not change the performance strategy from linear; however, it does change the order in which the tasks are processed. With dynamic reuse, the entire group of tasks must complete on a single host before they are processed by the next host. Unfortunately, all the other hosts get to twiddle their noops while they wait.
Include statements are good when you need to 'loop' a host through a series of tasks with registered outputs and do something with that information before the next host starts.
Import statements are good when you need to collect information or perform a task on a group of hosts before the next task can start for any host.
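For illustration, the two kinds of reuse might look like this side by side. The group name, the task files and the loop variable are assumptions, not something taken from a real project.

```yaml
# Hypothetical play: the group "appservers" and the task files under tasks/
# are assumptions.
- name: Static and dynamic reuse side by side
  hosts: appservers
  tasks:
    # Static: the file is read when the playbook is parsed, so its tasks
    # appear in the play as if they had been written here.
    - name: Import common setup tasks
      ansible.builtin.import_tasks: tasks/common_setup.yml

    # Dynamic: the file is read only when the play reaches this task,
    # which is why it can be combined with a loop.
    - name: Include the deployment tasks once per application
      ansible.builtin.include_tasks: tasks/deploy_app.yml
      loop:
        - frontend
        - backend
      loop_control:
        loop_var: app_name
```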
Here is a really good table that compares all the different include_* and import_* functions: Comparing includes and imports: dynamic and static re-use
Just as an FYI, here is a link to more information about performance strategies and how you can improve performance. How can I improve performance for network playbooks?
This is a common case, but it doesn't seem straightforward in Ansible.
Let's assume a hierarchy of groups:
linux-hosts:
  application-hosts:
    foobar-application-hosts:
      foobar01
Now for each of these groups we want to define a set of cron jobs.
For linux-hosts, jobs that run on all linux hosts.
For application-hosts, jobs that run on only application hosts.
For foobar-application-hosts, jobs that run on only foobar-application-hosts.
The variable name is cronjobs, say, and it's a list of cron module settings.
By default, the foobar-application-hosts would clobber the setting for anything above it. Not good.
I don't see an easy way to merge (on a specific level). So I thought, all right, perhaps Ansible exposes the individual group variables for the groups a host belongs to during a run. There is groups, and there is group_names, but I don't see a groupvars corresponding to hostvars.
This seems to imply that I have to use some mix-and-match of cycling over groups, dynamically importing vars (if possible), and doing the merge myself, perhaps putting some of this in a role. But this feels like such a hack. Is there another approach?
A group in the Ansible sense is a "tag" on hosts. Hosts can belong to more than one group. So, the cronjobs var should be a list, with the same length as the number of groups that the host is in.
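One way to sketch the do-the-merge-yourself idea from the question, without cycling over groups, is to give each group its own variable name and concatenate the lists in a task. This is only an illustration under stated assumptions: the file layout, the cronjobs_* variable names, the play and the job definitions are all hypothetical.

```yaml
# group_vars/linux-hosts.yml (hypothetical file)
cronjobs_linux:
  - name: rotate logs
    minute: "0"
    hour: "3"
    job: /usr/local/bin/rotate-logs
---
# group_vars/foobar-application-hosts.yml (hypothetical file)
cronjobs_foobar:
  - name: foobar cleanup
    minute: "30"
    hour: "4"
    job: /opt/foobar/bin/cleanup
---
# playbook.yml (hypothetical file)
- hosts: linux-hosts
  tasks:
    - name: Install the cron jobs from every group the host belongs to
      ansible.builtin.cron:
        name: "{{ item.name }}"
        minute: "{{ item.minute }}"
        hour: "{{ item.hour }}"
        job: "{{ item.job }}"
      # default([]) keeps the expression valid on hosts that are not in a
      # given group (or for groups that define no jobs)
      loop: "{{ (cronjobs_linux | default([])) + (cronjobs_application | default([])) + (cronjobs_foobar | default([])) }}"
```

Because each group writes to a different variable name, nothing is clobbered by group precedence; the cost is that the list of group variables has to be kept in sync with the task by hand.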
Strictly for the purpose of speeding up the Ansible flow, I need just a few details to be retrieved from the host. The information I need most often is ansible_hostname, to make sure I'm landing on the correct host, as I have dynamic DNS.
Under which gather_subset does the hostname fall?
This is about limiting the data that is gathered from the host, as opposed to the filter option.
It is possible to restrict the information gathered using gather_facts. Please check the docs of the ansible setup module on how to restrict information based on various subsets.
- hosts: my_target
  gather_facts: no
  pre_tasks:
    - setup:
        gather_subset: 'network'
  tasks:
    - debug: var=ansible_hostname
The subsets from which information can be gathered are as follows:
all, all_ipv4_addresses, all_ipv6_addresses, apparmor, architecture,
caps, chroot, cmdline, date_time, default_ipv4, default_ipv6, devices,
distribution, distribution_major_version, distribution_release,
distribution_version, dns, effective_group_ids, effective_user_id,
env, facter, fips, hardware, interfaces, is_chroot, kernel, local,
lsb, machine, machine_id, mounts, network, ohai, os_family, pkg_mgr,
platform, processor, processor_cores, processor_count, python,
python_version, real_user_id, selinux, service_mgr,
ssh_host_key_dsa_public, ssh_host_key_ecdsa_public,
ssh_host_key_ed25519_public, ssh_host_key_rsa_public,
ssh_host_pub_keys, ssh_pub_keys, system, system_capabilities,
system_capabilities_enforced, user, user_dir, user_gecos, user_gid,
user_id, user_shell, user_uid, virtual, virtualization_role,
virtualization_type
These values are listed in the error message that is shown when an unsupported value is provided.
The documentation mentions only a few values, among them "min" and "any", which are not mentioned in the error.
This is a known bug:
https://github.com/ansible/ansible/issues/47603
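If only the hostname is needed, a narrower variant of the play above may be enough. This sketch rests on an assumption worth verifying on your Ansible version: that the hostname facts belong to the minimal subset, which the setup module still collects when everything else is excluded with '!all'. The host pattern my_target is reused from the example above.

```yaml
# Sketch only: assumes ansible_hostname is part of the minimal fact subset,
# which is still gathered when "!all" excludes every other subset.
- hosts: my_target
  gather_facts: false
  pre_tasks:
    - name: Gather only the minimal fact subset
      ansible.builtin.setup:
        gather_subset:
          - '!all'
  tasks:
    - name: Show the hostname fact
      ansible.builtin.debug:
        var: ansible_hostname
```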
I've joined a project which has a large number of playbooks and roles, and which makes heavy use of include (often in a nested fashion) in order to include playbooks/roles within existing playbooks/roles. (Whether this is good or bad practice should be considered out of scope of this question, because it's not something I can immediately change. Note also that include_role is not used because these playbooks were written well before 2.2 was out, and are still in the process of being updated.)
Normally when running ansible-playbook, the output just shows each task being run, but it does not show the includes which pull in extra tasks. This makes it hard to see how the overall flow jumps around between playbooks. In contrast, include_vars tasks are included in the output. I'm guessing this is because it's an Ansible module, whereas include isn't really a module.
So without having to modify the playbooks, is there a way to run playbooks which shows the following?
when include directives are triggering, and
(ideally) also the exact files which are being included, since it's not always obvious how relative paths are converted into absolute paths
I've found lots of advice on various ways to debug playbooks, but nothing which achieves this. Bonus points if it also shows when roles are being included via meta role dependencies!
I'm aware that there are tools such as ansigenome which do static analysis of playbooks, but I'm hoping for something which can output the information at playbook run-time, for any playbook I choose to invoke.
If it's not currently possible, would it be a reasonable feature request?
Try executing ansible-playbook -vv, it shows "task path" for every executed task, like this:
TASK [debug] *********************************************
task path: /path/to/task/file.yml:5
ok: [localhost] => {
    "msg": "aaa"
}
So you can easily track the actual file path (included or not) and line number.
As for includes, there are different types of includes in current Ansible versions (2.2, 2.3): static and dynamic.
Static includes happen at parse time, and information about them is printed (with -vv verbosity) at the very beginning of the playbook run.
Dynamic includes happen at runtime, and you can see cyan "included" lines in the output.