I've joined a project which has a large number of playbooks and roles, and which makes heavy use of include (often in a nested fashion) in order to include playbooks/roles within existing playbooks/roles. (Whether this is good or bad practice should be considered out of scope of this question, because it's not something I can immediately change. Note also that include_role is not used because these playbooks were written well before 2.2 was out, and are still in the process of being updated.)
Normally when running ansible-playbook, the output just shows each task being run, but it does not show the includes which pull in extra tasks. This makes it hard to see how the overall flow jumps around between playbooks. In contrast, include_vars tasks do appear in the output. I'm guessing this is because include_vars is an Ansible module, whereas include isn't really a module.
So without having to modify the playbooks, is there a way to run playbooks which shows the following?
when include directives are triggering, and
(ideally) also the exact files which are being included, since it's not always obvious how relative paths are converted into absolute paths
I've found lots of advice on various ways to debug playbooks, but nothing which achieves this. Bonus points if it also shows when roles are being included via meta role dependencies!
I'm aware that there are tools such as ansigenome which do static analysis of playbooks, but I'm hoping for something which can output the information at playbook run-time, for any playbook I choose to invoke.
If it's not currently possible, would it be a reasonable feature request?
Try executing ansible-playbook -vv; it shows the "task path" for every executed task, like this:
TASK [debug] *********************************************
task path: /path/to/task/file.yml:5
ok: [localhost] => {
"msg": "aaa"
}
So you can easily track the actual file path (included or not) and the line number.
As for includes, there are different types of includes in current Ansible versions (2.2, 2.3): static and dynamic.
Static includes happen at parse time, and information about them is printed (with -vv verbosity) at the very beginning of the playbook run.
Dynamic includes happen at runtime, and you can see cyan "included" lines in the output.
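As a minimal sketch of the two flavours (the task file names here are invented; in 2.1–2.3 you can force one behaviour or the other with the static keyword):

- include: common_tasks.yml
  static: yes    # parse-time (static) include, reported up front with -vv
- include: "{{ app_env }}_tasks.yml"
  static: no     # runtime (dynamic) include, shows up as a cyan "included" line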
Related
What is the difference between
vars_files directive
and
include_vars module
Which should be used when? Is either of the above deprecated or discouraged?
Both vars_files and include_vars are labeled as stable interfaces, so neither of them is deprecated. They have some commonalities, but they serve different purposes.
vars_files:
The vars_files directive can only be used when defining a play, to specify variable files. The variables from those files are included in the playbook. Since it is used at the start of the play, it implies that either some earlier play created those vars files or they existed statically before the run; in other words, they act as configuration variables for the play.
include_vars:
vars_files serves the single purpose of including vars from a set of files, but it cannot handle the following cases (see the sketch below):
The vars files were created dynamically and you want to include them in the play.
You want to include vars in a limited scope.
You have multiple vars files and you want to include them based on certain criteria, e.g. if a local database exists then include the local database configuration, otherwise include the configuration of a remotely hosted database.
include_vars has higher precedence than vars_files, so it can be used to override default configuration (vars).
include_vars is evaluated lazily (at the time it is used).
You want to include vars dynamically using a loop.
You want to read a file and put all its variables into a named dictionary, instead of reading them all into the global variable namespace.
You want to include all files in a directory, or some subset of the files in a directory (based on a prefix or an exclusion list), without knowing the exact names of the vars file(s).
These are some of the cases I can think of; if you need any of the above, then you need include_vars.
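A minimal sketch contrasting the two (the file paths and the extra_config name are invented for illustration):

- hosts: webservers
  vars_files:
    - vars/common.yml                            # read at parse time, before the play starts
  tasks:
    - include_vars:
        file: "vars/{{ ansible_os_family }}.yml" # path resolved only when the task runs
    - include_vars:
        dir: vars/extra                          # load all files from a directory...
        name: extra_config                       # ...into a named dictionary, not the global namespace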
vars_files are read when the play starts. include_vars are read when the play reaches the task. You might also be interested in Variable precedence: Where should I put a variable?
Some updated information...
import_* is considered static reuse; vars_files is a type of import.
include_* is considered dynamic reuse.
As Vladimir mentioned,
vars_files are read when the play starts. include_vars are read when the play reaches the task
Like all static items, vars_files are read before the play starts, unlike include_vars, which are "included" only when the play reaches them.
One of the biggest differences between static reuse and dynamic reuse is how the variables or tasks within them are processed. All static reuse items are processed with the linear strategy by default, where all hosts stay in lockstep with each other. Each task has to complete on ALL hosts before the next task can begin. Hosts that are skipped actually get a noop task to process.
Dynamic reuse does not change the strategy from linear, but it does change the order in which the tasks are processed. With dynamic reuse, the entire group of tasks must complete on a single host before they are processed by the next host. Unfortunately, all the other hosts get to twiddle their noops while they wait.
Include statements are good when you need to 'loop' a host through a series of tasks with registered outputs and do something with that information before the next host starts.
Import statements are good when you need to collect information or perform a task on a group of hosts before the next task can start for any host.
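As a minimal sketch of the two forms (the task file names are invented):

- import_tasks: firewall.yml   # static: parsed up front; hosts move through its tasks in lockstep
- include_tasks: per_host.yml  # dynamic: each host finishes the whole file before the next host starts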
Here is a really good table that compares all the different include_* and import_* functions: Comparing includes and imports: dynamic and static re-use
Just as an FYI, here is a link to more information about performance strategies and how you can improve performance. How can I improve performance for network playbooks?
I have quite a big Ansible playbook with a lot of templates, and it generates tons of logs (hundreds of thousands of lines in my log file).
Whenever a task fails, I can spot it by searching for failed=.
My problem is how to see where the error is. As of today, all I do is scroll through the log and pray that my eyes find the error, but with that quantity of lines it takes time and is very frustrating.
Is there any pattern I should look for to find where the error is?
Thanks in advance for your inputs
By default, Ansible stops after the first failed task...
https://docs.ansible.com/ansible/latest/user_guide/playbooks_error_handling.html
Ansible normally has defaults that make sure to check the return codes
of commands and modules and it fails fast – forcing an error to be
dealt with unless you decide otherwise.
If your playbook handles a lot of targets and you want to stop everything at the first failure on any target, you can use the any_errors_fatal: true play option.
https://docs.ansible.com/ansible/latest/user_guide/playbooks_error_handling.html#aborting-the-play
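A minimal sketch of where the option goes (the play content is invented):

- hosts: all
  any_errors_fatal: true              # first failure on any host aborts the play for all hosts
  tasks:
    - command: /usr/local/bin/deploy  # hypothetical task

As for a pattern to search for: failed tasks are printed as fatal: [hostname]: FAILED! => {...}, so grepping the log for fatal: usually jumps straight to the failing task rather than to the failed= summary at the end.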
This is a common case, but it doesn't seem straightforward in Ansible.
Let's assume a hierarchy of groups:
linux-hosts:
  application-hosts:
    foobar-application-hosts:
      foobar01
Now for each of these groups we want to define a set of cron jobs.
For linux-hosts, jobs that run on all linux hosts.
For application-hosts, jobs that run on only application hosts.
For foobar-application-hosts, jobs that run on only foobar-application-hosts.
The variable name is cronjobs, say, and it's a list of cron module settings.
By default, the foobar-application-hosts would clobber the setting for anything above it. Not good.
I don't see an easy way to merge (on a specific level). So I thought, all right, perhaps Ansible exposes the individual group variables for the groups a host belongs to during a run. There is groups, and there is group_names, but I don't see a groupvars corresponding to hostvars.
This seems to imply I must resort to some mix-and-match of cycling over groups, dynamically importing vars (if possible), and doing the merge myself, perhaps putting some of this in a role. But this feels like such a hack. Is there another approach?
A group in the Ansible sense is a "tag" on hosts, and hosts can belong to more than one group. So the cronjobs var should be a list, with one entry per group that the host is in.
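Since there is no groupvars counterpart to hostvars, one common workaround is to give each group's list a distinct name and merge them at runtime. A minimal sketch, assuming the group_vars files define cronjobs_linux, cronjobs_application, and so on (the names are invented, and the varnames lookup requires Ansible 2.8+):

- hosts: foobar-application-hosts
  tasks:
    - set_fact:
        merged_cronjobs: "{{ merged_cronjobs | default([]) + lookup('vars', item) }}"
      loop: "{{ query('varnames', '^cronjobs_') }}"
    - cron:
        name: "{{ item.name }}"   # assumes each entry is a dict of cron module settings
        job: "{{ item.job }}"
      loop: "{{ merged_cronjobs }}"

One caveat: varnames matches every variable in scope with that prefix, not only those coming from the host's groups, so the naming convention has to be kept clean.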
My Gradle build has two tasks:
findRevision(type: SvnInfo)
buildWAR(type: MavenExec, dependsOn: findRevision)
Both tasks are configuration based, but the buildWAR task depends on a project property that is only defined in the execution phase of the findRevision task.
This breaks the process, as Gradle cannot find said property at the time it tries to configure the buildWAR task.
Is there any way to delay binding or configuration until another task has executed?
In this specific case I can make use of the mavenexec method instead of the MavenExec task type, but what should be done in similar scenarios where no alternative method exists?
Depending on which configuration option exactly you want to change, you might change it in the execution phase of the task with buildWAR.doFirst { }. But generally this is a really bad idea. If you e.g. change something that influences the result of the UP-TO-DATE checks, like input files, the task might execute although it would not be necessary or, even worse, not execute although it would be necessary. You can of course make the task always execute to overcome this with outputs.upToDateWhen { false }, but there might be other problems, and this way you also disable one of Gradle's biggest strengths.
It is a much better idea to redesign your build so that this is not necessary, for example by determining the revision at configuration time already. Depending on how much time that step needs, this might or might not be viable. Also, depending on what you want to do with the revision, you might consider the suggestion of @LanceJava and make your findRevision task generate a file with the revision in it that is then packaged into the WAR and read at runtime.
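A minimal sketch of the doFirst variant, with the SVN lookup stubbed out and the property name svnRevision invented for illustration:

task findRevision {
    doLast {
        // in the real build this value would come from the SvnInfo task
        project.ext.svnRevision = 'r1234'
    }
}

task buildWAR(dependsOn: findRevision) {
    doFirst {
        // the property is read here, in the execution phase, after findRevision has run
        println "building WAR for revision ${project.ext.svnRevision}"
    }
}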