Ansible querying data - ansible

Ansible is primarily designed for running tasks on remote machines. I want to use it in kind of the reverse - pull data from a bunch of machines and store it locally. I love the setup module but right now I'm interested in pulling a list of users - which the setup module doesn't show me.
Also, I want to add this info into a mariadb table.
I can do this project with the shell module but, my question is, is there a better way?
Unless someone can tell me a better practice, I'll use the shell module for two parts: First, I'll simply cat /etc/passwd and register the output from that. Then, do a local_action shell script that invokes a mysql command to update the table.
I'd love to both pull the data and insert/update the data into my table using standard modules. But there don't seem to be modules that will do either.

You could use local facts. It is described here https://docs.ansible.com/ansible/latest/user_guide/playbooks_variables.html#local-facts-facts-d
If a remotely managed system has an /etc/ansible/facts.d directory, any files in this directory ending in .fact, can be JSON, INI, or executable files returning JSON, and these can supply local facts in Ansible.
then you get the local facts in json format with this line:
ansible <hostname> -m setup -a "filter=ansible_local"
For system user facts you could write a shell-script which returns the system user in json format and put this script under /etc/ansible/facts.d . Each time you gather facts from this machine, you have this facts too.

Related

How do I use ansibel-playbook to connect to mysql to execute SQL statements

I want to connect to mysql through ansible-playbook and execute SQL statements, but there is no relevant module in the official website of ansible. Does anyone know how to operate
When there does not exist an official module to do something, you basically have two choices:
Find a custom module that someone has written that will do it (such as this one).
Write a custom module that will do it.
Do it using the command module or any of the other arbitrary command modules.
In your case, for a one-off playbook, I would typically go with the third option as I don't much care for running custom modules. I would make sure (Via Ansible of course) that mysql was installed on the box you were running against, and build a mysql command line string based on what you need to do which you then run on a host with access to the DB server (or the DB server itself). That's the quickest and dirtiest method.

How can I speedup ansible by not running entire sections too often?

Assume that a normal deployment script does a lot of things and many of them are related to preparing the OS itself.
These tasks are taking a LOT of time to run even if there is nothing new to do and I want to prevent running them more often than, let's say once a day.
I know that I can use tags to filter what I am running but that's not the point: I need to make ansible aware that these "sections" executed successfully one hour ago and due to this, it would skip the entire block now.
I was expecting that caching of facts was supposed to do this but somehow I wasnt able to see any read case.
You need to figure out how to determine what "executed successfully" means. Is is just that a given playbook ran to completion? Certain roles ran to completion? Certain indicators exist that allow you determine success?
As you mention, I don't think fact caching is going to help you here unless you want to introduce custom facts into each host (http://docs.ansible.com/ansible/playbooks_variables.html#local-facts-facts-d)
I typically come up with a set of variables/facts that indicate a role has already been run. Sometimes this involves making shell calls and registering vars, looking at gathered facts and determining if certain files exist. My typical pattern for a role looks something like this
roles/my_role/main.yml
- name: load facts
include: facts.yml
- name: run config tasks if needed
include: config.yml
when: fact_var1 and fact_vars2 and inv_var1 and reg_var1
You could also dynamically write a yaml variable file that get's included in your playbooks and contains variables about the configured state of your environment. This is a more global option and doesn't really work if you need to look at the configured status of individual machines. An extension of this would be to write status variables to host_vars or group_vars inventory variable files. These would get loaded automatically on a host by host basis.
Unfortunately, as far as I know, fact caching only caches host based facts such as those created by the setup module so wouldn't allow you to use set_fact to register a fact that, for example, a role had been completed and then conditionally check for that at the start of the role.
Instead you might want to consider using Ansible with another product to orchestrate it in a more complex fashion.

Prevent usage of wrong Ansible inventory

Imagine a server setup with Ansible with a production and a reference system/cluster and a separate server running Ansible (with ssh-keys). The different clusters are identified in two inventory files.
Every playbook usage will somehow look like ansible-playbook -i production ... or ansible-playbook -i reference....
How do I prevent an accidental usage of the production inventory?
This could happen easily by either using an history entry in the shell or copy the command from some documentation.
Some ideas:
As a start every documentation is referring to the reference inventory and is also using --check.
Use two different Ansible instances and the common part is mirrored via Git. But this will result in some overhead.
Prompt once for i.e. database passwords and use different on production and reference. But not all task/tags have such a password requirement.
Actually I'm looking for something like a master password when using a specific inventory? Or a task that is always executed even if a tag is used? What is the best practice here? Do you have other ideas? Or am I somehow totally wrong here and there is a better way for my scenario?
Your production inventory could either be vaulted or even just include a vaulted file (which doesn't actually have to contain anything).
This will mean that when you attempt to run:
ansible-playbook -i production playbook.yml
It will fail with a message telling you to pass a vault password to it.
The vault password could be something simple such as "pr0duct10n" so it's entire purpose is to remind people that they are about to run it in production and so to think things through (much like how many people use a sudo password).

control ansible task file execution

My current Ansible project is setup like so:
backup-gitlab.yml
roles/
aws_backups/
tasks/
main.yml
backup-vm.yml
gitlab/
tasks/
main.yml
start.yml
stop.yml
backup-gitlab.yml needs to do the following:
Invoke stop.yml on the gitlab host.
Invoke backup-gitlab.yml on a different host.
Invoke start.yml on the gitlab host.
The problem I'm running into is Ansible doesn't seem to support a way of choosing which task files to run within the same role in the same playbook. Before I was using tags to control what Ansible would do, but in this case tagging the include statements for start.yml and stop.yml doesn't work because Ansible doesn't appear to have a way to dynamically change the applied tags that are run once they are set through the command line.
I can't come up with an elegant way to achieve this.
Some options are:
Have each task file be contained within its own role. This is annoying because I will end up with a million roles that are not grouped in any way. It's essentially abandoning the whole 'role' concept.
Use include with hard coded paths. This is prone to error as things move around. Also, since Ansible deprecated combining with_items with include (or using any sort of dynamic looping with include), I can no longer quickly change up the task files being run. Any minor change in my workflow requires lots of coding changes. I would really like to stick with using tags from the command line to control exactly what Ansible does.
Use shell scripts to invoke separate Ansible playbooks.
Use conditionals (when clause) on every single Ansible action, and control what gets run by setting variables. While several people have recommended this on SO, it sounds awful. I will have to add the conditional to hundreds of actions and every time I run a playbook the output will be cluttered by hundred's of 'skip' statements.
Leverage Jinja templates and ansible's local_connection to dynamically build static main.yml files with all the required task files included in the proper order (using computed relative paths). Then invoke that computed main.yml file. This is dangerous and convoluted.
Use top level Ansible plays to invoke lower level plays. Seems messy, also this brings in problems when I need to pass variables between plays. Using Ansible's Python Api may help this.
Ansible strives to bring VMs into idempotent states but this isn't very helpful and is a dated way of thinking in my opinion (I would have stuck with Chef if that is all I wanted). I want to leverage Ansible to actually do things such as: actively change configuration states, kick off processes, monitor events, react to events, etc. Essentially I want it to automate as much of my job as possible. The current 'role' structure (with static configurations) that Ansible recommends doesn't fit this paradigm very well even though their usage of remote command execution via SSH gets us so close to the dream.
Just use a playbook for these types of management tasks.
Granted the skip statements do somewhat clutter the output. If you wish to fix that you can further breakdown the roles into something like aws_backups-setup and aws_backups-managment.
Also the roles documentation has some information on how you can run pre_tasks and post_tasks on roles.

Formatting shell output into structured data?

Are there any means of formatting the output of shell commands to a structured data format like JSON or XML to be processed by another application?
Use case: Bunch of CentOS servers on a network. I'd like to programatically login to them via SSH, run commands to obtain system stats and eventually run basic maintenance commands. Instead of parsing all the text output myself I'm wondering if there is anything out there will help me return the data in a structured format? Even if only some shell commands were supported that would be a head start.
Sounds like the task for SNMP.
It is possible to use puppet fairly lightly. You can configure it to run it's checks only on what you want to check for.
Your entire puppet config could consist of:
exec { "yum install foo":
unless => "some-check for software",
}
That would run yum install foo but only if some-check for software failed.
That said there are lots of benefits if you're managing more that a couple of servers to getting as much of your config and build into puppet manifests (or cfengine, bcfg2, or similar) as possible.
Check out Nagios (http://www.nagios.org/) for remote system monitoring. What you are looking for may already exist out there.

Resources