Ansible: Get number of hosts in group - ansible

I'm trying to get the number of hosts of a certain group.
Imagine an inventory file like this:
[maingroup]
server-[01:05]
Now in my playbook I would like to get the number of hosts that are part of maingroup, which would be 5 in this case, and store it in a variable to be used in a template in one of the playbook's tasks.
At the moment I'm setting the variable manually, which is far from ideal.
vars:
  HOST_COUNT: 5

vars:
  HOST_COUNT: "{{ groups['maingroup'] | length }}"

Also, without an explicit group name:
vars:
  HOST_COUNT: "{{ ansible_play_hosts | length }}"
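For context, here is a minimal sketch of how the first variant might sit in a complete play. The play layout and the debug task are illustrative assumptions, not part of the original question; a template task would see HOST_COUNT in the same way.

```yaml
# Sketch only: the play structure and the debug task are assumed for illustration.
- hosts: maingroup
  gather_facts: false
  vars:
    # Number of hosts listed under [maingroup] in the inventory
    HOST_COUNT: "{{ groups['maingroup'] | length }}"
  tasks:
    - name: Show the computed host count
      debug:
        msg: "maingroup has {{ HOST_COUNT }} hosts"
```

If the value is used in arithmetic inside a template, piping it through the int filter first is a cautious choice.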

Related

Ansible: How to find aggregated file size across inventory hosts?

I'm able to find the total size of all the three files in variable totalsize on a single host as shown below.
cat all.hosts
[destnode]
myhost1
myhost2
myhost3
cat myplay.yml
- name: "Play 1"
  hosts: "destnode"
  gather_facts: false
  tasks:
    - name: Fail if file size is greater than 2GB
      include_tasks: "{{ playbook_dir }}/checkfilesize.yml"
      with_items:
        - "{{ source_file_new.splitlines() }}"
cat checkfilesize.yml
- name: Check file size
  stat:
    path: "{{ item }}"
  register: file_size
- set_fact:
    totalsize: "{{ totalsize | default(0) | int + (file_size.stat.size / 1024 / 1024) | int }}"
- debug:
    msg: "TOTALSIZE: {{ totalsize }}"
To run:
ansible-playbook -i all.hosts myplay.yml -e source_file_new="/tmp/file1.log\n/tmp/file1.log\n/tmp/file1.log"
The above play works fine and gives me the total size of all the files listed in the variable source_file_new on each individual host.
My requirement is to get the total size of all the files from all three (or more) hosts in the destnode group.
So, if each file is 10 MB on each host, the current playbook prints 10+10+10=30MB on host1, and likewise on host2 and host3.
Instead, I wish to get the sum of all the sizes from all the hosts, like below:
host1 (10+10+10) + host2 (10+10+10) + host3 (10+10+10) = 90MB
Extract the totalsize facts for each node in destnode from hostvars and sum them up.
In a nutshell, at the end of your current checkfilesize.yml task file, replace the debug task with:
- name: Show total size for all nodes
  vars:
    overall_size: "{{ groups['destnode'] | map('extract', hostvars, 'totalsize')
      | map('int') | sum }}"
  debug:
    msg: "Total size for all nodes: {{ overall_size }}"
  run_once: true
If you need to reuse that value later, you can store it right away in a fact that will be set with the same value for all hosts:
- name: Set overall size as fact for all hosts
  set_fact:
    overall_size: "{{ groups['destnode'] | map('extract', hostvars, 'totalsize')
      | map('int') | sum }}"
  run_once: true
- name: Show the overall size (one result with the same value for each host)
  debug:
    msg: "Total size for all nodes: {{ overall_size }} - (from {{ inventory_hostname }})"
As an alternative, you can replace set_fact with a variable declaration at play level.
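A minimal sketch of that play-level declaration, assuming the play and task file from the question. The final debug task is added here only to show the variable being read; the expression is evaluated when it is referenced, so it should only be used after every host has set its totalsize fact.

```yaml
# Sketch only: reuses the play from the question; the last task is illustrative.
- name: "Play 1"
  hosts: "destnode"
  gather_facts: false
  vars:
    # Sum of the per-host totalsize facts across the destnode group
    overall_size: "{{ groups['destnode'] | map('extract', hostvars, 'totalsize') | map('int') | sum }}"
  tasks:
    - name: Fail if file size is greater than 2GB
      include_tasks: "{{ playbook_dir }}/checkfilesize.yml"
      with_items:
        - "{{ source_file_new.splitlines() }}"
    - name: Show the overall size once all hosts have their totals
      debug:
        msg: "Total size for all nodes: {{ overall_size }}"
      run_once: true
```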
It seems you are trying to implement (distributed) programming paradigms that are not directly possible, at least not in that way, since Ansible is not a programming language or a distributed computing framework but a Configuration Management Tool in which you declare a state. Such approaches are therefore not recommended and should probably be avoided.
Since your use case looks to me like a typical MapReduce scenario, I understand from your description that you would like to implement a kind of Reducer in a Distributed Environment with Ansible.
You already observed that the facts are distributed over the hosts in your environment. To sum them up, they need to be aggregated on one of the hosts, probably the Control Node.
To do so:
It might be possible to use Delegating facts for your set_fact task to gather all the information needed for the sum on one host.
Another approach could be to let your task create and add custom facts about the summed-up file size during the run. Those Custom Facts could then be gathered and cached on the Control Node during the next run.
A third option: since Custom Facts can be simple files, one could probably create a simple cron job that writes the necessary .fact file with the requested information (file size, etc.) on a schedule.
Further Documentation
facts.d or local facts
Introduction to Ansible facts
Similar Q&A
Ansible: How to define ... a global ... variable?
Summary
My requirement is to get the total size of all the files from all the three (or more) hosts ...
Instead of creating a playbook that generates and calculates values (facts) at execution time, it is recommended to define something on the Target Nodes and create a playbook that just collects the facts in question.
For example
... add dynamic facts by adding executable scripts to facts.d. For example, you can add a list of all users on a host to your facts by creating and running a script in facts.d.
Such a script can also report files and their sizes.
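As a rough sketch of the collecting side, assume each target node already provides a local fact file named filesize.fact with a key total_mb. Both names are hypothetical and only serve the illustration. Once facts are gathered, the values appear under ansible_local and can be summed like any other host variable.

```yaml
# Sketch only: 'filesize' and 'total_mb' are hypothetical names for a custom
# fact that each node is assumed to expose through facts.d.
- hosts: destnode
  gather_facts: true   # local facts are collected during fact gathering
  tasks:
    - name: Sum the locally provided sizes across all nodes
      debug:
        msg: "{{ groups['destnode'] | map('extract', hostvars, ['ansible_local', 'filesize', 'total_mb']) | map('int') | sum }}"
      run_once: true
```

With this split, the playbook only reads and aggregates values, while producing them is left to the nodes themselves.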

Ansible - Is it possible to loop over a list of objects in input within a playbook

I am trying to create a playbook that manages the creation of some load balancers.
The playbook takes a configuration YAML file as input, formatted like so:
-----configuration.yml-----
virtual_servers:
  - name: "test-1.local"
    type: "standard"
    vs_port: 443
    description: ""
    monitor_interval: 30
    ssl_flag: true
    (omitted)
As you can see, this defines a list of load-balancing objects with their respective specifications.
To create, for example, a monitor instance that depends on these definitions, I wrote this task, which is defined within a playbook.
-----Playbook snippet-----
...
- name: "Creator | Create new monitor"
  include_role:
    name: vs-creator
    tasks_from: pool_creator
  with_items: "{{ virtual_servers }}"
  loop_control:
    loop_var: monitor_item
...
-----Monitor Task-----
- name: "Set monitor facts - Site 1"
  set_fact:
    monitor_name: "{{ monitor_item.name }}"
    monitor_vs_port: "{{ monitor_item.vs_port }}"
    monitor_interval: "{{ monitor_item.monitor_interval }}"
    monitor_partition: "{{ hostvars['localhost']['vlan_partition'] | first }}"
    ...
(omitted)
- name: "Create HTTP monitor - Site 1"
  bigip_monitor_http:
    state: present
    name: "{{ monitor_name }}_{{ monitor_vs_port }}.monitor"
    partition: "{{ monitor_partition }}"
    interval: "{{ monitor_interval }}"
    timeout: "{{ monitor_interval | int * 3 | int + 1 | int }}"
    provider:
      server: "{{ inventory_hostname }}"
      user: "{{ username }}"
      password: "{{ password }}"
  delegate_to: localhost
  when:
    - site: 1
    - monitor_item.name | regex_search(regex_site_1) != None
...
As you can probably already see, I have a few problems with this code. The main ones I would like to address are the following:
The creation of a load balancer (virtual_server) involves multiple tasks (creation of a monitor, a pool, etc.), and I need to treat each list element in the configuration as an object to create, with all the necessary definitions.
I need to do this for different sites, which correspond to our datacenters. For this I use regex_site_1 and site: 1 to pick the correct one, though I realize that this is not ideal.
The script currently works, but I believe it is not well organized, and I'm at a loss as to which approach to take in developing this playbook. I was thinking about looping over the playbook with each element from the configuration list, but apparently this is not possible. Is there any way to do this? An example would be appreciated.
Thanks in advance for any input you might have.
If you can influence the input data, I advise turning the elements of virtual_servers into hosts.
In this case the inventory will look like this:
virtual_servers:
  hosts:
    test-1.local:
      vs_port: 443
      description: ""
      monitor_interval: 30
      ssl_flag: true
And all the code becomes a bliss:
- hosts: virtual_servers
  tasks:
    - name: Do something
      delegate_to: other_host
      debug: msg=done
...
Ansible will create all the loops for you for free (no need for include_role or odd loops), and most things involving variables become very easy. Each host has its own set of variables, which you just ... use.
The part where 'we are doing configuration on a real host, not this virtual one' is handled by delegate_to.
This is idiomatic Ansible, and it's better to follow this way. Whenever you have include_role within a loop, you have almost certainly made a mistake in designing the inventory.
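To make this concrete, here is a hedged sketch of how the monitor task from the question might look once each virtual server is an inventory host. The variable bigip_host is a hypothetical name for the address of the real device, and the partition parameter is left out for brevity; everything else reuses module parameters already shown in the question.

```yaml
# Sketch only: 'bigip_host' is a hypothetical variable pointing at the real
# load balancer; the virtual servers themselves are never connected to.
- hosts: virtual_servers
  gather_facts: false
  tasks:
    - name: Create HTTP monitor for this virtual server
      bigip_monitor_http:
        state: present
        # inventory_hostname replaces monitor_item.name from the looped version
        name: "{{ inventory_hostname }}_{{ vs_port }}.monitor"
        interval: "{{ monitor_interval }}"
        timeout: "{{ monitor_interval | int * 3 + 1 }}"
        provider:
          server: "{{ bigip_host }}"
          user: "{{ username }}"
          password: "{{ password }}"
      delegate_to: localhost
```

Per-site differences could then be expressed as inventory groups rather than regex checks in when conditions.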

Run ansible task only once per each unique fact value

I have a dynamic inventory that assigns a "fact" to each host, called a 'cluster_number'.
The cluster numbers are not known in advance, but there is one or more hosts that are assigned the same number. The inventory has hundreds of hosts and 2-3 dozen unique cluster numbers.
I want to run a task for all hosts in the inventory, however I want to execute it only once per each group of hosts sharing the same 'cluster_number' value. It does not matter which specific host is selected for each group.
I feel like there should be a relatively straightforward way to do this with Ansible, but I can't figure out how. I've looked at group_by, when, loop, delegate_to, etc., but with no success yet.
An option would be to
group_by the cluster_number
run_once a loop over cluster numbers
and pick the first host from each group.
For example given the hosts
[test]
test01 cluster_number='1'
test02 cluster_number='1'
test03 cluster_number='1'
test04 cluster_number='1'
test05 cluster_number='1'
test06 cluster_number='2'
test07 cluster_number='2'
test08 cluster_number='2'
test09 cluster_number='3'
test10 cluster_number='3'
[test:vars]
cluster_numbers=['1','2','3']
the following playbook
- hosts: all
  gather_facts: no
  tasks:
    - group_by: key=cluster_{{ cluster_number }}
    - debug: var=groups['cluster_{{ item }}'][0]
      loop: "{{ cluster_numbers }}"
      run_once: true
gives
> ansible-playbook test.yml | grep groups
"groups['cluster_1'][0]": "test01",
"groups['cluster_2'][0]": "test06",
"groups['cluster_3'][0]": "test09",
To execute tasks on the targets, use include_tasks (instead of debug in the loop above) and delegate_to the target:
- set_fact:
    my_group: "cluster_{{ item }}"
- command: hostname
  delegate_to: "{{ groups[my_group][0] }}"
Note: you can collect the list cluster_numbers from the inventory:
cluster_numbers: "{{ hostvars|json_query('*.cluster_number')|unique }}"
If you don't mind cluttered play logs, here's a way:
- hosts: all
  gather_facts: no
  serial: 1
  tasks:
    - group_by:
        key: "single_{{ cluster_number }}"
      when: groups['single_'+cluster_number] | default([]) | count == 0
- hosts: single_*
  gather_facts: no
  tasks:
    - debug:
        msg: "{{ inventory_hostname }}"
serial: 1 is crucial in the first play so that the when statement is re-evaluated for every host.
After the first play you'll have one group per cluster, each containing only a single host.
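For comparison, a variation that is not taken from the answers above: group the hosts by cluster_number as before, then let each host check whether it is the first member of its own cluster group. This sketch assumes every host defines cluster_number; the command task is only a placeholder.

```yaml
# Sketch only: a when-based variation, not one of the answers above.
- hosts: all
  gather_facts: false
  tasks:
    - group_by:
        key: "cluster_{{ cluster_number }}"
    - name: Run only on the first host of each cluster group
      command: hostname
      # Every other member of the group skips this task
      when: inventory_hostname == groups['cluster_' ~ cluster_number] | first
```

This avoids delegation and a second play, at the cost of a skipped result for every other host in the log.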

Ansible first hostname of groups

I am not sure how to find the first Ansible hostname from group_names. Could you kindly advise me on how to do it?
hosts
[webservers]
server1
server2
server3
[webserver-el7]
server4
server5
server6
And I have two different playbooks, one for each host group:
playbook1.yml
- name: deploy app
  hosts: webservers
  serial: 8
  roles:
    - roles1
playbook2.yml
- name: deploy app
  hosts: webservers-el7
  serial: 8
  roles:
    - roles1
The problem is that I have to delegate a task to the first host of each group. Previously I only used the webservers group, so it was much easier, using the task below:
- name: syncing web files to {{ version_dir }}
  synchronize:
    src: "{{ build_dir }}"
    dest: "{{ version_dir }}"
    rsync_timeout: 60
  delegate_to: "{{ groups.webservers | first }}"
Now that I have two different groups, how can I select the first host of each group, so that it is more dynamic?
If you want the first host of the current play to be a kind of master host to sync from, I'd recommend another approach: use the play_hosts or ansible_play_hosts variable (depending on your Ansible version). See magic variables.
For example: delegate_to: "{{ play_hosts | first }}".
The thing is, when you say hosts: webservers-el7 to Ansible, webservers-el7 is a pattern. Ansible searches for hosts matching this pattern and feeds them into the play. You could just as well have written webservers-el*. So inside the play you don't have any variable that tells you "I'm running this play on hosts from group webservers-el7...". You can only do some guesswork by analyzing the group_names and groups magic variables, but this becomes clumsy when a host is in several groups.
For hosts that are in a single group only, you may try: groups[group_names | first] | first
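A sketch of the synchronize task from the question with this kind of delegation applied. The task body is copied from the question; placing it directly under tasks instead of inside the role is an assumption made to keep the example short.

```yaml
# Sketch only: the same play works unchanged for either host group,
# because the delegate target is derived from the hosts in the play.
- name: deploy app
  hosts: webservers-el7
  serial: 8
  tasks:
    - name: syncing web files to {{ version_dir }}
      synchronize:
        src: "{{ build_dir }}"
        dest: "{{ version_dir }}"
        rsync_timeout: 60
      delegate_to: "{{ ansible_play_hosts | first }}"
```

Because serial is set, it may be worth checking whether the first host of the whole play or of the current batch (ansible_play_batch) is the one you want.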
To get any element from a group, use groups[group_name][0...n].
This will get the first element from the group:
- debug: msg="{{ groups['group_name'][0] }}"

Registering Each Host Specific Value from Dictionary

We wanted to have a single playbook for all deployments, looping over multiple hosts. Ansible is called from a Jenkins pipeline that passes in the environments, for example dev6 and dev8:
env1=dev6
env2=dev8
Pipeline Call:
ansible-playbook -i hosts --limit $env1:$env2 deploy_test.yml -e "env1={{$env1}} env2={{$env2}}"
I defined all the host-specific variables (dev1, dev2, ..., PERF8, etc.) in a single file so that it is easy to manage and maintain:
dev6:
  - { deploy_domain: "Dev6Domain",
      WL_Admin: "DEV6WLAdmin",
      WL_Managed: "DEV6Managed" }
dev7:
  - { deploy_domain: "Dev7Domain",
      WL_Admin: "Dev7WLAdmin",
      WL_Managed: "Dev7Managed" }
Playbook "Deploy_test.yml"
- hosts: all
  vars_files:
    - host_variables.yml
  tasks:
    - debug: msg='Target Domain is "{{ item[0].deploy_domain }}"'
      with_nested:
        - "{{ env1 }}"
        - "{{ env2 }}"
The env1 and env2 values are read from Jenkins; no issues there.
Problem-1: When the playbook runs on dev6 first, it takes the dev8 values as well, since both are listed under with_nested.
Problem-2: How do I register the values specific to each environment?
For example, further down the playbook, when I say mkdir /tmp/{{ deploy_domain }}, I need separate values for dev6 and dev8.
Here is an example of how you can read a name-specific variable for every host:
hosts:
[dev6]
box1
[dev8]
box2
host_variables.yml:
dev6:
  deploy_domain: "Dev6Domain"
  WL_Admin: "DEV6WLAdmin"
  WL_Managed: "DEV6Managed"
dev8:
  deploy_domain: "Dev8Domain"
  WL_Admin: "Dev8WLAdmin"
  WL_Managed: "Dev8Managed"
I stripped the list level out of the original host_variables.yml because it is not necessary in this case: there is always a single element in the list.
deploy_test.yml:
- hosts: all
  tasks:
    - include_vars: host_variables.yml
    - set_fact:
        my_env: "{{ hostvars[inventory_hostname][group_names[0]] }}"
    - debug: msg="My domain = {{ my_env.deploy_domain }}"
Execution: ansible-playbook -i hosts --limit $env1:$env2 deploy_test.yml
This will execute deploy_test.yml for all hosts in the groups named by the environment variables env1 and env2.
At the beginning of the playbook, we load everything from host_variables.yml as host facts.
Then set_fact extracts the variable named after the current host's group name into my_env.
So box1 will have the dev6 values as my_env and box2 will have the dev8 values.
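Building on that, the mkdir from Problem-2 could be sketched as follows. The file task is an assumed stand-in for the mkdir mentioned in the question; the rest repeats the playbook above.

```yaml
# Sketch only: the file task is an illustrative replacement for "mkdir".
- hosts: all
  tasks:
    - include_vars: host_variables.yml
    - set_fact:
        my_env: "{{ hostvars[inventory_hostname][group_names[0]] }}"
    - name: Create a per-environment directory
      file:
        path: "/tmp/{{ my_env.deploy_domain }}"
        state: directory
```

Each host resolves my_env from its own group, so box1 and box2 end up with different directory names without any loop over env1 and env2.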

Resources