Problem
I'm configuring Apache vhosts with Ansible. Creating the configs takes a lot of time.
Version info:
ansible-playbook 2.7.10
Apache/2.4.29
All vhosts are on the same server.
I'm using the file system structure that Apache suggests:
One file per site (vhost and port), saved to sites-available/443_example.com.conf; then I generate a symlink in sites-enabled.
Sequentially running the tasks for the configuration takes about 5 minutes for 34 vhosts if there are no changes.
I created a role for the apache configuration. I identified 13 tasks that have to be run for every vhost:
openssl_privatekey
openssl_csr
openssl_certificate
6 tasks: put the files in the correct places in the file system, but only if there is no Let's Encrypt configuration present
create template
enable template
2 tasks: delete old config
For example: writing the config template, or generating a self-signed certificate.
It would be nice if I could parallelize across vhosts. I grouped the tasks I want to run in parallel in the file parallel.yml; the contents of this file have to be processed in the correct order.
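A condensed sketch of parallel.yml for reference (module parameters and file paths are illustrative, and only a subset of the 13 tasks is shown; inside the loop each vhost is available as item):

- name: generate private key
  openssl_privatekey:
    path: "/etc/ssl/private/{{ item.name }}.pem"

- name: generate CSR
  openssl_csr:
    path: "/etc/ssl/csr/{{ item.name }}.csr"
    privatekey_path: "/etc/ssl/private/{{ item.name }}.pem"
    common_name: "{{ item.name }}"

- name: generate self-signed certificate
  openssl_certificate:
    path: "/etc/ssl/certs/{{ item.name }}.crt"
    privatekey_path: "/etc/ssl/private/{{ item.name }}.pem"
    csr_path: "/etc/ssl/csr/{{ item.name }}.csr"
    provider: selfsigned

- name: write vhost config
  template:
    src: vhost.conf.j2
    dest: "/etc/apache2/sites-available/{{ item.port }}_{{ item.name }}.conf"

- name: enable vhost
  file:
    src: "/etc/apache2/sites-available/{{ item.port }}_{{ item.name }}.conf"
    dest: "/etc/apache2/sites-enabled/{{ item.port }}_{{ item.name }}.conf"
    state: link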
Solutions
These are the solutions I tried but none of them worked:
1) use async on every task:
I'm using a template task in parallel.yml, and the template module cannot be run with async.
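Roughly what that attempt looked like (the key path is illustrative); async works for regular modules such as openssl_privatekey, but not for the template action:

- name: generate private key
  openssl_privatekey:
    path: "/etc/ssl/private/{{ item.name }}.pem"
  async: 60    # allow up to 60 seconds of runtime
  poll: 0      # fire and forget; don't wait for the result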
2) use one async while including tasks:
- name: 'configure vhost'
  include_tasks: parallel.yml
  loop: "{{ vhosts }}"
  async: 60

There is a known issue that makes Ansible ignore the async keyword on include_tasks; the loop items are processed serially: https://github.com/ansible/ansible/issues/22716
3) use "delegate to" to create more forks of Ansible:
- name: 'configure vhost'
  include_tasks: parallel.yml
  loop: "{{ vhosts }}"
  delegate_to: ansible#example.com

The loop items are processed serially: https://github.com/ansible/ansible/issues/37995
4) use strategy: free
"A second strategy ships with Ansible - free - which allows each host to run until the end of the play as fast as it can." https://docs.ansible.com/ansible/latest/user_guide/playbooks_strategies.html
strategy: free allows multiple hosts to be run in parallel, but all my vhosts are on a single host, so it doesn't help here.
5) increase forks
"Since Ansible 1.3, the fork number is automatically limited to the number of possible hosts at runtime," https://docs.ansible.com/ansible/2.4/intro_configuration.html#forks
Same problem as before.
Summary
How can I run the tasks in parallel?
How can I improve the performance?
Related
I have 80+ hosts that run my application, and I'm updating a long-standing Ansible playbook to change our load balancer. In our current setup, hosts can be added to or removed from the load balancer in one Ansible play by shelling out to the AWS CLI. However, we're switching to a load balancer configured on a handful of our own hosts, and we will take hosts in and out by manipulating text files on those hosts with Ansible. I need an inner loop over different hosts within a playbook while using serial.
I'm having trouble structuring the playbook such that I can fan out blockinfile commands to hosts in group tag_Type_edge while deploying to the 80 tag_Type_app hosts with serial: 25%.
Here's what I want to be able to do:
---
- hosts: tag_Type_app
  serial: "25%"
  pre_tasks:
    - name: Gathering ec2 facts
      action: ec2_metadata_facts
    - name: Remove from load balancers
      debug:
        msg: "This is where I'd fan out to multiple different hosts from group tag_Type_edge to manipulate text files to remove the 25% of hosts from tag_Type_app from the load balancer"
  tasks:
    - name: Do a bunch of work to upgrade the app on the tag_Type_app machines while out of the load balancer
      debug:
        msg: "deploy new code, restart service"
  post_tasks:
    - name: Put back in load balancer
      debug:
        msg: "This is where I'd fan out to multiple different hosts from group tag_Type_edge to manipulate text files to *add* the 25% of hosts from tag_Type_app back into the load balancer"
How can I structure this to allow for the inner loop over tag_Type_edge while using serial: 25% on all the tag_Type_app boxes?
If I may say so, yuk.
On the ACA project, each host had a file called /etc/ansible/facts.d/load_balancer_state.fact. That was just an INI file that set lb_state to enabled, disabled, or stopped.
We then ran the setup module (or gather_facts: yes) to get the state of each host, and ran the template module to create the load balancer config file. Very clean. Very simple.
To change a state, change the file and re-run the template module.
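A minimal sketch of that setup (the group, host, and template names are my own placeholders):

# /etc/ansible/facts.d/load_balancer_state.fact on each host -- plain INI:
#   [state]
#   lb_state=enabled

- hosts: app_servers              # placeholder group of load-balanced hosts
  gather_facts: yes               # exposes ansible_local.load_balancer_state.state.lb_state
  tasks:
    - name: rebuild the load balancer config from every host's state
      template:
        src: loadbalancer.conf.j2 # the template iterates over hostvars to read each host's lb_state
        dest: /etc/loadbalancer/loadbalancer.conf
      delegate_to: lb01           # placeholder load balancer host
      run_once: true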
This was with dynamic inventory.
If you have static inventory, it's even easier. Just set lb_state on each host, either in an INI file like host1 lb_state=enabled or in files in the host_vars directory. Change the inventory, re-run the template module, and (if necessary) tell the load balancer to reload the config file.
I want to build a docker image locally and deploy it so it can then be pulled on the remote server I'm deploying to. To do this I first need to check out code from git to be built.
I have an existing role which installs git, sets up keys for reading from our repo etc. I want to run this role locally to check out the code I care about.
I looked at local_action, delegate_to, etc., but haven't figured out an easy way to do this. The best approach I could find was:
- name: check out project from git
  delegate_to: localhost
  include_role:
    name: configure_git
However, this doesn't work: I get a complaint that there is a syntax error on the name line. If I remove the delegate_to line, it works (but runs on the wrong server). If I replace include_role with debug, it runs locally. It's almost as if Ansible explicitly refuses to run an included role locally, though I can't find that anywhere in the documentation.
Is there a clean way to run this, or other roles, locally?
Extract from the include_role module documentation
Task-level keywords, loops, and conditionals apply only to the include_role statement itself.
To apply keywords to the tasks within the role, pass them using the apply option or use ansible.builtin.import_role instead.
Ignores some keywords, like until and retries.
I actually don't know if the error you get is linked to delegate_to being ignored (I seriously doubt it is...). Regardless, that's not the correct way to use it here; it should be:
- name: check out project from git
  include_role:
    name: configure_git
    apply:
      delegate_to: localhost
Moreover, this is most probably a bad idea. Imagine your play targets 100 servers: the role will run one hundred times (unless you also apply run_once: true). I would run the role "normally" on localhost in a dedicated play, then do the rest of the job on my targets in the next one(s):
- name: Prepare env on localhost
  hosts: localhost
  roles:
    - role: configure_git

- name: Do the rest on other hosts
  hosts: my_group
  tasks:
    - name: dummy
      debug:
        msg: "Dummy"
The real scenario: I want to get the resource ID of an SQS queue in AWS, which is returned after a playbook runs, and then use this variable in files to configure the application.
Persisting variables from one playbook to another
Checking the documentation, modules like set_fact and register have scope only for that specific host. But there are many reasons to use one host's variables on another.
Alternatives I can think of:
Using the command module to echo the variables to a file, then loading that file via the vars section or an include (see the sketch after this list).
Setting environment variables and then accessing them, but this would be difficult.
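For what it's worth, the first alternative could look roughly like this (using copy rather than echoing via command; the file path is illustrative):

- hosts: serverA
  tasks:
    - name: persist the variable on the control machine
      copy:
        content: "sqs_id: {{ sqs_id }}"
        dest: /tmp/sqs_vars.yml
      delegate_to: localhost

- hosts: serverB
  vars_files:
    - /tmp/sqs_vars.yml
  tasks:
    - debug:
        var: sqs_id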
So what is the solution?
If you're gathering facts, you can access hostvars via the normal jinja2 + variable lookup:
e.g.
- hosts: serverA.example.org
  gather_facts: True
  ...
  tasks:
    - set_fact:
        taco_tuesday: False
and then, if this has run, on another host:
- hosts: serverB.example.org
  ...
  tasks:
    - debug:
        var: hostvars['serverA.example.org']['ansible_memtotal_mb']
    - debug:
        var: hostvars['serverA.example.org']['taco_tuesday']
Keep in mind that if you have multiple Ansible control machines (where you call ansible and ansible-playbook from), you should take advantage of the fact that Ansible can cache its facts/variables (currently in Redis or JSON files), so that the control machines are less likely to have different hostvars. With this, you could point your control machines at a file in a shared folder (which has its risks -- what if two control machines run against the same host at the same time?), or set/get facts from a Redis server.
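For example, a JSON-file fact cache can be enabled in ansible.cfg (the path and timeout are illustrative; use fact_caching = redis for a Redis backend):

[defaults]
gathering = smart
fact_caching = jsonfile
fact_caching_connection = /path/to/shared/factcache
fact_caching_timeout = 86400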
For my uses of Amazon data, I prefer to just fetch the resource each time using a tag/metadata lookup. I wrote an Ansible plugin that allows me to do this a little more easily as I prefer this to thinking about hostvars and run ordering (but your mileage may vary).
You can pass variables On The Command Line: http://docs.ansible.com/ansible/playbooks_variables.html#passing-variables-on-the-command-line
ansible-playbook release.yml --extra-vars "version=1.23.45 other_variable=foo"
You can use a local connection to run one playbook, register a variable from it, and pass it to another playbook:

- hosts: 127.0.0.1
  connection: local
  tasks:
    - shell: ansible-playbook -i ...
      register: sqs_id
    - shell: ansible-playbook -i ... -e "sqs_id={{ sqs_id.stdout }}"
Also delegation might be useful in this scenario:
http://docs.ansible.com/ansible/playbooks_delegation.html#delegation
You can also store the output in a local file and read it back (http://docs.ansible.com/ansible/playbooks_delegation.html#delegation):

- name: take a sqs id
  local_action: command cat ~/sqs_id
PS:
I don't understand why you can't write one complex playbook that includes many roles sharing variables.
You can put "common" variables into host_vars or group_vars; this way all the servers have access to them.
Another way may be to create a custom ansible module/lookup plugin to hide all the boilerplate code and get an easy and flexible access to the variables you need.
I had a similar issue with Azure DevOps pipelines.
I created VMs with Terraform; the SSH keys and Windows username/password were generated by Terraform and stored in a Key Vault.
I then needed to query the Key Vault before running Ansible on all created VMs. I ended up using the Azure Python SDK to get all secrets. I also generate an inventory file and a host_vars folder with a file for each VM.
The actual playbook is now very basic and does the job perfectly. All variables for Terraform and Ansible are in a JSON file, and the Python script is less than 30 lines.
I have a script that sets up all the servers. Now I'm trying to figure out a good way to configure them to talk to each other, e.g. configure the application server to talk to a particular database server.
Test app 1
db01
app01
app02
mem01
Test app 2
db02
app03
mem02
The only thing I could come up with is a role that takes the servers as params, but I dislike that I also have to specify the hosts twice.
- name: Test app 1
  hosts: [db01, app01, app02, mem01]
  roles:
    - {role: app, db: db01, ap: [app01, app02], mem: mem01}
How organized is your inventory file?
Looking at what you posted, this might be a good inventory file organization for you:
[testapp1-dbServers]
db01
[testapp1-appServers]
app01
app02
[testapp1-memServers]
mem01
[testapp2-dbServers]
db02
[testapp2-appServers]
app03
[testapp2-memServers]
mem02
[testapp1:children]
testapp1-dbServers
testapp1-appServers
testapp1-memServers
[testapp2:children]
testapp2-dbServers
testapp2-appServers
testapp2-memServers
[dbServers:children]
testapp1-dbServers
testapp2-dbServers
[appServers:children]
testapp1-appServers
testapp2-appServers
[memServers:children]
testapp1-memServers
testapp2-memServers
This might be overkill if you have no plans to increase the number of servers in any of the first 6 buckets, but it allows you to do things like group_vars files (individual ones for some or all groupings - testapp1, testapp2, dbServers, etc) and clean up your playbook file:
- name: Test app 1
  hosts: testapp1          # all vars passed via group_vars file
  roles:
    - generic_server_setup

- name: DB Server setup
  hosts: dbServers         # all vars passed via group_vars file
  roles:
    - install_postgres
    - other_db_things
The final thing that will help you the most can be found here.
Specifically, getting access to all the groups the current host is in and all the hosts in a group.
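That presumably refers to Ansible's magic variables group_names and groups, for example:

- name: list the groups the current host is in
  debug:
    var: group_names

- name: list all hosts in the dbServers group
  debug:
    msg: "{{ groups['dbServers'] }}"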
QUICK FIX: If you want to sacrifice some organization because you aren't worried about scaling and are not annoyed by the same information being in multiple locations, just add the relevant hosts as vars to testapp1 and testapp2 files under group_vars.
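That quick fix might look like this in group_vars/testapp1, mirroring the role params from the question:

# group_vars/testapp1
db: db01
ap:
  - app01
  - app02
mem: mem01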
I am using ansible to script a deployment for an API. I would like this to work sequentially through each host in my inventory file so that I can fully deploy to one machine at a time.
With the out-of-the-box behaviour, each task in my playbook is executed for each host in the inventory file before moving on to the next task.
How can I change this behaviour to execute all tasks for a host before starting on the next host? Ideally I would like to only have one playbook.
Thanks
Have a closer look at Rolling Updates:
What you are searching for is:

- hosts: webservers
  serial: 1
  tasks:
    - name: ...
Use --forks=1 to specify the number of parallel processes to use (default=5).
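For example (site.yml is a placeholder):

ansible-playbook site.yml --forks=1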
Strategies control how tasks are parallelized across hosts. See https://docs.ansible.com/ansible/latest/user_guide/playbooks_strategies.html
The main strategies are linear (the default) and free (the quickest); note that serial is a play-level keyword, not a strategy.
- hosts: all
  strategy: free
  tasks:
    ...