Unable to get part of a play to run locally in Ansible

This is all running within AWX, which is hosted on-prem. I'm trying to manage some EC2 instances within AWS. I've set up the bastion jump host and can get all my other plays to work correctly.
However, there is one simple job template I want to provide to a few devs. Essentially, when they make a change to the code, it lets them clear the opcache and invalidate the specific files in CloudFront.
I want the CloudFront API call (the cloudfront_invalidation module) to run locally off AWX and then, if this is successful, notify the two web server instances to restart their PHP and Apache processes.
---
- name: Restart httpd and php-fpm
  remote_user: ec2-user
  hosts: all
  become: true

  tasks:
    - name: Invalidate paths in CloudFront
      cloudfront_invalidation:
        distribution_id: "{{ distribution_id }}"
        aws_access_key: "{{ aws_access_key }}"
        aws_secret_key: "{{ aws_secret_key }}"
        target_paths: "{{ cloudfront_invalidations.split('\n') }}"
      delegate_to: 127.0.0.1
      notify:
        - Restart service httpd
        - Restart service php-fpm

  handlers:
    - name: Restart service httpd
      service:
        name: httpd
        state: restarted

    - name: Restart service php-fpm
      service:
        name: php-fpm
        state: restarted
However, when running the play it seems to ignore the delegate_to directive and instead runs the invalidation twice, once for each host, so I'm unsure whether it's actually running locally. I've tried adding the run_once flag, but that then restarted httpd and PHP on only one host.
Any ideas?
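One approach that sidesteps the per-host delegation behaviour (a sketch only, untested against this AWX setup; the play split is an assumption, and it presumes distribution_id and the credential variables come from the job template survey or extra vars so they are visible to localhost) is to run the invalidation in its own play against localhost and restart the services in a second play:

---
- name: Invalidate paths in CloudFront (runs once, on the AWX node)
  hosts: localhost
  connection: local
  gather_facts: false
  tasks:
    - name: Invalidate paths in CloudFront
      cloudfront_invalidation:
        distribution_id: "{{ distribution_id }}"
        aws_access_key: "{{ aws_access_key }}"
        aws_secret_key: "{{ aws_secret_key }}"
        target_paths: "{{ cloudfront_invalidations.split('\n') }}"

- name: Restart httpd and php-fpm on the web servers
  hosts: all
  remote_user: ec2-user
  become: true
  tasks:
    - name: Restart service httpd
      service:
        name: httpd
        state: restarted

    - name: Restart service php-fpm
      service:
        name: php-fpm
        state: restarted

If the invalidation fails, the only host in the first play is marked as failed and the run should stop before the restart play, which gives the "restart only on success" behaviour without delegate_to or run_once.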

Related

Ansible - Execute task based on Linux service status

How can I write an Ansible playbook to start my nginx service only if firewalld.service is in stop state?
Note: execution is being done on a sandbox server, so there are no issues with firewalld being in the stopped state.
You can check the service status, register a variable, and start the service based on the result.
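For the register-based variant, a minimal sketch (untested; the systemctl command and the 'active' string comparison are assumptions about a systemd host) could look like this:

- name: Check firewalld status
  command: systemctl is-active firewalld
  register: firewalld_status
  failed_when: false      # is-active exits non-zero when the unit is not active
  changed_when: false     # a status check should never report a change

- name: Start nginx if firewalld is not active
  service:
    name: nginx
    state: started
  when: firewalld_status.stdout != 'active'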
You can also use the service_facts module to get the service status.
Example:
- name: Populate service facts
  service_facts:

- name: start nginx if firewalld.service is stopped
  service:
    name: nginx
    state: started
  when: ansible_facts.services['firewalld.service'].state == 'stopped'
Note: I have not tested this; modify it accordingly.
As already proposed by @Telinov, you can gather service facts and start your own service accordingly:
---
- name: Populate service facts
  service_facts:

- name: start nginx if firewalld.service is stopped
  service:
    name: nginx
    state: started
  when: ansible_facts.services['firewalld.service'].state == 'stopped'
Meanwhile, Ansible is about describing state. If you need firewalld stopped and nginx started, it is much easier to just say so:
- name: Make sure firewalld is stopped
  service:
    name: firewalld
    state: stopped

- name: Make sure nginx is started
  service:
    name: nginx
    state: started
You could be even smarter by combining the above and opening the firewall ports for nginx if firewalld is running:
---
- name: Populate service facts
  service_facts:

- name: Make sure ports 80 and 443 are opened if the firewall is running
  firewalld:
    port: "{{ port }}/tcp"
    state: enabled
    permanent: true
    immediate: true
  loop:
    - 80
    - 443
  loop_control:
    loop_var: port
  when: ansible_facts.services['firewalld.service'].state == 'running'

- name: Make sure nginx is started
  service:
    name: nginx
    state: started

How to run plays using Ansible AWX on all hosts in a group using localhost

I'm trying to create a playbook that creates EC2 snapshots of some AWS Windows servers. I'm having a lot of trouble understanding how to get the correct directives in place to make sure things run where they should. What I need to do is:
run the AWS commands locally, i.e. on the AWX host (as this is preferable to having to configure credentials on every server)
run the commands against each host in the inventory
I've done this in the past with a different group of Linux servers with no issue, but the fact that I'm having these issues makes me think it isn't working the way I believe it is (I'm pretty new to all things Ansible/AWX).
The first step is I need to identify instances that are usually turned off and turn them on, then to take snapshots, then to turn them off again if they are usually turned off. So this is my main.yml:
---
- name: start the instance if the default state is stopped
  import_playbook: start-instance.yml
  when: default_state is defined and default_state == 'stopped'

- name: run the snapshot script
  import_playbook: instance-snapshot.yml

- name: stop the instance if the default state is stopped
  import_playbook: stop-instance.yml
  when: default_state is defined and default_state == 'stopped'
And this is start-instance.yml
---
- name: make sure instances are running
  hosts: all
  gather_facts: false
  connection: local
  tasks:
    - name: start the instances
      ec2:
        instance_id: "{{ instance_id }}"
        region: "{{ aws_region }}"
        state: running
        wait: true
      register: ec2

    - name: pause for 120 seconds to allow the instance to start
      pause: seconds=120
When I run this, I get the following error:
fatal: [myhost,mydomain.com]: UNREACHABLE! => {
    "changed": false,
    "msg": "ssl: HTTPSConnectionPool(host='myhost,mydomain.com', port=5986): Max retries exceeded with url: /wsman (Caused by ConnectTimeoutError(<requests.packages.urllib3.connection.VerifiedHTTPSConnection object at 0x7f99045e6bd0>, 'Connection to myhost,mydomain.com timed out. (connect timeout=30)'))",
    "unreachable": true
}
I've learnt enough to know that this means it is, indeed, trying to connect to each Windows host, which is not what I want. However, I thought connection: local resolved this, as it seemed to with another template I have that uses an identical playbook.
If I change start-instance.yml to say "connection: localhost" instead, the play runs, but skips the steps because it determines that no hosts meet the condition (i.e. default_state is defined and default_state == 'stopped'). This tells me the play is using localhost to run the tasks, but it is also running them against localhost instead of against the hosts in the AWX inventory.
My hosts have variables defined in AWX such as instance ID, region, default state. I know in normal Ansible use we would have the instance IDs in the playbook and that's how Ansible would know which AWS instances to start up, but in this case I need it to get this information from AWX.
Is there any way to do this? So, run the tasks (start the instances, pause for 120 seconds) against all hosts in my AWX inventory, using localhost?
In the end I used delegate_to for this. I changed the code to this:
---
- name: make sure instances are running
  hosts: all
  gather_facts: false
  tasks:
    - name: start the instances
      delegate_to: localhost
      ec2:
        instance_id: "{{ instance_id }}"
        region: "{{ aws_region }}"
        state: running
        wait: true
      register: ec2

    - name: pause for 120 seconds to allow the instance to start
      pause: seconds=120
And AWX correctly used localhost to run the tasks I added delegation to.
Worth noting I got stuck for a while before realising my template needed two sets of credentials: one for an IAM user with the correct permissions to run the AWS commands, and one with the machine credentials for the Windows instances.

ansible to restart network service

I copy-pasted this from the manual and it fails in my playbook (version 2.0.2):
- service: name=network state=restarted args=eth0
I am getting this error:
"msg": "Failed to stop eth0.service: Unit eth0.service not loaded.\nFailed to start eth0.service: Unit eth0.service failed to load: No such file or directory.\n"}
What is the correct syntax, please?
Just do it like this (as @nasr already commented):
- name: Restart network
  service:
    name: network
    state: restarted
But if you change the network configuration before the restart, something like the IP address, Ansible hangs after the restart because the connection is lost (the IP address changed).
There is a way to do things right.
tasks.yml
- name: net configuration step 1
  debug:
    msg: we changed some files
  notify: restart systemd-networkd

- name: net configuration step 2
  debug:
    msg: do some more work, but restart net services only once
  notify: restart systemd-networkd
handlers.yml
- name: restart systemd-networkd
  systemd:
    name: systemd-networkd
    state: restarted
  async: 120
  poll: 0
  register: net_restarting

- name: check restart systemd-networkd status
  async_status:
    jid: "{{ net_restarting.ansible_job_id }}"
  register: async_poll_results
  until: async_poll_results.finished
  retries: 30
  listen: restart systemd-networkd
As of Ansible 2.7.8, you have to make the following changes to restart the network.
Note: I tried this on Ubuntu 16.04
Scenario 1: Only network restart
- name: Restarting Network to take effect new IP Address
  become: yes
  service:
    name: networking
    state: restarted
Scenario 2: Flush interface and then restart network
- name: Flushing Interface
  become: yes
  shell: sudo ip addr flush "{{ net_interface }}"

- name: Restarting Network
  become: yes
  service:
    name: networking
    state: restarted
Note: make sure net_interface is defined and imported in the file where you execute this Ansible task.
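For example (a minimal sketch; the file location and interface name are assumptions), net_interface could be defined in a vars file that the play loads:

# group_vars/all.yml (assumed location)
net_interface: eth0    # replace with the actual interface name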
- command: /etc/init.d/network restart
does work wonderfully, but I feel that using command somewhat defeats the purpose of using Ansible.
I'm using Ubuntu 22.04.1 LTS, which uses systemd instead of init.
The following worked fine for me (I tried the solutions mentioned earlier, but none of them worked):
- name: restart network
  systemd:
    name: NetworkManager
    state: restarted

Cannot access machine after creating it through the ec2 module within the same script

I have problems with my playbook, which should create new EC2 instances through the built-in module and connect to them to set some default stuff.
I went through a lot of tutorials/posts, but none of them mentioned the same problem, therefore I'm asking here.
Everything in terms of creation goes well, but once the instances are created and I have successfully waited for SSH to come up, I get an error which says the machine is unreachable.
UNREACHABLE! => {"changed": false, "msg": "ERROR! SSH Error: data could not be sent to the remote host. Make sure this host can be reached over ssh", "unreachable": true}
I tried to connect manually (from a terminal to the same host) and was successful (while the playbook was waiting for the connection). I also tried increasing the timeout globally in ansible.cfg. I verified that the given hostname is valid (it is) and also tried the public IP instead of the public DNS, but nothing helps.
Basically, my playbook looks like this:
---
- name: create ec2 instances
  hosts: local
  connection: local
  gather_facts: False
  vars:
    machines:
      - { type: "t2.micro", instance_tags: { Name: "machine1", group: "some_group" }, security_group: ["SSH"] }
  tasks:
    - name: launch new ec2 instances
      local_action: ec2
        group={{ item.security_group }}
        instance_type={{ item.type }}
        image=...
        wait=true
        region=...
        keypair=...
        count=1
        instance_tags=...
      with_items: machines
      register: ec2

    - name: wait for SSH to come up
      local_action: wait_for host={{ item.instances.0.public_dns_name }} port=22 delay=60 timeout=320 state=started
      with_items: ec2.results

    - name: add host into launched group
      add_host: name={{ item.instances.0.public_ip }} group=launched
      with_items: ec2.results

- name: with the newly provisioned EC2 node configure basic stuff
  hosts: launched
  sudo: yes
  remote_user: ubuntu
  gather_facts: True
  roles:
    - common
Note: in many tutorials the results from creating EC2 instances are accessed in a different way, but that's probably a matter for a different question.
Thanks
Solved:
I don't know how, but it suddenly started to work. No clue. If I find some new info, I will update this question.
A couple points that may help:
I'm guessing it's a version difference, but I've never seen a 'results' key in the registered 'ec2' variable. In any case, I usually use 'tagged_instances' -- this ensures that even if the play didn't create an instance (i.e. because a matching instance already existed from a previous run), the variable will still return instance data you can use to add a new host to the inventory.
Try adding 'search_regex: "OpenSSH"' to your 'wait_for' play to ensure that it's not trying to run before the SSH daemon is completely up.
The modified plays would look like this:
- name: wait for SSH to come up
  local_action: wait_for host={{ item.public_dns_name }} port=22 delay=60 timeout=320 state=started search_regex="OpenSSH"
  with_items: ec2.tagged_instances

- name: add host into launched group
  add_host: name={{ item.public_ip }} group=launched
  with_items: ec2.tagged_instances
You also, of course, want to make sure that Ansible knows to use the specified key when SSH'ing to the remote host either by adding 'ansible_ssh_private_key_file' to the inventory entry or specifying '--private-key=...' on the command line.
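For instance (a hedged sketch; the key path is hypothetical), the private key can be attached as a host variable when the host is added to the in-memory inventory:

- name: add host into launched group with the SSH key
  add_host:
    name: "{{ item.public_ip }}"
    groups: launched
    ansible_ssh_private_key_file: /path/to/keypair.pem   # hypothetical path to the launch key
  with_items: ec2.tagged_instances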

Ansible AWS EC2 Detecting Server is Running Fails

Background:
I'm just trying to learn how to use Ansible and have been experimenting with the AWS ec2 module to build and deploy an Ubuntu instance on AWS EC2. I have built a simple playbook to create and start up an instance, executed via ansible-playbook -vvvv ic.yml.
The playbook is:
---
- name: Create a ubuntu instance on AWS
  hosts: localhost
  connection: local
  gather_facts: False
  vars:
    # AWS keys for access to the API
    ec2_access_key: 'secret-key'
    ec2_secret_key: 'secret-key'
    region: ap-southeast-2
  tasks:
    - name: Create a Key-Pair necessary for connection to the remote EC2 host
      ec2_key:
        name=ic-key region="{{ region }}"
      register: keypair

    - name: Write the Key-Pair to a file for re-use
      copy:
        dest: files/ic-key.pem
        content: "{{ keypair.key.private_key }}"
        mode: 0600
      when: keypair.changed

    - name: start the instance
      ec2:
        ec2_access_key: "{{ ec2_access_key }}"
        ec2_secret_key: "{{ ec2_secret_key }}"
        region: ap-southeast-2
        instance_type: t2.micro
        image: ami-69631053
        key_name: ic-key # key we just created
        instance_tags: { Name: icomplain-prod, type: web, env: production } # key-value pairs for naming etc.
        wait: yes
      register: ec2

    - name: Wait for instance to start up and be running
      wait_for: host = {{item.public_dns_name}} port 22 delay=60 timeout=320 state=started
      with_items: ec2.instances
Problem:
The issue is that when attempting to wait for the instance to fire up, using the wait_for test as described in the examples for the EC2 module, it fails with the following error message:
msg: this module requires key=value arguments (['host', '=', 'ec2-52-64-134-61.ap-southeast-2.compute.amazonaws.com', 'port', '22', 'delay=60', 'timeout=320', 'state=started'])
FATAL: all hosts have already failed -- aborting
Output:
Although the error message appears on the command line, when I check in the AWS console the key pair and EC2 instance are created and running.
Query:
I'm wondering:
Is there some other parameter that I need?
What is causing the 'key=value' error message in the output?
Any recommendations on other ways to debug the script to determine the cause of the failure?
Does it require registration of the host somewhere in the Ansible world?
Additional NOTES:
Testing the playbook, I've observed that the key pair gets created and the server startup is initiated at AWS, as seen from the AWS web console. What appears to be the issue is that the server takes too long to spin up and the script times out or fails. Frustratingly, the error message is not all that helpful, and I'm also wondering if there are any other methods of debugging an Ansible script.
This isn't a problem of "detecting the server is running". As the error message says, it's a problem with syntax.
# bad
wait_for: host = {{item.public_dns_name}} port 22 delay=60 timeout=320 state=started
# good
wait_for: host={{item.public_dns_name}} port=22 delay=60 timeout=320 state=started
Additionally, you'll want to run this from the central machine, not the remote (new) server.
local_action: wait_for host={{item.public_dns_name}} port=22 delay=60 timeout=320 state=started
Focusing on the wait_for test, as you indicate that the rest is working.
Based on the jobs I have running, I would think the issue is with the host name, not with the rest of the code. I use an Ansible server in a protected VPC that has network access to the VPC where the servers start up, and my wait_for code looks like this (variable name updated to match yours):
- name: wait for instances to listen on port 22
  wait_for:
    delay: 10
    state: started
    host: "{{ item.private_ip }}"
    port: 22
    timeout: 300
  with_items: ec2.instances
Trying to use DNS instead of an IP address has always proven to be unreliable for me - if I'm registering DNS as part of a job, it can sometimes take a minute to be resolvable (sometimes instant, sometimes not). Using the IP addresses works every time of course - as long as the networking is set up correctly.
If your Ansible server is in a different region or has to use the external IP to access the new servers, you will of course need to have the relevant security groups and add the new server(s) to those before you can use wait_for.
