Ansible: Trigger the task only when previous task is successful and the output is created

I am deploying a VM in Azure using Ansible and then using the public IP it creates in subsequent tasks. But creating the public IP takes too long, so the subsequent task fails when it runs. The time to create the IP also varies; it isn't fixed. I want to introduce some logic so that the next task only runs once the IP has been created.
- name: Deploy Master Node
  azure_rm_virtualmachine:
    resource_group: myResourceGroup
    name: testvm10
    admin_username: chouseknecht
    admin_password: <your password here>
    image:
      offer: CentOS-CI
      publisher: OpenLogic
      sku: '7-CI'
      version: latest
Can someone assist me here? It would be greatly appreciated.

I think the wait_for module is a bad choice because while it can test for port availability it will often give you false positives because the port is open before the service is actually ready to accept connections.
Fortunately, the wait_for_connection module was designed for exactly the situation you are describing: it will wait until Ansible is able to successfully connect to your target.
This generally requires that you register your Azure VM with your Ansible inventory (e.g. using the add_host module). I don't use Azure, but if I were doing this with OpenStack I might write something like this:
- hosts: localhost
  gather_facts: false
  tasks:
    # This is the task that creates the vm, much like your existing task
    - os_server:
        name: larstest
        cloud: kaizen-lars
        image: 669142a3-fbda-4a83-bce8-e09371660e2c
        key_name: default
        flavor: m1.small
        security_groups: allow_ssh
        nics:
          - net-name: test_net0
        auto_ip: true
      register: myserver

    # Now we take the public ip from the previous task and use it
    # to create a new inventory entry for a host named "myserver".
    - add_host:
        name: myserver
        ansible_host: "{{ myserver.openstack.accessIPv4 }}"
        ansible_user: centos

# Now we wait for the host to finish booting. We need gather_facts: false here
# because otherwise Ansible will attempt to run the `setup` module on the target,
# which will fail if the host isn't ready yet.
- hosts: myserver
  gather_facts: false
  tasks:
    - wait_for_connection:
        delay: 10

# We could add additional tasks to the previous play, but we can also start a
# new play with implicit fact gathering.
- hosts: myserver
  tasks:
    - ...other tasks here...
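The same shape should carry over to Azure. The following is only a sketch: the azure_rm_publicipaddress_info module and its publicipaddresses / ip_address return keys are assumptions to verify against your installed Azure collection, and the public IP resource name is hypothetical (check what azure_rm_virtualmachine actually created in your resource group):

```yaml
- hosts: localhost
  gather_facts: false
  tasks:
    - name: Deploy Master Node
      azure_rm_virtualmachine:
        resource_group: myResourceGroup
        name: testvm10
        admin_username: chouseknecht
        admin_password: <your password here>
        image:
          offer: CentOS-CI
          publisher: OpenLogic
          sku: '7-CI'
          version: latest

    # Hypothetical lookup: fetch the public IP the deployment created.
    # The resource name below is a guess - inspect your resource group.
    - azure_rm_publicipaddress_info:
        resource_group: myResourceGroup
        name: testvm10-pip
      register: pip

    - add_host:
        name: master
        ansible_host: "{{ pip.publicipaddresses[0].ip_address }}"
        ansible_user: chouseknecht

# wait_for_connection blocks until Ansible can actually log in, which
# implicitly waits for the public IP to exist and become reachable.
- hosts: master
  gather_facts: false
  tasks:
    - wait_for_connection:
        delay: 10
```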

Related

How to wait for ssh to become available on a host before installing a role?

Is there a way to wait for ssh to become available on a host before installing a role? There's wait_for_connection, but I only figured out how to use it with tasks.
This particular playbook spins up servers on a cloud provider before attempting to install roles, but it fails since the ssh service on the hosts isn't available yet.
How should I fix this?
---
- hosts: localhost
  connection: local
  tasks:
    - name: Deploy vultr servers
      include_tasks: create_vultr_server.yml
      loop: "{{ groups['vultr_servers'] }}"

- hosts: all
  gather_facts: no
  become: true
  tasks:
    - name: wait_for_connection # This one works
      wait_for_connection:
        delay: 5
        timeout: 600
    - name: Gather facts for first time
      setup:
    - name: Install curl
      package:
        name: "curl"
        state: present
  roles: # How to NOT install roles UNLESS the current host is available ?
    - role: apache2
      vars:
        doc_root: /var/www/example
        message: 'Hello world!'
    - common-tools
Ansible play actions start with pre_tasks, then roles, followed by tasks and finally post_tasks. Move your wait_for_connection task to be the first of the pre_tasks and it will block everything until a connection is available:
- hosts: all
  gather_facts: no
  become: true
  pre_tasks:
    - name: wait_for_connection # This one works
      wait_for_connection:
        delay: 5
        timeout: 600
  roles: ...
  tasks: ...
For more info on execution order, see this section of the roles documentation (the paragraph just above the notes).
Note: you probably want to move all your current example tasks into that section too, so that facts are gathered and curl is installed before anything else runs.
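Putting that note into practice, the whole play could be arranged like this (the same modules as in the question, just reordered into pre_tasks so they run before the roles):

```yaml
- hosts: all
  gather_facts: no
  become: true
  pre_tasks:
    - name: wait_for_connection
      wait_for_connection:
        delay: 5
        timeout: 600
    - name: Gather facts for first time
      setup:
    - name: Install curl
      package:
        name: curl
        state: present
  roles:
    - role: apache2
      vars:
        doc_root: /var/www/example
        message: 'Hello world!'
    - common-tools
```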

How to run plays using Ansible AWX on all hosts in a group using localhost

I'm trying to create a playbook that creates EC2 snapshots of some AWS Windows servers. I'm having a lot of trouble understanding how to get the correct directives in place to make sure things run where they should. What I need to do is:
run the AWS commands locally, i.e. on the AWX host (as this is preferable to having to configure credentials on every server)
run the commands against each host in the inventory
I've done this in the past with a different group of Linux servers with no issue. But the fact that I'm having these issues makes me think that's not working as I think it is (I'm pretty new to all things Ansible/AWX).
The first step is I need to identify instances that are usually turned off and turn them on, then to take snapshots, then to turn them off again if they are usually turned off. So this is my main.yml:
---
- name: start the instance if the default state is stopped
  import_playbook: start-instance.yml
  when: default_state is defined and default_state == 'stopped'

- name: run the snapshot script
  import_playbook: instance-snapshot.yml

- name: stop the instance if the default state is stopped
  import_playbook: stop-instance.yml
  when: default_state is defined and default_state == 'stopped'
And this is start-instance.yml
---
- name: make sure instances are running
  hosts: all
  gather_facts: false
  connection: local
  tasks:
    - name: start the instances
      ec2:
        instance_id: "{{ instance_id }}"
        region: "{{ aws_region }}"
        state: running
        wait: true
      register: ec2
    - name: pause for 120 seconds to allow the instance to start
      pause: seconds=120
When I run this, I get the following error:
fatal: [myhost,mydomain.com]: UNREACHABLE! => {
    "changed": false,
    "msg": "ssl: HTTPSConnectionPool(host='myhost,mydomain.com', port=5986): Max retries exceeded with url: /wsman (Caused by ConnectTimeoutError(<requests.packages.urllib3.connection.VerifiedHTTPSConnection object at 0x7f99045e6bd0>, 'Connection to myhost,mydomain.com timed out. (connect timeout=30)'))",
    "unreachable": true
}
I've learnt enough to know that this means it is, indeed, trying to connect to each Windows host, which is not what I want. However, I thought connection: local resolved this, as it seemed to with another template I have which uses an identical playbook.
If I change start-instance.yml to say connection: localhost instead, the play runs, but it skips the steps because it determines that no hosts meet the condition (i.e. default_state is defined and default_state == 'stopped'). This tells me the play is using localhost to run the tasks, but it is also running them against localhost instead of against the hosts in the AWX inventory.
My hosts have variables defined in AWX such as instance ID, region, default state. I know in normal Ansible use we would have the instance IDs in the playbook and that's how Ansible would know which AWS instances to start up, but in this case I need it to get this information from AWX.
Is there any way to do this? So, run the tasks (start the instances, pause for 120 seconds) against all hosts in my AWX inventory, using localhost?
In the end I used delegate_to for this. I changed the code to this:
---
- name: make sure instances are running
  hosts: all
  gather_facts: false
  tasks:
    - name: start the instances
      delegate_to: localhost
      ec2:
        instance_id: "{{ instance_id }}"
        region: "{{ aws_region }}"
        state: running
        wait: true
      register: ec2
    - name: pause for 120 seconds to allow the instance to start
      pause: seconds=120
And AWX correctly used localhost to run the tasks I added delegation to.
Worth noting I got stuck for a while before realising my template needed two sets of credentials - one IAM user with correct permissions to run the AWS commands, and one for the machine credentials for the Windows instances.

Ansible AWS EC2 Detecting Server is Running Fails

Background:
Just trying to learn how to use Ansible, I have been experimenting with the AWS EC2 module to build and deploy an Ubuntu instance on AWS EC2. I have built a simple playbook to create and start an instance, executed via ansible-playbook -vvvv ic.yml
The playbook is:
---
- name: Create a ubuntu instance on AWS
  hosts: localhost
  connection: local
  gather_facts: False
  vars:
    # AWS keys for access to the API
    ec2_access_key: 'secret-key'
    ec2_secret_key: 'secret-key'
    region: ap-southeast-2
  tasks:
    - name: Create a Key-Pair necessary for connection to the remote EC2 host
      ec2_key: name=ic-key region="{{region}}"
      register: keypair
    - name: Write the Key-Pair to a file for re-use
      copy:
        dest: files/ic-key.pem
        content: "{{ keypair.key.private_key }}"
        mode: 0600
      when: keypair.changed
    - name: start the instance
      ec2:
        ec2_access_key: "{{ec2_access_key}}"
        ec2_secret_key: "{{ec2_secret_key}}"
        region: ap-southeast-2
        instance_type: t2.micro
        image: ami-69631053
        key_name: ic-key # key we just created
        instance_tags: {Name: icomplain-prod, type: web, env: production} # key-value pairs for naming etc
        wait: yes
      register: ec2
    - name: Wait for instance to start up and be running
      wait_for: host = {{item.public_dns_name}} port 22 delay=60 timeout=320 state=started
      with_items: ec2.instances
Problem:
The issue is that when attempting to wait for the instance to fire up using the wait_for test, as described in the examples for the EC2 module, it fails with the following error message:
msg: this module requires key=value arguments (['host', '=', 'ec2-52-64-134-61.ap-southeast-2.compute.amazonaws.com', 'port', '22', 'delay=60', 'timeout=320', 'state=started'])
FATAL: all hosts have already failed -- aborting
Output:
Although the error message appears on the command line when I check in the AWS-Console the Key-Pair and EC2 instance are created and running.
Query:
Wondering:
- Is there some other parameter which I need?
- What is causing the 'key=value' error message in the output?
- Any recommendations on other ways to debug the script to determine the cause of the failure?
- Does it require registration of the host somewhere in the Ansible world?
Additional NOTES:
Testing the playbook I've observed that the key-pair gets created and the server startup is initiated at AWS, as seen from the AWS web console. What appears to be the issue is that the server takes too long to spin up and the script times out or fails. Frustratingly, the error message is not all that helpful, and I am also wondering if there are any other methods of debugging an Ansible script.
This isn't a problem of "detecting the server is running". As the error message says, it's a problem with syntax.
# bad
wait_for: host = {{item.public_dns_name}} port 22 delay=60 timeout=320 state=started
# good
wait_for: host={{item.public_dns_name}} port=22 delay=60 timeout=320 state=started
Additionally, you'll want to run this from the central machine, not the remote (new) server.
local_action: wait_for host={{item.public_dns_name}} port=22 delay=60 timeout=320 state=started
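To sidestep key=value parsing mistakes entirely, the same task can be written in YAML dictionary form (equivalent to the corrected line above, with the local execution expressed via delegate_to):

```yaml
- name: Wait for SSH on each new instance
  wait_for:
    host: "{{ item.public_dns_name }}"
    port: 22
    delay: 60
    timeout: 320
    state: started
  delegate_to: localhost
  with_items: "{{ ec2.instances }}"
```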
Focusing on the wait_for test as you indicate that the rest is working.
Based on the jobs I have running I would think the issue is with the host name, not with the rest of the code. I use an Ansible server in a protected VPC that has network access to the VPC where the servers start up in, and my wait_for code looks like this (variable name updated to match yours):
- name: wait for instances to listen on port 22
  wait_for:
    delay: 10
    state: started
    host: "{{ item.private_ip }}"
    port: 22
    timeout: 300
  with_items: ec2.instances
Trying to use DNS instead of an IP address has always proven to be unreliable for me - if I'm registering DNS as part of a job, it can sometimes take a minute to be resolvable (sometimes instant, sometimes not). Using the IP addresses works every time of course - as long as the networking is set up correctly.
If your Ansible server is in a different region or has to use the external IP to access the new servers, you will of course need to have the relevant security groups and add the new server(s) to those before you can use wait_for.

Ansible - actions BEFORE gathering facts

Does anyone know how to do something (like wait for port / boot of the managed node) BEFORE gathering facts? I know I can turn gathering facts off
gather_facts: no
and THEN wait for the port, but what if I need the facts while also still needing to wait until the node boots up?
Gathering facts is equivalent to running the setup module, so you can gather facts manually by simply adding a task like this:
- name: Gathering facts
  setup:
In combination with gather_facts: no on playbook level the facts will only be fetched when above task is executed.
Both in an example playbook:
- hosts: all
  gather_facts: no
  tasks:
    - name: Some task executed before gathering facts
      # whatever task you want to run
    - name: Gathering facts
      setup:
Something like this should work:
- hosts: my_hosts
  gather_facts: no
  tasks:
    - name: wait for SSH to respond on all hosts
      # without an explicit host, wait_for checks 127.0.0.1
      local_action: wait_for port=22 host="{{ inventory_hostname }}"
    - name: gather facts
      setup:
    - continue with my tasks...
The wait_for will execute locally on your Ansible host, waiting for the servers to respond on port 22; then the setup module will perform fact gathering, after which you can do whatever else you need to do.
I was trying to figure out how to provision a host from ec2, wait for ssh to come up, and then run my playbook against it. Which is basically the same use case as you have. I ended up with the following:
- name: Provision App Server from Amazon
  hosts: localhost
  gather_facts: False
  tasks:
    # #### call ec2 provisioning tasks here ####
    - name: Add new instance to host group
      add_host: hostname="{{ item.private_ip }}" groupname="appServer"
      with_items: ec2.instances

- name: Configure App Server
  hosts: appServer
  remote_user: ubuntu
  gather_facts: True
  tasks: ----configuration tasks here----
I think the Ansible terminology is that I have two plays in a playbook, each operating on a different group of hosts (localhost, and the appServer group).

Ansible ec2 only provision required servers

I've got a basic Ansible playbook like so:
---
- name: Provision ec2 servers
  hosts: 127.0.0.1
  connection: local
  roles:
    - aws

- name: Configure {{ application_name }} servers
  hosts: webservers
  sudo: yes
  sudo_user: root
  remote_user: ubuntu
  vars:
    - setup_git_repo: no
    - update_apt_cache: yes
  vars_files:
    - env_vars/common.yml
    - env_vars/remote.yml
  roles:
    - common
    - db
    - memcached
    - web
with the following inventory:
[localhost]
127.0.0.1 ansible_python_interpreter=/usr/local/bin/python
The Provision ec2 servers task does what you'd expect. It creates an ec2 instance; it also creates a host group [webservers] and adds the created instance IP to it.
The Configure {{ application_name }} servers step then configures that server, installing everything I need.
So far so good, this all does exactly what I want and everything seems to work.
Here's where I'm stuck. I want to be able to fire up an ec2 instance for different roles. Ideally I'd create a dbserver, a webserver and maybe a memcached server. I'd like to be able to deploy any part(s) of this infrastructure in isolation, e.g. create and provision just the db servers
The only ways I can think of to make this work... well, they don't work.
I tried simply declaring the host groups without hosts in the inventory:
[webservers]
[dbservers]
[memcachedservers]
but that's a syntax error.
I would be okay with explicitly provisioning each server and declaring the host group it is for, like so:
- name: Provision webservers
  hosts: webservers
  connection: local
  roles:
    - aws

- name: Provision dbservers
  hosts: dbservers
  connection: local
  roles:
    - aws

- name: Provision memcachedservers
  hosts: memcachedservers
  connection: local
  roles:
    - aws
but those groups don't exist until after the respective step is complete, so I don't think that will work either.
I've seen lots about dynamic inventories, but I haven't been able to understand how that would help me. I've also looked through countless examples of ansible ec2 provisioning projects; they are all invariably either provisioning pre-existing ec2 instances, or just creating a single instance and installing everything on it.
In the end I realised it made much more sense to just separate the different parts of the stack into separate playbooks, with a full-stack playbook that called each of them.
My remote hosts file stayed largely the same as above. An example of one of the playbooks for a specific part of the stack is:
---
- name: Provision ec2 apiservers
  hosts: apiservers        # important bit
  connection: local        # important bit
  vars:
    - host_group: apiservers
    - security_group: blah
  roles:
    - aws

- name: Configure {{ application_name }} apiservers
  hosts: apiservers:!127.0.0.1   # important bit
  sudo: yes
  sudo_user: root
  remote_user: ubuntu
  vars_files:
    - env_vars/common.yml
    - env_vars/remote.yml
  vars:
    - setup_git_repo: no
    - update_apt_cache: yes
  roles:
    - common
    - db
    - memcached
    - web
This means that the first play of each layer adds a new host to the apiservers group, with the second play (Configure ... apiservers) then being able to exclude localhost without getting a 'no hosts matched' error.
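For completeness, the group-building task inside the aws role presumably looks something like this (a sketch; the exact variable names are assumptions, with host_group being the value each provisioning play passes in):

```yaml
- name: Add the new instance to its layer's host group
  add_host:
    hostname: "{{ item.public_ip }}"
    groupname: "{{ host_group }}"   # e.g. apiservers
  with_items: "{{ ec2.instances }}"
```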
The wrapping playbook is dead simple, just:
---
- name: Deploy all the {{ application_name }} things!
  hosts: all

- include: webservers.yml
- include: apiservers.yml
I'm very much a beginner w/regards to ansible, so please do take this for what it is, some guy's attempt to find something that works. There may be better options and this could violate best practice all over the place.
The ec2 module supports an "exact_count" property, not just a "count" property.
It will create (or terminate!) instances until the number matching the specified tags equals exact_count.
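As a sketch, exact_count pairs with count_tag, which tells the module which tag(s) to count existing instances by (parameter names are from the classic ec2 module; verify them against your Ansible version):

```yaml
- name: Ensure exactly two webservers exist
  ec2:
    region: ap-southeast-2
    image: ami-69631053        # AMI id reused from the earlier example
    instance_type: t2.micro
    exact_count: 2
    count_tag:
      role: webserver          # count instances carrying this tag
    instance_tags:
      role: webserver          # tags applied to any instances it creates
    wait: yes
  register: ec2
```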
