Make Ansible wait for server to start, without logging in

When I provision a new server, there is a lag between the time I create it and the time it becomes available, so I need to wait until it's ready.
I assumed that was the purpose of the wait_for task:
hosts:
[servers]
42.42.42.42
playbook.yml:
---
- hosts: all
  gather_facts: no
  tasks:
    - name: wait until server is up
      wait_for: port=22
This fails with Permission denied. I assume that's because nothing is set up yet.
I expected it to open an SSH connection and wait for the prompt, just to see if the server is up. But what actually happens is that it tries to log in.
Is there some other way to perform a wait that doesn't try to login?

As you correctly stated, this task executes on the host that is to be provisioned, so Ansible first tries to connect to it (via SSH) and only then would wait for the port to be up. This would work for other ports/services, but not for port 22 on that host, since a working connection on port 22 is a prerequisite for executing any task on it.
What you could do is delegate this task to the Ansible control host (the machine you run the playbook from) with delegate_to, and add the host parameter to the wait_for task.
Example:
- name: wait until server is up
  wait_for:
    port: 22
    host: <the IP of the host you are trying to provision>
  delegate_to: localhost
hope it helps
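Put into a complete play, the delegated check might look like the sketch below. It assumes the inventory from the question and a control machine that can reach the new server on port 22. The delay and timeout values and the OpenSSH banner check are illustrative additions, not part of the answer above. gather_facts is disabled because gathering facts would itself need a working SSH login.

```yaml
---
- hosts: all
  gather_facts: no              # fact gathering would require an SSH login
  tasks:
    - name: wait until sshd answers on port 22
      wait_for:
        host: "{{ ansible_host | default(inventory_hostname) }}"
        port: 22
        search_regex: OpenSSH   # assumption: the server runs OpenSSH
        delay: 5                # illustrative value
        timeout: 300            # illustrative value
      delegate_to: localhost
```

With this arrangement the check is a plain TCP connection from the control machine to the target, so no login on the target is attempted at this stage.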

Q: "Is there some other way to perform a wait that doesn't try to login?"
A: It is possible to wait_for_connection. For example
- hosts: all
  gather_facts: no
  tasks:
    - name: wait until server is up
      wait_for_connection:
        delay: 60
        timeout: 300

Related

Ansible wait_for_connection until the hosts are ready for ansible?

I am using Ansible to configure some VMs.
The problem I am facing is that I can't run Ansible commands right after the VMs have started; I get a connection timeout error. This happens when I run Ansible right after the VMs are spun up in GCP.
The commands work fine when I run the playbook 60 seconds later, but I am looking for a way to do this automatically, without manually waiting 60 seconds, so that I can run the playbook right after the VMs are spun up and Ansible will wait until they are ready. I don't want to add a fixed delay to the Ansible tasks either.
I am looking for a dynamic approach where Ansible tries to run the playbook and, when the connection fails, does not report an error but waits until the VMs are ready.
I used this, but it still doesn't work (it fails):
---
- hosts: all
  tasks:
    - name: Wait for connection
      wait_for_connection: # but this still fails, am I doing this wrong?
    - name: Ping all hosts for connectivity check
      ping:
Can someone please help me?

I had the same issue on my side.
I fixed this with a wait_for task.
The basic approach is to wait for the SSH connection like this:
- name: Wait 300 seconds for port 22 to become open and contain "OpenSSH"
  wait_for:
    port: 22
    host: '{{ (ansible_ssh_host|default(ansible_host))|default(inventory_hostname) }}'
    search_regex: OpenSSH
    delay: 10
  connection: local
If your VM launches an application or service, you can also watch its log file on the VM for the line that shows the application has started, for example (here for a Nexus container):
- name: Wait until the container is started and running
  become: yes
  become_user: "{{ ansible_nexus_user }}"
  wait_for:
    path: "{{ ansible_nexus_directory_data }}/log/nexus.log"
    search_regex: ".*Started Sonatype Nexus.*"

I believe what you are looking for is to postpone gather_facts until the server is up, as it will otherwise time out as you experienced. Your file could work as follows:
---
- hosts: all
  gather_facts: no
  tasks:
    - name: Wait for connection (600s default)
      ansible.builtin.wait_for_connection:
    - name: Gather facts manually
      ansible.builtin.setup:
I have these under pre_tasks instead of tasks, but it should probably work if they are first in your file.
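A minimal sketch of the pre_tasks arrangement mentioned above. The delay and timeout values are illustrative assumptions, and the ping task only stands in for the real work of the play.

```yaml
---
- hosts: all
  gather_facts: no                 # postpone facts until the host is reachable
  pre_tasks:
    - name: Wait for connection
      ansible.builtin.wait_for_connection:
        delay: 10                  # illustrative: wait 10s before the first attempt
        timeout: 600               # same as the module default
    - name: Gather facts once the host is reachable
      ansible.builtin.setup:
  tasks:
    - name: Ping all hosts for connectivity check
      ansible.builtin.ping:
```

pre_tasks run before roles and tasks, so placing the wait there keeps it ahead of everything else in the play even if roles are added later.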

How to wait for tomcat server to start up in Ansible playbook

It may be a weird question; I have tried searching but couldn't find what I am looking for.
I have the playbook below, which starts Tomcat and checks its status. How do I add a condition to wait for it to come online before my playbook finishes? I have looked into the wait_for module but have not been able to work it into the playbook.
---
- name: Starting tomcat service on remote host
  shell: "svcadm enable tomcat"
  ignore_errors: true
- pause:
    seconds: 10
- name: Check the State of tomcat service on the remote host
  shell: "svcs tomcat"
  register: tomcat_status
- set_fact:
    tomcat_state: "{{ tomcat_status.stdout_lines.1.split().0 }}"
The value of tomcat_state should be "online" which tells us that tomcat is up.
Is there anything we can do here, or maybe some other way? I would appreciate any input.

- name: wait for completion
  wait_for:
    port: <your_port>
    delay: 60
    timeout: 500
This way, the task waits 60 seconds, then checks whether something is listening on the port you specified, and keeps trying until the 500-second timeout expires.
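If the goal is to wait on the service state that svcs reports rather than on a port, one possible alternative is to poll the command from the question with until/retries/delay. This is only a sketch: the retry count and delay are illustrative assumptions, and the substring test assumes "online" appears in the output only as the service state.

```yaml
- name: Wait until svcs reports tomcat as online
  ansible.builtin.command: svcs tomcat
  register: tomcat_status
  until: "'online' in tomcat_status.stdout"
  retries: 30          # illustrative: up to 30 attempts
  delay: 10            # illustrative: 10 seconds between attempts
  changed_when: false  # a status check never changes the host
```

This would replace the fixed pause and the set_fact step: the task succeeds as soon as the state is online and fails once the retries are used up.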

validating log file before continuing the playbook execution

I want to look for a particular sentence, '*****--Finished Initialization**', in the log before starting the application on the next host. The tail command will never stop printing data, since the application is used by some process immediately after it starts and data keeps being logged.
How can I validate that before restarting the application on the second host?
For now, I have skipped the tail command and used a timeout of 3 minutes.
- name: playbook to restart the application on hosts
  hosts: host1, host2
  tags: ddapp
  connection: ssh
  gather_facts: no
  tasks:
    - name: start app and validate before proceeding
      shell: |
        sudo systemctl start tomcat#application
        #tail -f application_log.txt
        wait_for: timeout=180
        #other shell commands
      args:
        chdir: /path/to/files/directory

Use the wait_for module:
- name: playbook to restart the application on hosts
  hosts: host1, host2
  tags: ddapp
  connection: ssh
  gather_facts: no
  tasks:
    - name: start app
      become: yes
      service:
        name: tomcat#application
        state: started
    - name: validate before proceeding
      wait_for:
        path: /path/to/files/directory/application_log.txt
        search_regex: Finished Initialization
Note: if the log is not cleared between app restarts and contains multiple "Finished Initialization" strings, refer to this question.

You have to use the wait_for module to look for a particular string with a regex:
- name: start app
  service:
    state: started
    name: tomcat#application
  become: true
- name: Wait for application to be ready
  wait_for:
    search_regex: '\*\*\*\*\*--Finished Initialization\*\*'
    path: /you/path/to/application_log.txt
wait_for can also be used to detect the appearance of a file (such as a PID file) or whether a network port is open (or not).
Also, always prefer a native module over a shell script to handle a deployment or action. That is why I replaced shell with the service module.
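The question asks for host1 to be validated before the application is started on host2. By default Ansible runs each task on every host in the play before moving on to the next task, so one way to get that ordering is the serial play keyword. The sketch below is an assumption layered on the answers above, not part of them; the 180-second timeout simply mirrors the 3-minute wait from the question.

```yaml
- name: restart the application one host at a time
  hosts: host1, host2
  serial: 1            # run the whole play on one host before starting the next
  gather_facts: no
  tasks:
    - name: start app
      become: yes
      service:
        name: tomcat#application
        state: started
    - name: validate before proceeding to the next host
      wait_for:
        path: /path/to/files/directory/application_log.txt
        search_regex: Finished Initialization
        timeout: 180   # mirrors the 3-minute wait used in the question
```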

How to write an Ansible playbook with port knocking

My server is set up to require port knocking in order to white-list an IP for port 22 SSH. I've found guides on setting up an Ansible playbook to configure port knocking on the server side, but not to perform port knocking on the client side.
For example, what would my playbook and/or inventory files look like if I need to knock port 9999, 9000, then connect to port 22 in order to run my Ansible tasks?

You can try out my ssh_pkn connection plugin.
# Example host definition:
# [pkn]
# myserver ansible_host=my.server.at.example.com
# [pkn:vars]
# ansible_connection=ssh_pkn
# knock_ports=[8000,9000]
# knock_delay=2

I used https://stackoverflow.com/a/42647902/10191134 until it broke on an Ansible update, so I searched for another solution and finally stumbled over wait_for:
hosts:
[myserver]
knock_ports=[123,333,444]
play:
- name: Port knocking
  wait_for:
    port: "{{ item }}"
    delay: 0
    connect_timeout: 1
    state: stopped
    host: "{{ inventory_hostname }}"
  connection: local
  become: no
  with_items: "{{ knock_ports }}"
  when: knock_ports is defined
Of course, this can be adjusted to make the delay and/or timeout configurable in the hosts file as well.
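A sketch of that adjustment, assuming two hypothetical inventory variables, knock_pre_delay and knock_connect_timeout, which fall back to the values used above when they are not set. It also uses loop with an empty-list default, so hosts without knock_ports skip the knock, and disables fact gathering because SSH is not reachable until the knock has happened.

```yaml
- hosts: all
  gather_facts: no     # SSH is not reachable until the knock is done
  tasks:
    - name: Port knocking
      wait_for:
        host: "{{ ansible_host | default(inventory_hostname) }}"
        port: "{{ item }}"
        # knock_pre_delay and knock_connect_timeout are hypothetical variable names
        delay: "{{ knock_pre_delay | default(0) }}"
        connect_timeout: "{{ knock_connect_timeout | default(1) }}"
        state: stopped
      delegate_to: localhost
      become: no
      loop: "{{ knock_ports | default([]) }}"
```

The variables could then be set per host or per group in the inventory, next to knock_ports.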
Here's a brute-force example. The timeouts will be hit, so this'll add 2 seconds per host to a play.
- hosts: all
  connection: local
  tasks:
    - uri:
        url: "http://{{ansible_host}}:9999"
        timeout: 1
      ignore_errors: yes
    - uri:
        url: "http://{{ansible_host}}:9000"
        timeout: 1
      ignore_errors: yes
- hosts: all
  # your normal plays here
Other ways: use telnet, put a wrapper around Ansible (though it isn't recommended in Ansible2), make a role and then include with meta, write a custom module (and pull that back into Ansible itself).

Ansible AWS EC2 Detecting Server is Running Fails

Background:
I am just learning how to use Ansible and have been experimenting with the AWS EC2 module to build and deploy an Ubuntu instance on AWS EC2. I built a simple playbook to create and start up an instance, and ran it via ansible-playbook -vvvv ic.yml
The playbook is:
---
- name: Create a ubuntu instance on AWS
  hosts: localhost
  connection: local
  gather_facts: False
  vars:
    # AWS keys for access to the API
    ec2_access_key: 'secret-key'
    ec2_secret_key: 'secret-key'
    region: ap-southeast-2
  tasks:
    - name: Create a Key-Pair necessary for connection to the remote EC2 host
      ec2_key:
        name=ic-key region="{{region}}"
      register: keypair
    - name: Write the Key-Pair to a file for re-use
      copy:
        dest: files/ic-key.pem
        content: "{{ keypair.key.private_key }}"
        mode: 0600
      when: keypair.changed
    - name: start the instance
      ec2:
        ec2_access_key: "{{ec2_access_key}}"
        ec2_secret_key: "{{ec2_secret_key}}"
        region: ap-southeast-2
        instance_type: t2.micro
        image: ami-69631053
        key_name: ic-key # key we just created
        instance_tags: {Name: icomplain-prod, type: web, env: production} #key-values pairs for naming etc
        wait: yes
      register: ec2
    - name: Wait for instance to start up and be running
      wait_for: host = {{item.public_dns_name}} port 22 delay=60 timeout=320 state=started
      with_items: ec2.instances
Problem:
The issue is that when the playbook waits for the instance to start up, using the wait_for test as described in the examples for the EC2 module, it fails with the following error message:
msg: this module requires key=value arguments (['host', '=', 'ec2-52-64-134-61.ap-southeast-2.compute.amazonaws.com', 'port', '22', 'delay=60', 'timeout=320', 'state=started'])
FATAL: all hosts have already failed -- aborting
Output:
Although the error message appears on the command line when I check in the AWS-Console the Key-Pair and EC2 instance are created and running.
Query:
I am wondering:
Is there some other parameter that I need?
What is causing the 'key=value' message in the error output?
Are there any recommendations on other ways to debug the script to determine the cause of the failure?
Does it require registration of the host somewhere in the Ansible world?
Additional NOTES:
Testing the playbook, I've observed that the key pair gets created and the server startup is initiated at AWS, as seen from the AWS web console. What appears to be the issue is that the server takes too long to spin up and the script times out or fails. Frustratingly, the error message is not all that helpful, and I am also wondering whether there are any other methods of debugging an Ansible script.

This isn't a problem of "detecting the server is running". As the error message says, it's a problem with syntax.
# bad
wait_for: host = {{item.public_dns_name}} port 22 delay=60 timeout=320 state=started
# good
wait_for: host={{item.public_dns_name}} port=22 delay=60 timeout=320 state=started
Additionally, you'll want to run this from the central machine, not the remote (new) server.
local_action: wait_for host={{item.public_dns_name}} port=22 delay=60 timeout=320 state=started
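The same task can also be written with YAML dictionary arguments, which avoids the key=value spacing pitfall altogether. This is a sketch that assumes the ec2 variable registered in the question's playbook; it uses delegate_to: localhost as the equivalent of local_action, and loop with a templated list in place of the bare with_items variable.

```yaml
- name: Wait for instance to start up and be running
  wait_for:
    host: "{{ item.public_dns_name }}"
    port: 22
    delay: 60
    timeout: 320
    state: started
  delegate_to: localhost          # run the check from the control machine
  loop: "{{ ec2.instances }}"     # ec2 is the result registered by the previous task
```

Since the play in the question already targets localhost with connection: local, the delegation is redundant there, but it keeps the task correct if it is later moved into a play that targets the new instances.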
Focusing on the wait_for test as you indicate that the rest is working.
Based on the jobs I have running I would think the issue is with the host name, not with the rest of the code. I use an Ansible server in a protected VPC that has network access to the VPC where the servers start up in, and my wait_for code looks like this (variable name updated to match yours):
- name: wait for instances to listen on port 22
  wait_for:
    delay: 10
    state: started
    host: "{{ item.private_ip }}"
    port: 22
    timeout: 300
  with_items: "{{ ec2.instances }}"
Trying to use DNS instead of an IP address has always proven to be unreliable for me - if I'm registering DNS as part of a job, it can sometimes take a minute to be resolvable (sometimes instant, sometimes not). Using the IP addresses works every time of course - as long as the networking is set up correctly.
If your Ansible server is in a different region or has to use the external IP to access the new servers, you will of course need to have the relevant security groups and add the new server(s) to those before you can use wait_for.
