How to find a string in ansible "retries and until" block? - ansible

I am installing some plugins and then checking the status in a command loop. I want to check the output of the status command and, if the plugins are not installed, install them again using retries.
- name: install plugins
  command: "run {{ item }}"
  with_items:
    - install plugins
    - status
  register: result
  until: result.stdout.find("InstallPlugin1 and InstallPlugin2") != -1
  retries: 5
  delay: 10
I am using register to save the result, and I know that in a loop register stores the per-item results in the "results" list. Now I want to check for a string in the output of the status command in until, which should be the second element of results, but I am not able to grab it.
When I use
debug: msg="{{ result['results'][1]['stdout'] }}"
I can see the output of the status command, but I don't know how to use this in until. Whenever I use results there, it gives an error. I want to use something like
until: result['results'][1]['stdout'].find("all systems go") != -1

If both run install plugins and run status return something like
installed: InstallPlugin1, InstallPlugin2
the task below will do the job
- name: install plugins
  command: "run {{ item }}"
  loop:
    - install plugins
    - status
  register: result
  until:
    - result.stdout is search('InstallPlugin1')
    - result.stdout is search('InstallPlugin2')
  retries: 5
  delay: 10
It's not possible to use the loop if only run status returns the confirmation, because the until condition is evaluated in each iteration. An option is to chain the commands in a single task. Note that the command module does not pass its arguments through a shell, so chaining with ; needs the shell module. For example
- name: install plugins
  shell: "run install plugins; run status"
  register: result
  until:
    - result.stdout is search('InstallPlugin1')
    - result.stdout is search('InstallPlugin2')
  retries: 5
  delay: 10
It's possible to test the registered result in each iteration of the loop. After the loop is done, the variable result holds the accumulated list result.results. It might be worth reviewing it.
- debug:
    var: result
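As a minimal sketch of reviewing the accumulated results after the loop, a follow-up task could check the entry that belongs to the status run. This assumes the loop order from the question, so that index 1 is the status item; the task name is illustrative only.

```yaml
# Runs after the looped task; result.results holds one entry per loop item.
- name: Verify plugins in the status output (illustrative)
  assert:
    that:
      # Index 1 is assumed to be the "status" item from the loop above.
      - "'InstallPlugin1' in result.results[1].stdout"
      - "'InstallPlugin2' in result.results[1].stdout"
```

This only reports the final state; it does not retry anything by itself.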

I think this is what you're looking for:
until: "'all systems go' in item['stdout']"
The variable you register there will be a list of the aggregate results from all iterations of the with_items loop, and what you want to condition on is the item itself. Depending on what you're doing, you might not even need to register that variable.

Related

Write task output to file while using "until"

I have an ansible task that fails about 20% of the time. It almost always succeeds if retried a couple of times. I'd like to use until to loop until the task succeeds and store the output of each attempt to a separate log file on the local machine. Is there a good way to achieve this?
For example, my task currently looks like this:
- name: Provision
  register: prov_ret
  until: prov_ret is succeeded
  retries: 2
  command: provision_cmd
I can see how to store the log output from the last retry when it succeeds, but I'd like to store it from each retry. To store from the last attempt to run the command I use:
- name: Write Log
  local_action: copy content={{ prov_ret | to_nice_json }} dest="/tmp/ansible_logs/provision.log"
It's not possible as of Ansible 2.9. The until loop doesn't preserve the results of each attempt the way loop does. Once the task terminates, all variables inside it are gone except the registered one.
To see what's going on in the loop, write a log from within the command on the remote host. For example, suppose the command provision_cmd writes a log to /scratch/provision_cmd.log. Run it in a block and display the log in the rescue section.
- block:
    - name: Provision
      command: provision_cmd
      register: prov_ret
      until: prov_ret is succeeded
      retries: 2
  rescue:
    - name: Display registered variable
      debug:
        var: prov_ret
    - name: Read the log
      slurp:
        src: /scratch/provision_cmd.log
      register: provision_cmd_log
    - name: Display log
      debug:
        msg: "{{ msg.split('\n') }}"
      vars:
        msg: "{{ provision_cmd_log.content|b64decode }}"
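If provision_cmd does not write a log of its own, one possible variation is to append the output of every attempt to a file on the remote host and copy that file to the control machine afterwards. This is only a sketch: the path /scratch/provision_attempts.log is a made-up name, and the result is one accumulated file rather than a separate file per attempt.

```yaml
- block:
    - name: Provision, appending each attempt's output to a remote log
      # Redirection needs the shell module; the exit code of provision_cmd is preserved.
      # /scratch/provision_attempts.log is a hypothetical path.
      shell: provision_cmd >> /scratch/provision_attempts.log 2>&1
      register: prov_ret
      until: prov_ret is succeeded
      retries: 2
  always:
    - name: Copy the accumulated log to the control machine
      fetch:
        src: /scratch/provision_attempts.log
        dest: /tmp/ansible_logs/
        flat: true
```

With flat: true and several target hosts, each host would write to the same local file name, so a per-host dest would be needed in that case.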

ansible sh module does not report output until shell completes

How can I see realtime output from a shell script run by ansible?
I recently refactored a wait script to use multiprocessing and provide realtime status of the various service wait checks for multiple services.
As a stand alone script, it works as expecting providing status for each thread as they wait in parallel for various services to get stable.
In ansible, the output pauses until the python script completes (or terminates) and only then is displayed. While that is OK, I'd rather find a way to display output sooner. I've tried setting PYTHONUNBUFFERED prior to running ansible-playbook via jenkins withEnv, but that doesn't seem to accomplish the goal either.
- name: Wait up to 30m for service stability
  shell: "{{ venv_dir }}/bin/python3 -u wait_service_state.py"
  args:
    chdir: "{{ script_dir }}"
What's the standard ansible pattern for displaying output for a long running script?
My guess is that I could follow one of these routes
Not use ansible
execute in a docker container and report output via ansible provided this doesn't hit the identical class of problem
Output to a file from the script and have either ansible thread or Jenkins pipeline thread watch and tail the file (both seem kludgy as this blurs the separation of concerns coupling my build server to the deploy scripts a little too tightly)
You can use asynchronous actions and polling: https://docs.ansible.com/ansible/latest/user_guide/playbooks_async.html
main.yml
- name: Run items asynchronously in batch of two items
  vars:
    sleep_durations:
      - 1
      - 2
      - 3
      - 4
      - 5
    durations: "{{ item }}"
  include_tasks: execute_batch.yml
  loop: "{{ sleep_durations | batch(2) | list }}"
execute_batch.yml
- name: Async sleeping for batched_items
  command: sleep {{ async_item }}
  async: 45
  poll: 0
  loop: "{{ durations }}"
  loop_control:
    loop_var: "async_item"
  register: async_results
- name: Check sync status
  async_status:
    jid: "{{ async_result_item.ansible_job_id }}"
  loop: "{{ async_results.results }}"
  loop_control:
    loop_var: "async_result_item"
  register: async_poll_results
  until: async_poll_results.finished
  retries: 30
"What's the standard ansible pattern for displaying output for a long running script?"
The standard Ansible pattern for a long-running script is to start it with async and then loop on async_status until it finishes. Customization of the until loop's output is limited. See Feature request: until for blocks #16621.
ansible-runner is another route that might be followed.
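Applied to the wait script from the question, the async pattern might look roughly like the sketch below. The variable names wait_job and wait_result are invented for illustration, and the numbers are chosen so that 180 polls with a 10 second delay cover the 30 minute limit; it reports progress per poll, not the script's own output in real time.

```yaml
- name: Start the wait script in the background
  shell: "{{ venv_dir }}/bin/python3 -u wait_service_state.py"
  args:
    chdir: "{{ script_dir }}"
  async: 1800        # allow the job up to 30 minutes
  poll: 0            # do not wait here; return immediately
  register: wait_job # hypothetical variable name

- name: Poll until the wait script finishes
  async_status:
    jid: "{{ wait_job.ansible_job_id }}"
  register: wait_result # hypothetical variable name
  until: wait_result.finished
  retries: 180
  delay: 10
```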

Checking the status of Ansible processes

The machine I am targeting should, in theory, have a process running for each individual client called 'marketaccess {client_name}' and I want to ensure that this process is running. Ansible is proving very challenging for checking if processes are running. Below is the playbook I am trying to use to see if there is a process running on a given machine. I plan to then run a conditional on the 'stdout' and say that if it does not contain the customer's name then run a restart process script against that given customer. The issue is that when I run this playbook it tells me that the dictionary object has no attribute 'stdout' yet when I remove the '.stdout' it runs fine and I can clearly see the stdout value for service_status.
- name: Check if process for each client exists
  shell: ps aux | grep {{ item.key|lower }}
  ignore_errors: yes
  changed_when: false
  register: service_status
  with_dict: "{{ customers }}"
- name: Report status of service
  debug:
    msg: "{{ service_status.stdout }}"
The problem is that service_status is the result of a looped task, so it has no top-level stdout. Instead it has a service_status.results list that contains the result of every iteration.
To see stdout for every iteration, you can use:
- name: Report status of service
  debug:
    msg: "{{ item.stdout }}"
  with_items: "{{ service_status.results }}"
But you may want to read this note about idempotent shell tasks and rewrite your code as a clean single task.
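One way such a cleaner check could look is sketched below, using pgrep instead of ps piped into grep, so the grep process itself is never matched. The restart script path is purely hypothetical, and the process name pattern is taken from the question's description.

```yaml
- name: Check if process for each client exists
  # pgrep -f matches against the full command line; rc is 1 when nothing matches.
  command: pgrep -f "marketaccess {{ item.key | lower }}"
  register: service_status
  changed_when: false
  failed_when: service_status.rc not in [0, 1]
  with_dict: "{{ customers }}"

- name: Restart clients whose process is missing
  # restart_client.sh is a hypothetical script name used only for illustration.
  command: /opt/scripts/restart_client.sh {{ item.item.key }}
  when: item.rc == 1
  with_items: "{{ service_status.results }}"
```

In the second task, item is one entry of service_status.results, and item.item is the original customers entry that produced it.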

conditional execution of playbook based on exit status of the command module?

I am trying to write a playbook which would execute some tasks only if a certain package is installed on the hosts.
Is it possible to register the output from a command module and run the tasks depending upon the exit status of the command ?
Something like this:
You are on the right path. If httpd doesn't exist, the command returns a non-zero exit code and the playbook execution will fail. You can use ignore_errors to continue execution and then run subsequent tasks based on the return code stored in httpd_result. I have given an example below:
- hosts: localhost
  tasks:
    - command: "which httpd"
      register: httpd_result
      ignore_errors: true
    - debug: msg="found httpd"
      when: httpd_result.rc == 0
    - debug: msg="not found httpd"
      when: httpd_result.rc != 0
Here, instead of debug statements, you can put whatever conditional tasks you need to run. Hope this helps.
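If the question is whether a package is installed, rather than whether a binary is on the PATH, another option is the package_facts module. The sketch below assumes the package is actually named httpd on the target distribution.

```yaml
- hosts: localhost
  tasks:
    - name: Gather the list of installed packages
      package_facts:
        manager: auto
    # ansible_facts.packages is a dict keyed by package name.
    - debug: msg="httpd package is installed"
      when: "'httpd' in ansible_facts.packages"
```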

Ansible integer variables in YAML

I'm using Ansible to deploy a webapp. I'd like to wait for the application to be running by checking that a given page returns a JSON with a given key/value.
I want the task to be tried a few times before failing. I'm therefore using the combination of the until/retries/delay keywords.
The issue is, I want the number of retries to be taken from a variable. If I write:
retries: {{apache_test_retries}}
I fall into the usual YAML gotcha (http://docs.ansible.com/YAMLSyntax.html#gotchas).
If, instead, I write:
retries: "{{apache_test_retries}}"
I'm told the value is not an integer.
ValueError: invalid literal for int() with base 10: '{{apache_test_retries}}'
Here is my full code:
- name: Wait for the application to be running
  local_action:
    uri
    url=http://{{webapp_url}}/health
    timeout=60
  register: res
  sudo: false
  when: updated.changed and apache_test_url is defined
  until: res.status == 200 and res['json'] is defined and res['json']['status'] == 'UP'
  retries: "{{apache_test_retries}}"
  delay: 1
Any idea on how to work around this issue? Thanks.
I had the same issue and tried a bunch of things that didn't work, so for some time I just worked around it without using a variable. I have since found the answer, so here it is for everyone who runs into this.
Daniel's solution should indeed work:
retries: "{{ apache_test_retries | int }}"
However, it won't work on slightly older versions of Ansible, so make sure you update. I tested on 1.8.4, where it works; it doesn't on 1.8.2.
This was the original bug on ansible:
https://github.com/ansible/ansible/issues/5865
You should be able to convert it to an integer with the int filter:
retries: "{{ apache_test_retries | int }}"
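Put together with the task from the question, this might look like the sketch below. It is written in the current module syntax (uri with YAML arguments and delegate_to instead of local_action), which is an assumption about the Ansible version in use; the when condition from the question is left out for brevity.

```yaml
- name: Wait for the application to be running
  uri:
    url: "http://{{ webapp_url }}/health"
    timeout: 60
    return_content: true
  delegate_to: localhost
  register: res
  until: res.status == 200 and res.json is defined and res.json.status == 'UP'
  # Cast at the point of use so retries receives an integer.
  retries: "{{ apache_test_retries | int }}"
  delay: 1
```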
I had the same problem and the solutions suggested here didn't work. I didn't try Tim Diels' suggestion though.
Here's what worked for me:
vars:
  capacity: "{{ param_capacity | default(16) }}"
tasks:
  - name: some task
    ...
    when: item.usage < (capacity | int)
    loop:
      ...
And here's what I was trying to do:
vars:
  capacity: "{{ (param_capacity | default(16)) | int }}"
tasks:
  - name: some task
    ...
    when: item.usage < capacity
    loop:
      ...
I found this issue on GitHub, about this same problem, and actually the intended way to use this filter is applying it where you use the variable, not where you declare it.
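To see the difference on a given installation, a small throwaway play can print the type of the variable as declared and after casting at the point of use. This is only a diagnostic sketch; the reported types can depend on the Ansible version and on settings such as jinja2_native, so no particular output is assumed here.

```yaml
- hosts: localhost
  gather_facts: false
  vars:
    capacity: "{{ param_capacity | default(16) }}"
  tasks:
    - name: Show the type before and after casting where the variable is used
      debug:
        msg: "declared: {{ capacity | type_debug }}, cast at use: {{ capacity | int | type_debug }}"
```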
I have faced a similar issue, in my case I wanted to restart celeryd service. It sometimes takes a very long time to restart and I wanted to give it max 30 seconds for a soft restart, then force-restart it. I used async for this (polling for restart result every 5 seconds).
celery/handlers/main.yml
- name: restart celeryd
  service:
    name=celeryd
    state=restarted
  register: celeryd_restart_result
  ignore_errors: true
  async: "{{ async_val | default(30) }}"
  poll: 5
- name: check celeryd restart result and force restart if needed
  shell: service celeryd kill && service celeryd start
  when: celeryd_restart_result|failed
I then use the above in the playbook as handlers for a task (restart celeryd is always first in the notify list).
In your case something like the tasks below could possibly work. I haven't checked whether it does, but it might give you an idea for solving this in a different way. Also, since you will be ignoring errors in the first task, you need to make sure that things are fine in the second:
- name: Poll to check if the application is running
  local_action:
    uri
    url=http://{{webapp_url}}/health
    timeout=60
  register: res
  sudo: false
  when: updated.changed and apache_test_url is defined
  # Fail if any one of the health conditions is not met
  failed_when: res.status != 200 or res['json'] is not defined or res['json']['status'] != 'UP'
  ignore_errors: true
  async: "{{ apache_test_retries | default(60) }}"
  poll: 1
# Task above will exit as early as possible on success
# It will keep trying for 60 secs, polling every 1 sec
# You need to make sure it's fine **again** because it has ignore_errors: true
- name: Final UP check
  local_action:
    uri
    url=http://{{webapp_url}}/health
    timeout=60
  register: res
  sudo: false
  when: updated.changed and apache_test_url is defined
  failed_when: res.status != 200 or res['json'] is not defined or res['json']['status'] != 'UP'
Hope it helps you solve the issue with a bug in retries.

Resources