Checking the status of Ansible processes

The machine I am targeting should, in theory, have a process running for each individual client called 'marketaccess {client_name}', and I want to ensure that each of these processes is running. Ansible is proving very challenging for checking whether processes are running. Below is the playbook I am trying to use to see if there is a process running on a given machine. I plan to then run a conditional on the 'stdout' and say that if it does not contain the customer's name, run a restart script against that customer. The issue is that when I run this playbook it tells me that the dictionary object has no attribute 'stdout', yet when I remove '.stdout' it runs fine and I can clearly see the stdout value for service_status.
- name: Check if process for each client exists
  shell: ps aux | grep {{ item.key | lower }}
  ignore_errors: yes
  changed_when: false
  register: service_status
  with_dict: "{{ customers }}"
- name: Report status of service
  debug:
    msg: "{{ service_status.stdout }}"

Your problem is that service_status is the result of a looped task, so it has a service_status.results list which contains the result for every iteration.
To see stdout for every iteration, you can use:
- name: Report status of service
  debug:
    msg: "{{ item.stdout }}"
  with_items: "{{ service_status.results }}"
But you may want to read this note about idempotent shell tasks and rewrite your code as a single clean task.
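As a sketch of that single-task idea (the restart script path and process name here are assumptions, not taken from the question), you could let the shell command itself decide whether a restart is needed and report changed only when it actually acted:

```yaml
# Hypothetical single task: restart only when the process is missing.
# '/opt/scripts/restart_marketaccess.sh' is an assumed restart script.
- name: Ensure marketaccess process runs for each client
  shell: |
    if ! pgrep -f "marketaccess {{ item.key | lower }}" > /dev/null; then
      /opt/scripts/restart_marketaccess.sh "{{ item.key | lower }}"
      echo RESTARTED
    fi
  register: marketaccess_check
  changed_when: "'RESTARTED' in marketaccess_check.stdout"
  with_dict: "{{ customers }}"
```

This avoids the separate check-then-restart pair and keeps the play idempotent: a run where every process is already up reports ok, not changed.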

Related

Write task output to file while using "until"

I have an Ansible task that fails about 20% of the time. It almost always succeeds if retried a couple of times. I'd like to use until to loop until the task succeeds, and to store the output of each attempt in a separate log file on the local machine. Is there a good way to achieve this?
For example, my task currently looks like this:
- name: Provision
  command: provision_cmd
  register: prov_ret
  until: prov_ret is succeeded
  retries: 2
I can see how to store the log output from the last retry when it succeeds, but I'd like to store it from each retry. To store it from the last attempt, I use:
- name: Write Log
  local_action: copy content={{ prov_ret | to_nice_json }} dest="/tmp/ansible_logs/provision.log"
It's not possible as of Ansible 2.9. The until loop doesn't preserve per-attempt results the way loop does. Once a task terminates, all variables inside the task are gone except the registered one.
To see what's going on inside the loop, have the command write a log on the remote host. For example, suppose provision_cmd writes a log to /scratch/provision_cmd.log. Run the command in a block and display the log in the rescue section.
- block:
    - name: Provision
      command: provision_cmd
      register: prov_ret
      until: prov_ret is succeeded
      retries: 2
  rescue:
    - name: Display registered variable
      debug:
        var: prov_ret
    - name: Read the log
      slurp:
        src: /scratch/provision_cmd.log
      register: provision_cmd_log
    - name: Display log
      debug:
        msg: "{{ msg.split('\n') }}"
      vars:
        msg: "{{ provision_cmd_log.content | b64decode }}"

ansible sh module does not report output until shell completes

How can I see realtime output from a shell script run by ansible?
I recently refactored a wait script to use multiprocessing and provide realtime status of the various service wait checks for multiple services.
As a standalone script, it works as expected, reporting status for each thread as the threads wait in parallel for various services to stabilize.
Under Ansible, the output is held back until the Python script completes (or terminates) and is only then displayed. While that's tolerable, I'd rather find a way to display output sooner. I've tried setting PYTHONUNBUFFERED before running ansible-playbook via Jenkins' withEnv, but that doesn't seem to accomplish the goal either:
- name: Wait up to 30m for service stability
  shell: "{{ venv_dir }}/bin/python3 -u wait_service_state.py"
  args:
    chdir: "{{ script_dir }}"
What's the standard ansible pattern for displaying output for a long running script?
My guess is that I could follow one of these routes:
Not use Ansible
Execute in a Docker container and report output via Ansible, provided this doesn't hit the same class of problem
Have the script write to a file, and have either an Ansible thread or a Jenkins pipeline thread watch and tail that file (both seem kludgy, as this blurs the separation of concerns and couples my build server to the deploy scripts a little too tightly)
You can use asynchronous actions and polling; see https://docs.ansible.com/ansible/latest/user_guide/playbooks_async.html
main.yml
- name: Run items asynchronously in batch of two items
  vars:
    sleep_durations:
      - 1
      - 2
      - 3
      - 4
      - 5
    durations: "{{ item }}"
  include_tasks: execute_batch.yml
  loop: "{{ sleep_durations | batch(2) | list }}"
execute_batch.yml
- name: Async sleeping for batched_items
  command: sleep {{ async_item }}
  async: 45
  poll: 0
  loop: "{{ durations }}"
  loop_control:
    loop_var: "async_item"
  register: async_results

- name: Check sync status
  async_status:
    jid: "{{ async_result_item.ansible_job_id }}"
  loop: "{{ async_results.results }}"
  loop_control:
    loop_var: "async_result_item"
  register: async_poll_results
  until: async_poll_results.finished
  retries: 30
"What's the standard ansible pattern for displaying output for a long running script?"
The standard Ansible pattern for displaying output from a long-running script is to start the task with async and poll: 0, then loop with until on async_status until it finishes. Customization of the until loop's output is limited; see Feature request: until for blocks #16621.
ansible-runner is another route that might be followed.

Ansible multiple when condition failing

In my playbook, the first task finds some files and, if any are found, registers them in a variable; the second task removes the files via a command passed to the shell. The issue is that the second task always errors, even when the variable cleanup is set to false. This is the playbook:
tasks:
  - name: Find tables
    find:
      paths: "{{ file_path }}"
      age: "1d"
      recurse: yes
      file_type: directory
    when: cleanup
    register: cleanup_files

  - name: cleanup tables
    shell: /bin/cleanup {{ item.path | basename }}
    with_items: "{{ cleanup_files.files }}"
    when: "cleanup or item is defined"
When cleanup is set to false, the first task is skipped but the second errors with: "failed": true, "msg": "'dict object' has no attribute 'files'".
item shouldn't be defined, as the task above didn't run; so shouldn't the second task still be skipped, since cleanup is set to false?
I've noticed that if I change the or to and in the second task, it skips the task fine. I'm not sure why.
Change the playbook to the code below (the changes are in the second task); after the code you can see the logic behind the changes:
tasks:
  - name: Find tables
    find:
      paths: "/tmp"
      age: "1000d"
      recurse: yes
      file_type: directory
    when: cleanup
    register: cleanup_files

  - debug: var=cleanup_files

  - name: cleanup tables
    debug: msg="file= {{ item.path | basename }}"
    when: "cleanup_files.files is defined"
    with_items: "{{ cleanup_files.files }}"
When you execute with cleanup=false, the find task still registers its result in cleanup_files, but you will notice it has no cleanup_files.files attribute, because the task was skipped. When you execute with cleanup=true, you get cleanup_files.files; it will be empty if no files meet the find criteria.
So the second task only needs to know whether cleanup_files.files is defined; if it is, it can proceed to run. If no files met the criteria, the with_items clause handles it properly (no files => no iterations).
I have added a debug task so you can run it and inspect the structure of cleanup_files when:
cleanup=true
cleanup=false
I think you need to change the second when to
when: "cleanup and cleanup_files.files is defined"
You might also consider making cleanup a tag.
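Another option, sketched below using the default filter, sidesteps the undefined attribute entirely, so the task's condition can stay a simple cleanup check:

```yaml
# When the find task was skipped, cleanup_files has no .files attribute;
# default([]) substitutes an empty list, giving zero iterations.
- name: cleanup tables
  shell: /bin/cleanup {{ item.path | basename }}
  with_items: "{{ cleanup_files.files | default([]) }}"
  when: cleanup
```

This keeps the intent explicit: the task runs only when cleanup is true, and an absent or empty result simply means nothing to iterate over.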

conditionally run tasks when given multiple input to a variable

I have written an Ansible playbook which runs fine when there is only one input to the variable:
---
- hosts: ListA
  vars:
    app_dir: /tmp
    service_name: exampleAAA
  roles:
    - prechecks
Below is the task I am using, which works when only one service is defined for service_name:
---
- name: check service status
  command: "{{ app_dir }}/app-name {{ item }} status"
  with_items: "{{ service_name }}"
  ignore_errors: yes
  register: service_status

- name: starting service if it's in failed state
  set_fact: serviceTostart={{ item }}
  with_items: "{{ service_name }}"
  when: service_status | failed

- shell: "{{ app_dir }}/app-name {{ serviceTostart }} start"
  when: service_status | failed
As per my use case, I need this to work with the following:
vars:
  service_name:
    - exampleAAA
    - exampleBBB
    - exampleCCC
When I run the playbook after defining multiple service_name entries, it shows a failed service status in the check service status step but says ok in the rest of the steps, and when I check the services afterwards nothing has changed. How can I make it work for multiple service_names?
Here is what the script should do (I am stuck on points 2 and 3; can someone please let me know what needs to be done to make it work):
1. The script checks the status of all the services mentioned (it does this correctly).
2. If one of the services shows as stopped, it should go to the tasks which run the command to bring that particular service back up.
3. If the service still does not come up after one start, the script should fail (I have yet to write the code for this part).
Honestly the answer to your question is in the documentation: Using register with a loop.
- name: check service status
  command: "{{ app_dir }}/app-name {{ item }} status"
  with_items: "{{ service_name }}"
  ignore_errors: yes
  register: service_status

- shell: "{{ app_dir }}/app-name {{ item.item }} start"
  when: item | failed
  with_items: "{{ service_status.results }}"

Ansible: Different variables for whole inventory

I'm trying to get this to work for more than just one host. Is it possible?
As written, it will of course fail on other hosts, because the process ID is different on each host.
- name: Fetching PID file from remote server
  fetch: src="some.pid" dest=/tmp/ flat=yes fail_on_missing=yes
  register: result
  ignore_errors: True

- name: Is pid_file matching process ?
  wait_for: path=/proc/{{ lookup('file', '/tmp/some.pid') }}/status state=present
  when: result | success
  register: result2
  ignore_errors: True
The formal answer to your question: to make it work on multiple hosts at once, use a templated filename or flat=no, e.g.:
- name: Fetching PID file from remote server
  fetch: src="some.pid" dest="/tmp/{{ inventory_hostname }}.pid" flat=yes fail_on_missing=yes
  register: result
  ignore_errors: True

- name: Is pid_file matching process ?
  wait_for: path="/proc/{{ lookup('file', '/tmp/' + inventory_hostname + '.pid') }}/status" state=present
  when: result | success
  register: result2
  ignore_errors: True
But a better way is to replace these tasks with a single shell command that checks everything at once, without fetching any files to the Ansible control host.
You can also use retry/until clauses with commands to wait for specific command result.
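A minimal sketch of that single-command idea, assuming the some.pid path from the question: check the PID against /proc directly on each remote host, so nothing needs to be fetched and each host checks its own PID:

```yaml
# Succeeds only if the PID recorded in some.pid is a running process
# on this host; no files are copied back to the control host.
- name: Check that the PID in some.pid belongs to a running process
  shell: test -d "/proc/$(cat some.pid)"
  register: pid_check
  changed_when: false
  ignore_errors: True
```

Combined with retries/until, this covers the wait-for-process case in one task per host.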
