Ansible shell module does not report output until the script completes

How can I see real-time output from a shell script run by Ansible?
I recently refactored a wait script to use multiprocessing and report real-time status of the wait checks for multiple services.
As a standalone script, it works as expected, reporting status for each thread as they wait in parallel for the various services to become stable.
In Ansible, the output is held back until the Python script completes (or terminates) and only then displayed. While that is OK, I'd rather find a way to display output sooner. I've tried setting PYTHONUNBUFFERED before running ansible-playbook via Jenkins withEnv, but that doesn't seem to accomplish the goal either.
- name: Wait up to 30m for service stability
  shell: "{{ venv_dir }}/bin/python3 -u wait_service_state.py"
  args:
    chdir: "{{ script_dir }}"
What's the standard Ansible pattern for displaying output from a long-running script?
My guess is that I could follow one of these routes:
Not use Ansible
Execute in a Docker container and report output via Ansible, provided this doesn't hit the same class of problem
Output to a file from the script and have either an Ansible thread or a Jenkins pipeline thread watch and tail the file (both seem kludgy, as this blurs the separation of concerns and couples my build server to the deploy scripts a little too tightly)

You can use asynchronous tasks with polling: https://docs.ansible.com/ansible/latest/user_guide/playbooks_async.html
main.yml
- name: Run items asynchronously in batch of two items
  vars:
    sleep_durations:
      - 1
      - 2
      - 3
      - 4
      - 5
    durations: "{{ item }}"
  include_tasks: execute_batch.yml
  loop: "{{ sleep_durations | batch(2) | list }}"
execute_batch.yml
- name: Async sleeping for batched_items
  command: sleep {{ async_item }}
  async: 45
  poll: 0
  loop: "{{ durations }}"
  loop_control:
    loop_var: "async_item"
  register: async_results
- name: Check sync status
  async_status:
    jid: "{{ async_result_item.ansible_job_id }}"
  loop: "{{ async_results.results }}"
  loop_control:
    loop_var: "async_result_item"
  register: async_poll_results
  until: async_poll_results.finished
  retries: 30

"What's the standard ansible pattern for displaying output for a long running script?"
The standard Ansible pattern for showing progress of a long-running script is to start it asynchronously (async with poll: 0) and then loop on async_status until the job finishes. Customization of the until loop's output is limited. See Feature request: until for blocks #16621.
ansible-runner is another route that could be followed.
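Applied to the wait script from the question, that pattern might look like the minimal sketch below. It assumes the script's output is redirected to a log file (wait_service_state.log is a hypothetical name, not from the question) and that polling every 10 seconds is acceptable. The polling task only prints a retry line on each check; it does not stream the script's own output.

```yaml
# Sketch only: the log file name and the polling interval are assumptions.
- name: Start the wait script in the background
  shell: "{{ venv_dir }}/bin/python3 -u wait_service_state.py > wait_service_state.log 2>&1"
  args:
    chdir: "{{ script_dir }}"
  async: 1800        # allow up to 30 minutes, matching the original task name
  poll: 0            # do not wait here; return immediately with a job id
  register: wait_job

- name: Poll until the wait script finishes
  async_status:
    jid: "{{ wait_job.ansible_job_id }}"
  register: wait_result
  until: wait_result.finished
  retries: 180       # 180 checks x 10 seconds = 30 minutes
  delay: 10
```

Each retry shows up in the playbook output as it happens, so the Jenkins console at least shows that the job is still alive. The log file on the target could then be read with slurp or tailed separately if the script's own status lines are needed.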

Related

What's the fastest method to add Linux users with Ansible?

I have a users.yaml file with information about 400+ users. I need Ansible to create these users during provisioning. I tried the async keyword (if that's the right word to use; tell me if I'm wrong) with poll: 15, but it takes ~10 minutes.
- name: Add FTP users asynchronously
  ansible.builtin.user:
    name: "{{ item.name }}"
    home: "{{ item.home }}"
    shell: /sbin/nologin
    groups: ftp-users
    create_home: yes
    append: no
  loop: "{{ ftp_users }}"
  async: 60
  poll: 15
  tags: users
I also tried using poll: 0, but then many users aren't created.
The way you currently use async suits a single long-running task where you want to minimize the chance of your connection being dropped by a timeout. You are asking Ansible to start a job, disconnect from the target, and then reconnect every 15 seconds to check whether the job is done (until the 60-second timeout is reached). Nothing is launched in parallel: the next iteration of the loop only starts when the current one is done.
What you want instead is to run those tasks in parallel as fast as possible and check back later whether they are done. For that, use poll: 0 on your task and check for completion later with the async_status module, as described in the Ansible async guide. Note that you also need to clean up the async job cache, as Ansible will not do it automagically for you in that case.
In your case, this would give:
- name: Add FTP users asynchronously
  ansible.builtin.user:
    name: "{{ item.name }}"
    home: "{{ item.home }}"
    shell: /sbin/nologin
    groups: ftp-users
    create_home: yes
    append: no
  loop: "{{ ftp_users }}"
  async: 60
  poll: 0
  register: add_user
- name: Wait until all commands are done
  async_status:
    jid: "{{ item.ansible_job_id }}"
  register: async_poll_result
  until: async_poll_result.finished
  retries: 60
  delay: 1
  loop: "{{ add_user.results }}"
- name: clean async job cache
  async_status:
    jid: "{{ item.ansible_job_id }}"
    mode: cleanup
  loop: "{{ add_user.results }}"
That said, although this is a direct answer on how to use async for parallel jobs, I'm not entirely sure it will fix your actual performance problem, which could come from other issues (slow DNS, slow network, pipelining not enabled where it could be, SSH master connection not configured...).

Write task output to file while using "until"

I have an ansible task that fails about 20% of the time. It almost always succeeds if retried a couple of times. I'd like to use until to loop until the task succeeds and store the output of each attempt to a separate log file on the local machine. Is there a good way to achieve this?
For example, my task currently looks like this:
- name: Provision
  register: prov_ret
  until: prov_ret is succeeded
  retries: 2
  command: provision_cmd
I can see how to store the log output from the last retry when it succeeds, but I'd like to store it from each retry. To store from the last attempt to run the command I use:
- name: Write Log
  local_action: copy content={{ prov_ret | to_nice_json }} dest="/tmp/ansible_logs/provision.log"
It's not possible as of Ansible 2.9. The until loop doesn't preserve per-attempt results the way loop does. Once the task terminates, all variables inside it are gone except the registered one, which holds only the last attempt.
To see what's going on in the loop, write a log from inside the command on the remote host. For example, suppose the command provision_cmd writes a log to /scratch/provision_cmd.log. Run it in a block and display the log in the rescue section.
- block:
    - name: Provision
      command: provision_cmd
      register: prov_ret
      until: prov_ret is succeeded
      retries: 2
  rescue:
    - name: Display registered variable
      debug:
        var: prov_ret
    - name: Read the log
      slurp:
        src: /scratch/provision_cmd.log
      register: provision_cmd_log
    - name: Display log
      debug:
        msg: "{{ msg.split('\n') }}"
      vars:
        msg: "{{ provision_cmd_log.content|b64decode }}"
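If provision_cmd does not write a log of its own, one way to keep the output of every attempt is to append it to a file on the remote host and copy that file to the control machine afterwards. The sketch below is an assumption-laden variant of the tasks above: it switches to the shell module so redirection works, reuses the /scratch/provision_cmd.log path purely as an example, and uses always so the log is fetched whether or not provisioning finally succeeds.

```yaml
# Sketch only: the log path and the attempt marker line are illustrative.
- block:
    - name: Provision, appending each attempt's output to one remote log
      shell: |
        echo "=== attempt started $(date) ===" >> /scratch/provision_cmd.log
        provision_cmd >> /scratch/provision_cmd.log 2>&1
      register: prov_ret
      until: prov_ret is succeeded
      retries: 2
  always:
    - name: Copy the accumulated log to the control machine
      fetch:
        src: /scratch/provision_cmd.log
        # Without flat, fetch stores the file under dest/<inventory_hostname>/...
        dest: /tmp/ansible_logs/
```

This yields one log with a marker per attempt rather than a separate file per attempt. The exit status of the shell task is that of provision_cmd, the last command in the script, so the until condition behaves as before.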

Run Ansible Tasks for several Groups in Parallel

We have many similar hosts that are grouped to specific types.
Every group has several hosts in it, mostly 2 to 8 for scalability within the type.
Now we need to run the same tasks/role on all these hosts, serialised within each group but with all groups running at the same time.
This should be much faster than running all groups (about 10 currently) one after another.
Is this possible today with Ansible?
Maybe. I'm afraid I do not have the ability to test this idea, but here goes....
Let's say you have GroupA and GroupB. To ping each host in a group serially, but have the groups run in parallel, you could try this hideous construct:
---
- hosts: localhost
  tasks:
    - ping:
      delegate_to: "{{ item }}"
      with_items: "{{ groups['groupA'] }}"
      forks: 1
      async: 0
      poll: 0
    - ping:
      delegate_to: "{{ item }}"
      with_items: "{{ groups['groupB'] }}"
      forks: 1
      async: 0
      poll: 0
Ansible is still going to show the task output separately.
When I ran this, files were created in /home/ansible/.ansible_async. Those files show the task start times, and it looked like it worked. To verify, I ran shell: sleep 5 instead of ping:, and saw the start times in those files properly interleaved.
Good luck.

How to find a string in ansible "retries and until" block?

I am installing some plugins and then checking the status in a command loop. I want to check the output of the status command and, if the plugins are not installed, install them again using retries.
- name: install plugins
  command: "run {{ item }}"
  with_items:
    - install plugins
    - status
  register: result
  until: result.stdout.find("InstallPlugin1 and InstallPlugin2") != -1
  retries: 5
  delay: 10
I am using register to save the result, and I know that for a loop register saves the per-item results in the results list. Now I want to check for a string in the output of the status command in until, which should be the second entry of results, but I am not able to grab it.
When I use
debug: msg="{{ result['results'][1]['stdout'] }}"
I can see the output of the status command, but I don't know how to use this in until. Whenever I use results there, it gives an error. I want to use something like
until: result['results'][1]['stdout'].find("all systems go") != -1
If both run install plugins and run status return something like
installed: InstallPlugin1, InstallPlugin2
the task below will do the job
- name: install plugins
  command: "run {{ item }}"
  loop:
    - install plugins
    - status
  register: result
  until:
    - result.stdout is search('InstallPlugin1')
    - result.stdout is search('InstallPlugin2')
  retries: 5
  delay: 10
It's not possible to use the loop if only run status returns the confirmation, because the until statement is evaluated in each iteration. An option would be to concatenate the commands. For example
- name: install plugins
  # shell rather than command: the ';' separator is only interpreted by a shell
  shell: "run install plugins; run status"
  register: result
  until:
    - result.stdout is search('InstallPlugin1')
    - result.stdout is search('InstallPlugin2')
  retries: 5
  delay: 10
It's possible to test the registered result in each iteration. After the loop is done, the variable result holds the accumulated result.results. It might be worth reviewing it.
- debug:
    var: result
I think this is what you're looking for:
until: "'all systems go' in item['stdout']"
The registered variable you have there will be a list of the aggregate results from all iterations of the with_items loop, and what you want to condition on is the item itself. Depending on what you're doing, you might not even need to register that variable.

Checking the status of Ansible processes

The machine I am targeting should, in theory, have a process running for each individual client called 'marketaccess {client_name}' and I want to ensure that this process is running. Ansible is proving very challenging for checking if processes are running. Below is the playbook I am trying to use to see if there is a process running on a given machine. I plan to then run a conditional on the 'stdout' and say that if it does not contain the customer's name then run a restart process script against that given customer. The issue is that when I run this playbook it tells me that the dictionary object has no attribute 'stdout' yet when I remove the '.stdout' it runs fine and I can clearly see the stdout value for service_status.
- name: Check if process for each client exists
  shell: ps aux | grep {{ item.key|lower }}
  ignore_errors: yes
  changed_when: false
  register: service_status
  with_dict: "{{ customers }}"
- name: Report status of service
  debug:
    msg: "{{ service_status.stdout }}"
Your problem is that service_status is the result of a looped task, so it has no top-level stdout. Instead it has a service_status.results list that contains the result of every iteration.
To see stdout for every iteration, you can use:
- name: Report status of service
  debug:
    msg: "{{ item.stdout }}"
  with_items: "{{ service_status.results }}"
But you may want to read this note about idempotent shell tasks and rewrite your code as a clean single task.
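One possible shape for such a rewrite is sketched below. It is not the answerer's code: it assumes pgrep is available on the target, that the process command line really contains 'marketaccess {client_name}' as described in the question, and that exit code 1 from pgrep means "no match". The command module is used instead of shell so that no wrapping shell process carries the search pattern in its own command line.

```yaml
# Sketch only: relies on pgrep exit codes (0 = match, 1 = no match).
- name: Check if process for each client exists
  command: pgrep -f "marketaccess {{ item.key | lower }}"
  register: service_status
  changed_when: false
  # Only treat real pgrep errors as failures, not "process not found"
  failed_when: service_status.rc not in [0, 1]
  with_dict: "{{ customers }}"

- name: Report clients with no running process
  debug:
    msg: "No process found for {{ item.item.key }}"
  with_items: "{{ service_status.results }}"
  when: item.rc == 1
```

The same when condition could guard the restart script from the question, since each entry in service_status.results carries both the original loop item (item.item) and the return code for that client.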

Resources