Ansible async_status task - error: ansible_job_id "undefined variable" - bash

I have a 3-node Ubuntu 20.04 LTS KVM Kubernetes cluster; the KVM host is also Ubuntu 20.04 LTS. I run the playbooks on the KVM host.
I have the following inventory extract:
nodes:
  hosts:
    sea_r:
      ansible_host: 192.168.122.60
    spring_r:
      ansible_host: 192.168.122.92
    island_r:
      ansible_host: 192.168.122.93
  vars:
    ansible_user: root
I have been trying a lot with async_status, but it always fails:
- name: root commands
  hosts: nodes
  tasks:
    - name: bash commands
      ansible.builtin.shell: |
        apt update
      args:
        chdir: /root
        executable: /bin/bash
      async: 2000
      poll: 2
      register: output

    - name: check progress
      ansible.builtin.async_status:
        jid: "{{ output.ansible_job_id }}"
      register: job_result
      until: job_result.finished
      retries: 200
      delay: 5
It fails with this error:
fatal: [sea_r]: FAILED! => {"msg": "The task
includes an option with an undefined variable.
The error was: 'dict object' has no attribute
'ansible_job_id' ...
If I instead try the following,
- name: root commands
  hosts: nodes
  tasks:
    - name: bash commands
      ansible.builtin.shell: |
        apt update
      args:
        chdir: /root
        executable: /bin/bash
      async: 2000
      poll: 2
      register: output

    - debug: msg="{{ output.stdout_lines }}"
    - debug: msg="{{ output.stderr_lines }}"
I get no errors.
I also tried the following variation,
- name: check progress
  ansible.builtin.async_status:
    jid: "{{ item.ansible_job_id }}"
  with_items: "{{ output }}"
  register: job_result
  until: job_result.finished
  retries: 200
  delay: 5
which was suggested as a solution to a similar error. That does not help either; I just get a slightly different error:
fatal: [sea_r]: FAILED! => {"msg": "The task includes
an option with an undefined variable. The error
was: 'ansible.utils.unsafe_proxy.AnsibleUnsafeText
object' has no attribute 'ansible_job_id' ...
At the beginning and the end of the playbook, I resume and pause my 3 KVM server nodes like so:
- name: resume vms
  hosts: local_vm_ctl
  tasks:
    - name: resume vm servers
      shell: |
        virsh resume kub3
        virsh resume kub2
        virsh resume kub1
        virsh list --state-paused --state-running
      args:
        chdir: /home/bi
        executable: /bin/bash
      environment:
        LIBVIRT_DEFAULT_URI: qemu:///system
      register: output

    - debug: msg="{{ output.stdout_lines }}"
    - debug: msg="{{ output.stderr_lines }}"
and like so:
- name: pause vms
  hosts: local_vm_ctl
  tasks:
    - name: suspend vm servers
      shell: |
        virsh suspend kub3
        virsh suspend kub2
        virsh suspend kub1
        virsh list --state-paused --state-running
      args:
        chdir: /home/bi
        executable: /bin/bash
      environment:
        LIBVIRT_DEFAULT_URI: qemu:///system
      register: output

    - debug: msg="{{ output.stdout_lines }}"
    - debug: msg="{{ output.stderr_lines }}"
but I don't see how these plays could have anything to do with said error.
Any help will be much appreciated.

You get an undefined error for your job id because:
- You use poll: X on your initial task, so Ansible reconnects every X seconds to check whether the task is finished.
- By the time Ansible exits that task and enters your next async_status task, the job is already done. And since you used a non-zero poll value, the async status cache is automatically cleared.
- Since the cache was cleared, the job id does not exist anymore.
Your scenario above is meant to avoid timeouts on the target for long-running tasks, not to run tasks concurrently and check their status at a later point. For that second requirement, you need to run the async task with poll: 0 and clean up the cache yourself.
See the documentation for more explanation on the above concepts:
ansible async guide
ansible async_status module
I took your task above and fixed it to use the dedicated apt module (note that you could add a name option to the module with one package or a list of packages, and Ansible would then do the cache update and the install in a single step; see the sketch after the example below). Also, retries * delay on the async_status task should be equal to or greater than async on the initial task if you want to make sure you won't miss the end.
- name: Update apt cache
  ansible.builtin.apt:
    update_cache: true
  async: 2000
  poll: 0
  register: output

- name: check progress
  ansible.builtin.async_status:
    jid: "{{ output.ansible_job_id }}"
  register: job_result
  until: job_result.finished
  retries: 400
  delay: 5

- name: clean async job cache
  ansible.builtin.async_status:
    jid: "{{ output.ansible_job_id }}"
    mode: cleanup
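For reference, here is a sketch of the single-step variant mentioned above, letting the apt module update the cache and install packages in one task (the package names are placeholders, not from the original question):

- name: Update apt cache and install packages in one step
  ansible.builtin.apt:
    name:
      - curl   # placeholder package
      - htop   # placeholder package
    state: present
    update_cache: true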
This pattern is more useful to launch a bunch of long-lasting tasks in parallel. Here is a useless yet functional example:
- name: launch some loooooong tasks
  shell: "{{ item }}"
  loop:
    - sleep 30
    - sleep 20
    - sleep 35
  async: 100
  poll: 0
  register: long_cmd

- name: wait until all commands are done
  async_status:
    jid: "{{ item.ansible_job_id }}"
  register: async_poll_result
  until: async_poll_result.finished
  retries: 50
  delay: 2
  loop: "{{ long_cmd.results }}"

- name: clean async job cache
  async_status:
    jid: "{{ item.ansible_job_id }}"
    mode: cleanup
  loop: "{{ long_cmd.results }}"
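For reference: the async job cache referred to above is, by default, a set of per-job result files under ~/.ansible_async on the target host; mode: cleanup simply removes the file matching the given jid.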

You have poll: 2 on your task, which tells Ansible to internally poll the async job every 2 seconds and return the final status in the registered variable. In order to use async_status you should set poll: 0 so that the task does not wait for the job to finish.
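Applied to the question's first play, the only change needed on the initial task is the poll value; a minimal sketch:

- name: bash commands
  ansible.builtin.shell: |
    apt update
  args:
    chdir: /root
    executable: /bin/bash
  async: 2000
  poll: 0   # fire and forget; a later async_status task does the waiting
  register: output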

Related

What's the fastest method to add Linux users with Ansible?

I have a users.yaml file with information regarding 400+ users. I need Ansible to create these users during provisioning. I tried with the async keyword (if that's the right word to use, tell me if I'm wrong) and poll: 15, but it takes ~10 minutes.
- name: Add FTP users asynchronously
  ansible.builtin.user:
    name: "{{ item.name }}"
    home: "{{ item.home }}"
    shell: /sbin/nologin
    groups: ftp-users
    create_home: yes
    append: no
  loop: "{{ ftp_users }}"
  async: 60
  poll: 15
  tags: users
I also tried using poll: 0, but then many users aren't created.
Your actual use of async is adapted to a single long-running task use case where you want to minimize the chance of getting your connection kicked because of a timeout. You are asking Ansible to start a job, disconnect from the target, and then reconnect every 15 seconds to check if the job is done (or until you reach the 60-second timeout). Nothing is launched in parallel: the next iteration of the loop only starts when the current one is done.
What you want instead is to run those tasks in parallel as fast as possible and check back later whether they are done. In this case, you have to use poll: 0 on your task and later check for completion with the async_status module, as described in the ansible async guide. Note that you also need to clean up the async job cache, as Ansible will not do it automagically for you in that case.
In your case, this would give:
- name: Add FTP users asynchronously
  ansible.builtin.user:
    name: "{{ item.name }}"
    home: "{{ item.home }}"
    shell: /sbin/nologin
    groups: ftp-users
    create_home: yes
    append: no
  loop: "{{ ftp_users }}"
  async: 60
  poll: 0
  register: add_user

- name: Wait until all commands are done
  async_status:
    jid: "{{ item.ansible_job_id }}"
  register: async_poll_result
  until: async_poll_result.finished
  retries: 60
  delay: 1
  loop: "{{ add_user.results }}"

- name: clean async job cache
  async_status:
    jid: "{{ item.ansible_job_id }}"
    mode: cleanup
  loop: "{{ add_user.results }}"
Meanwhile, although this is a direct answer on how to use async for parallel jobs, I'm not entirely sure it will fix your actual performance problem, which could come from other issues (slow DNS, slow network, pipelining not enabled where it could be, SSH master connection not configured...).
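On that last point, a minimal ansible.cfg sketch enabling pipelining and SSH connection reuse (illustrative values, to be tuned per environment):

[ssh_connection]
# keep a master SSH connection open and reuse it across tasks
ssh_args = -o ControlMaster=auto -o ControlPersist=60s
# run modules through the persistent connection instead of copying files first
pipelining = True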

Ansible Downloading large files

I am trying to download backup files from my websites. I have structured my playbook as follows:
site_vars.yml holds my variables:
website_backup_download:
  - name: ftp://username:userpassword@ftp.mysite1.com/backups/mysite1backup.tgz
    path: mysites/backups/www
  - name: ftp://username:userpassword@ftp.mysite2.com/backups/mysite2backup.tgz
    path: mysites/backups/www
  - name: ftp://username:userpassword@ftp.mysite3.com/backups/mysite3backup.tgz
    path: mysites/backups/www
Actual downloader playbook:
# Downloader
task:
- name: Download backups from FTP's
get_url:
url: "{{ item.name }}"
dest: "{{ item.path }}"
mode: 0750
no_log: false
ignore_errors: True
with_items:
- "{{ website_backup_download }}"
This actually works very well, but the problems begin with large backup files: the task needs to keep running until the backup file has been downloaded completely.
I can't repeat the task to complete the incomplete file or files. :)
I have tried another solution that also works well for a single site, but I can't use it for multiple downloads :(
- name: Download backups
  command: wget -c ftp://username:userpassword@ftp.mysite1.com/backups/mysite1backup.tgz
  args:
    chdir: "{{ down_path }}"
    warn: false
  register: task_result
  retries: 10
  delay: 1
  until: task_result.rc == 0
  ignore_errors: True
Thanks for your help.
I have modified the task by adding the timeout parameter for the runtime, the until parameter to wait for the download to finish, and the retries and delay parameters to retry until the condition is met.
This works for now :)
Thanks to all of you.
# Downloader
tasks:
  - name: Download backups from FTP's
    get_url:
      url: "{{ item.name }}"
      dest: "{{ item.path }}"
      mode: 0750
      timeout: 1800
    register: result
    retries: 10
    delay: 3
    until: result is succeeded
    no_log: false
    ignore_errors: True
    with_items:
      - "{{ website_backup_download }}"

How do I check if a machine is up and running using an Ansible playbook

I'm trying to write an Ansible playbook to check if a set of machines are up and running.
Let's say I've got 5 machines to test. I'm trying to understand whether I can have a playbook that captures the status (up or down) of all 5 machines by checking them one by one sequentially, without failing the play if one of the machines is down.
It's possible to use wait_for_connection in a block. For example:
- hosts: all
  gather_facts: false
  tasks:
    - block:
        - wait_for_connection:
            sleep: 1
            timeout: 10
      rescue:
        - debug:
            msg: "{{ inventory_hostname }} not connected. End of host."
        - meta: clear_host_errors
        - meta: end_host

    - debug:
        msg: "{{ inventory_hostname }} is running"

    - setup:
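An alternative sketch, assuming Ansible 2.7+ for ignore_unreachable, that reports the status without block/rescue:

- hosts: all
  gather_facts: false
  tasks:
    - name: probe host
      ansible.builtin.ping:
      register: probe
      ignore_unreachable: true

    - name: report status
      ansible.builtin.debug:
        msg: "{{ inventory_hostname }} is {{ 'down' if probe.unreachable | default(false) else 'running' }}"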

Pause time between hosts in the Ansible Inventory

I am trying the below task in my playbook, but the pause is not executed. I want the play to pause for 30 seconds after each host is deleted.
- name: delete host from the NagiosXI
  shell: curl -k -XDELETE "https://10.000.00.00/nagiosxi/api/v1/config/host?apikey=qdjcwc&pretty=1&host_name={{ item }}&applyconfig=1"
- pause:
    seconds: 120
  ignore_error: yes
  with_items:
    - "{{ groups['grp1'] }}"
Can someone tell me whether this is the right way of doing it, or propose the right way? I also tried serial: 1, but it's still not working.
You can use the pause option of loop_control under your loop:
- name: Pause
  hosts: all
  gather_facts: False
  tasks:
    - name: delete host from the NagiosXI
      shell: curl -k -XDELETE "https://10.000.00.00/nagiosxi/api/v1/config/host?apikey=qdjcwc&pretty=1&host_name={{ item }}&applyconfig=1"
      ignore_errors: True
      with_items:
        - "{{ groups['grp1'] }}"
      loop_control:
        pause: 120
Unfortunately, applying multiple tasks to with_items is not possible in Ansible at the moment, but it is still doable with the include directive. As an example, the main play file would be:
---
- hosts: localhost
  connection: local
  gather_facts: no
  remote_user: me
  tasks:
    - include: sub_play.yml nagios_host={{ item }}
      with_items:
        - host1
        - host2
        - host3
The sub_play.yml included in the main play would be:
---
- shell: echo "{{ nagios_host }}"
- pause:
    prompt: "Waiting for {{ nagios_host }}"
    seconds: 5
In this case, the include statement is executed in a loop, which runs all the tasks in sub_play.yml for each item.
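Note that the bare include directive is deprecated in modern Ansible; a sketch of the same pattern with include_tasks, using loop_control to keep the nagios_host variable name:

- hosts: localhost
  connection: local
  gather_facts: no
  tasks:
    - include_tasks: sub_play.yml
      loop:
        - host1
        - host2
        - host3
      loop_control:
        loop_var: nagios_host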

ansible: how to use the variable ${item} from with_items in notify?

I am new to Ansible and I am trying to create several virtual environments (one for each project, the list of projects being defined in a variable).
The task works well, I get all the folders; however the handler does not work: it does not init each folder with the virtual environment. The ${item} variable in the handler does not work.
How can I use a handler when I use with_items?
tasks:
  - name: create virtual env for all projects ${projects}
    file: state=directory path=${virtualenvs_dir}/${item}
    with_items: ${projects}
    notify: deploy virtual env

handlers:
  - name: deploy virtual env
    command: virtualenv ${virtualenvs_dir}/${item}
Handlers are just 'flagged' for execution once, whenever an (itemized sub-)task requests it (i.e. had changed: yes in its result).
At that time a handler runs just like the next regular task, and it doesn't know anything about the itemized loop.
A possible solution is therefore not a handler but an extra task plus a conditional.
Something like
- hosts: all
  gather_facts: false
  tasks:
    - action: shell echo {{ item }}
      with_items:
        - 1
        - 2
        - 3
        - 4
        - 5
      register: task

    - debug: msg="{{ item.item }}"
      with_items: "{{ task.results }}"
      when: item.changed == True
To sum up the previous discussion and adjusting for the modern Ansible...
- hosts: localhost
  gather_facts: false
  tasks:
    - action: shell echo {{ item }} && exit {{ item }}
      with_items:
        - 1
        - 2
        - 3
        - 4
        - 5
      register: task
      changed_when: task.rc == 3
      failed_when: no
      notify: update service

  handlers:
    - name: update service
      debug: msg="updated {{ item }}"
      with_items: >
        {{
          task.results
          | selectattr('changed')
          | map(attribute='item')
          | list
        }}
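The with_items expression above filters task.results down to the results flagged changed (here, the commands that exited with rc 3) and maps them back to their original item values, so the handler loops only over the items that actually changed.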
