Unable to grab success or failure (i.e. rc) of a long-running Ansible task in a loop - performance

Below is my code, which takes about 20 minutes in all. The task takes about 1 minute and the loop runs 20 times, so 20 × 1 = 20 minutes. I want the playbook to fail if any iteration of the shell task fails, i.e. rc != 0:
- name: Check DB connection with DBPING
  shell: "java utils.dbping ORACLE_THIN {{ item | trim }}"
  with_items: "{{ db_conn_dets }}"
  delegate_to: localhost
In order to speed up the tasks, I decided to run them in parallel and then grab the success or failure of each task, like below.
- name: Check DB connection with DBPING
  shell: "java utils.dbping ORACLE_THIN {{ item | trim }}"
  with_items: "{{ db_conn_dets }}"
  delegate_to: localhost
  async: 600
  poll: 0
  register: dbcheck

- name: Result check
  async_status:
    jid: "{{ item.ansible_job_id }}"
  register: _jobs
  until: _jobs.finished
  delay: 10
  retries: 20
  with_items: "{{ dbcheck.results }}"
Now the task Check DB connection with DBPING completes in less than a minute for the entire loop. However, the problem is that EVERY Result check task fails, even when the corresponding Check DB connection with DBPING task would eventually succeed, i.e. rc=0.
Below is one sample failing Result check task that should have been successful:
failed: [remotehost] (item={'_ansible_parsed': True, '_ansible_item_result': True, '_ansible_no_log': False, u'ansible_job_id': u'234197058177.18294', '_ansible_delegated_vars': {'ansible_delegated_host': u'localhost', 'ansible_host': u'localhost'}, u'started': 1, 'changed': True, 'failed': False, u'finished': 0, u'results_file': u'/home/ansbladm/.ansible_async/234197058177.18294', 'item': u'dbuser mypass dbhost:1521:mysid', '_ansible_ignore_errors': None}) => {
"ansible_job_id": "234197058177.18294",
"attempts": 1,
"changed": false,
"finished": 1,
"invocation": {
"module_args": {
"jid": "234197058177.18294",
"mode": "status"
}
},
"item": {
"ansible_job_id": "234197058177.18294",
"changed": true,
"failed": false,
"finished": 0,
"item": "dbuser mypass dbhost:1521:mysid",
"results_file": "/home/ansbladm/.ansible_async/234197058177.18294",
"started": 1
},
"msg": "could not find job",
"started": 1
}
Can you please let me know what the issue with my Result check task is, and how I can make sure it keeps retrying until the Check DB connection with DBPING task completes, thereby reporting success or failure based on the .rc of the shell module?
It would also be great if I could get the debug module to print all failed shell tasks of Check DB connection with DBPING.
Kindly suggest.

Although this is only a partial answer to the set of queries I had, I will accept a complete answer if one comes in the future.
Adding delegate_to: localhost to the Result check task resolves the behavior.
I'm still waiting, however, for code that shows all failed tasks in the debug output.
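A minimal sketch of what that could look like, building on the delegate_to: localhost fix above; the ignore_errors line, the selectattr chain, and the final fail task are my own additions (not from the original answer) and assume the dbcheck/_jobs names used earlier:
- name: Result check
  async_status:
    jid: "{{ item.ansible_job_id }}"
  delegate_to: localhost            # poll on the same host where the async job was started
  register: _jobs
  until: _jobs.finished
  delay: 10
  retries: 20
  with_items: "{{ dbcheck.results }}"
  ignore_errors: yes                # keep polling the remaining jobs so every failure is collected

- name: Show all failed DBPING checks
  debug:
    msg: "FAILED rc={{ item.rc | default('n/a') }} for {{ item.item.item | default('unknown item') }}"
  loop: "{{ _jobs.results | selectattr('failed', 'defined') | selectattr('failed') | list }}"

- name: Fail the play if any DBPING check failed
  fail:
    msg: "One or more DB connection checks returned rc != 0"
  when: _jobs.results | selectattr('failed', 'defined') | selectattr('failed') | list | length > 0
The explicit fail task restores the original requirement that the play fails when any shell iteration returns rc != 0, while still reporting every failure first.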

Related

How do I find which register attribute to use in Ansible?

I am testing out the results of registered variables, and they yield different attributes for various tasks: failures, msg, stderr, err, etc.
- yum:
    name: packagenotfound
    state: present
  ignore_errors: yes
  register: command_result

- debug:
    msg: "{{ command_result }}"
ok: [ansible] => {
"msg": {
"changed": false,
"failed": true,
"failures": [
"No package packagenotfound available."
],
"msg": "Failed to install some of the specified packages",
"rc": 1,
"results": []
}}
And
- lvg:
    pvs: /dev/sddnotfound
    vg: vgdata
  ignore_errors: yes
  register: command_result

- debug:
    msg: "{{ command_result }}"
ok: [ansible] => {
"msg": {
"changed": false,
"failed": true,
"msg": "Device /dev/sddnotfound not found."
}
And
- shell: thiscommandwontwork
  ignore_errors: yes
  register: command_result

- debug:
    msg: "{{ command_result }}"
ok: [ansible] => {
"msg": {
"changed": true,
"cmd": "thiscommandwontwork",
"delta": "0:00:00.002560",
"end": "2020-02-05 04:24:35.297556",
"failed": true,
"msg": "non-zero return code",
"rc": 127,
"start": "2020-02-05 04:24:35.294996",
"stderr": "/bin/sh: thiscommandwontwork: command not found",
"stderr_lines": [
"/bin/sh: thiscommandwontwork: command not found"
],
"stdout": "",
"stdout_lines": []
}
And
- lvol:
    lv: lvdata
    vg: vgroup
    size: 2000M
  ignore_errors: yes
  register: command_result

- debug:
    msg: "{{ command_result }}"
ok: [ansible] => {
"msg": {
"changed": false,
"err": " Volume group \"vgroup\" not found\n Cannot process volume group vgroup\n",
"failed": true,
"msg": "Volume group vgroup does not exist.",
"rc": 5
}
Now, if I try to use when: '"xxx" in command_result.err' with the yum task, for example, it results in a dict_object not found error.
Is there a way to find out which attribute to use without testing?
Testing is definitely the easiest and fastest way to look at the content of your registered var in several situations and to decide how to use it in your playbook.
Meanwhile, there are ways to get a global picture of what is returned in your registered var from the documentation:
There is a page on common module return values.
Modules returning specific values usually document them on each relevant doc page. Here is an example for the stat module.
You should also be aware that the global register structure changes when using a loop, by adding a top-level results list, as explained in registering variables.
Knowing what could be in your register does not mean it will be there. Your example mentions the (undocumented...) err attribute for the lvol module, which will only be available for an lvol task in error. You can work around such cases by using tests (like my_register is failed) or by defaulting values with the default filter.
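As an illustration of that last point, here is a minimal sketch (reusing the lvol example from the question) that guards the possibly-missing attribute with the default filter and only inspects it when the task actually failed:
- lvol:
    lv: lvdata
    vg: vgroup
    size: 2000M
  ignore_errors: yes
  register: command_result

# err is undocumented and only present on failure, so default it to msg
# and gate the check on the failed test instead of assuming the key exists.
- debug:
    msg: "lvol error: {{ command_result.err | default(command_result.msg) }}"
  when: command_result is failed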

Ansible: How do I execute tasks in ansible playbook in parallel with some time gap

Let me try and explain my need:
As part of the regular deployment of our application, we have a SQL script (which alters tables, adds tables, updates data, etc.) that needs to be run on, for example, 3 schemas in one region and 5 schemas in another. The application is in AWS and the database is Aurora (RDS) MySQL. A run can take anywhere between 30 minutes and 3 hours per schema.
This SQL script needs to be run in parallel, with a delay of 2 minutes between each schema run.
This is what I have achieved so far:
A file with the DB details, dbdata.yml:
---
conn_details:
  - { host: localhost, user: root, password: "Password1!" }
  - { host: localhost, user: root, password: "Password1!" }
The playbook:
- hosts: localhost
  vars:
    script_file: "{{ path }}"
  vars_files:
    - dbdata.yml
  tasks:
    - name: shell command to execute script in parallel
      shell: |
        sleep 30s
        "mysql -h {{ item.host }} -u {{ item.user }} -p{{ item.password }} < {{ script_file }} >> /usr/local/testscript.log"
      with_items: "{{ conn_details }}"
      register: sql_query_output
      async: 600
      poll: 0
    - name: Wait for sql execution to finish
      async_status:
        jid: "{{ item.ansible_job_id }}"
      register: _jobs
      until: _jobs.finished
      delay: 20 # Check every 20 seconds. Adjust as you like.
      retries: 10
      with_items: "{{ sql_query_output.results }}"
1st part: executes the script in parallel; this also includes a time gap of 30 seconds before each execution.
2nd part: picks the ansible job id from the registered output and checks whether the job has completed or not.
Please note: before including the 30-second sleep, this playbook was working fine.
We get the following erroneous output upon execution:
ansible-playbook parallel_local.yml --extra-vars "path=RDS_script.sql"
[WARNING]: provided hosts list is empty, only localhost is available. Note that the implicit localhost does not match 'all'
PLAY [localhost] **************************************************************************************************************************************************************************************************
TASK [Gathering Facts] ********************************************************************************************************************************************************************************************
ok: [localhost]
TASK [sample command- ansible-playbook my_sqldeploy.yml --extra-vars "path=/home/NICEONDEMAND/bsahu/RDS_create_user1.sql"] ****************************************************************************************
changed: [localhost] => (item={u'host': u'localhost', u'password': u'Password1!', u'user': u'root'})
changed: [localhost] => (item={u'host': u'localhost', u'password': u'Password1!', u'user': u'root'})
TASK [Wait for creation to finish] ********************************************************************************************************************************************************************************
FAILED - RETRYING: Wait for creation to finish (10 retries left).
FAILED - RETRYING: Wait for creation to finish (9 retries left).
failed: [localhost] (item={'ansible_loop_var': u'item', u'ansible_job_id': u'591787538842.77844', 'item': {u'host': u'localhost', u'password': u'Password1!', u'user': u'root'}, u'started': 1, 'changed': True, 'failed': False, u'finished': 0, u'results_file': u'/root/.ansible_async/591787538842.77844'}) => {"ansible_job_id": "591787538842.77844", "ansible_loop_var": "item", "attempts": 3, "changed": true, "cmd": "sleep 30s\n\"mysql -h localhost -u root -pPassword1! < RDS_script.sql >> /usr/local/testscript.log\"\n", "delta": "0:00:30.073191", "end": "2019-11-28 17:01:57.632285", "finished": 1, "item": {"ansible_job_id": "591787538842.77844", "ansible_loop_var": "item", "changed": true, "failed": false, "finished": 0, "item": {"host": "localhost", "password": "Password1!", "user": "root"}, "results_file": "/root/.ansible_async/591787538842.77844", "started": 1}, "msg": "non-zero return code", "rc": 127, "start": "2019-11-28 17:01:27.559094", "stderr": "/bin/sh: line 1: mysql -h localhost -u root -pPassword1! < RDS_script.sql >> /usr/local/testscript.log: No such file or directory", "stderr_lines": ["/bin/sh: line 1: mysql -h localhost -u root -pPassword1! < RDS_script.sql >> /usr/local/testscript.log: No such file or directory"], "stdout": "", "stdout_lines": []}
failed: [localhost] (item={'ansible_loop_var': u'item', u'ansible_job_id': u'999397686792.77873', 'item': {u'host': u'localhost', u'password': u'Password1!', u'user': u'root'}, u'started': 1, 'changed': True, 'failed': False, u'finished': 0, u'results_file': u'/root/.ansible_async/999397686792.77873'}) => {"ansible_job_id": "999397686792.77873", "ansible_loop_var": "item", "attempts": 1, "changed": true, "cmd": "sleep 30s\n\"mysql -h localhost -u root -pPassword1! < RDS_script.sql >> /usr/local/testscript.log\"\n", "delta": "0:00:30.120136", "end": "2019-11-28 17:01:58.694713", "finished": 1, "item": {"ansible_job_id": "999397686792.77873", "ansible_loop_var": "item", "changed": true, "failed": false, "finished": 0, "item": {"host": "localhost", "password": "Password1!", "user": "root"}, "results_file": "/root/.ansible_async/999397686792.77873", "started": 1}, "msg": "non-zero return code", "rc": 127, "start": "2019-11-28 17:01:28.574577", "stderr": "/bin/sh: line 1: mysql -h localhost -u root -pPassword1! < RDS_script.sql >> /usr/local/testscript.log: No such file or directory", "stderr_lines": ["/bin/sh: line 1: mysql -h localhost -u root -pPassword1! < RDS_script.sql >> /usr/local/testscript.log: No such file or directory"], "stdout": "", "stdout_lines": []}
PLAY RECAP ********************************************************************************************************************************************************************************************************
localhost : ok=2 changed=1 unreachable=0 failed=1 skipped=0 rescued=0 ignored=0
Any suggestions on how to overcome this? Thanks in advance for all the help.
My bad. I had a silly mistake which was creating trouble: I needed to remove the surrounding quotes ("") from the line executing the SQL file. The corrected YAML file is reproduced below:
- hosts: localhost
  vars:
    script_file: "{{ path }}"
  vars_files:
    - dbdata.yml
  tasks:
    - name: sample command- ansible-playbook my_sqldeploy.yml --extra-vars "path=/home/NICEONDEMAND/bsahu/RDS_create_user1.sql"
      shell: |
        sleep 30s
        mysql -h {{ item.host }} -u {{ item.user }} -p{{ item.password }} < {{ script_file }} >> /usr/local/testscript.log
      with_items: "{{ conn_details }}"
      register: sql_query_output
      async: 600
      poll: 0
    - name: Wait for creation to finish
      async_status:
        jid: "{{ item.ansible_job_id }}"
      register: _jobs
      until: _jobs.finished
      delay: 20 # Check every 20 seconds. Adjust as you like.
      retries: 10
      with_items: "{{ sql_query_output.results }}"
Thanks all for the help.
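The question originally asked for a 2-minute delay between each schema run, while the fix above keeps a flat 30-second sleep for every item. One possible way to stagger the starts, sketched with loop_control's index_var (the variable name, the 120-second step, and the async value are illustrative assumptions, not part of the answer):
- name: Run the SQL script against each schema, staggered by 2 minutes
  shell: |
    sleep {{ schema_index * 120 }}s
    mysql -h {{ item.host }} -u {{ item.user }} -p{{ item.password }} < {{ script_file }} >> /usr/local/testscript.log
  with_items: "{{ conn_details }}"
  loop_control:
    index_var: schema_index   # 0 for the first schema, 1 for the second, ...
  register: sql_query_output
  async: 14400                # must exceed the longest run plus the stagger
  poll: 0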

Filter Ansible output from lineinfile

I'm attempting to audit my systems via files copied to a single host. The default output is very verbose. I would like to see just the pertinent fields of the Ansible log output; that way, over 1000 hosts, I can zero in on my problems more quickly. When my play is successful, I'd just like to see:
ok: u'/mnt/inventory/hostname999'
I have a playbook that looks like this:
- hosts: 'localhost'
  name: Playbook for the Audit for our infrastructure.
  gather_facts: False
  become: no
  connection: local
  roles:
    - { role: network, tags: network }
My network role's main.yml file looks like this:
---
- name: find matching pattern files
  find:
    paths: "/mnt/inventory"
    patterns: "hostname*"
    file_type: directory
  register: subdirs

- name: check files for net.ipv4.ip_forward = 0
  no_log: False
  lineinfile:
    name: "{{ item.path }}/sysctl.conf"
    line: "net.ipv4.ip_forward = 0"
    state: present
  with_items: "{{ subdirs.files }}"
  register: conf
  check_mode: yes
  failed_when: (conf is changed) or (conf is failed)

- debug:
    msg: "CONF OUTPUT: {{ conf }}"
But I get log output like this:
ok: [localhost] => (item={u'uid': 0, u'woth': False, u'mtime': 1546922126.0,
u'inode': 773404, u'isgid': False, u'size': 4096, u'roth': True, u'isuid':
False, u'isreg': False, u'pw_name': u'root', u'gid': 0, u'ischr': False,
u'wusr': False, u'xoth': True, u'rusr': True, u'nlink': 12, u'issock':
False, u'rgrp': True, u'gr_name': u'root', u'path':
u'/mnt/inventory/hostname999', u'xusr': True, u'atime': 1546930801.0,
u'isdir': True, u'ctime': 1546922126.0, u'wgrp': False, u'xgrp': True,
u'dev': 51, u'isblk': False, u'isfifo': False, u'mode': u'0555', u'islnk':
False})
Furthermore, my debug message of CONF OUTPUT never shows and I have no idea why not.
I have reviewed https://github.com/ansible/ansible/issues/5564 and other articles but they just seem to refer to items like shell commands that send stuff to stdout, which lineinfile does not.
But I get log output like this:
Then you likely want to use loop_control: with a label: "{{ item.path }}" child key:
- lineinfile:
    # as before
  with_items: "{{ subdirs.files }}"
  loop_control:
    label: "{{ item.path }}"
which will get you closer to what you want:
ok: [localhost] => (item=/mnt/inventory/hostname999)
Furthermore, my debug message of CONF OUTPUT never shows and I have no idea why not.
The best guess I have for that one is that maybe the verbosity needs to be adjusted:
- debug:
    msg: "CONF OUTPUT: {{ conf }}"
    verbosity: 0
but it works for me, so maybe there is something else special about your ansible setup. I guess try the verbosity: and see if it helps. Given what you are actually doing with that msg:, you may be much happier with just passing conf directly to debug:
- debug:
    var: conf
since it will render much more nicely because Ansible knows it is a dict, rather than just effectively calling str(conf), which (as you saw above) does not format very well.
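Putting the loop_control label and the registered results together, a possible sketch of the audit tasks that prints only the directories that would change; swapping failed_when for ignore_errors and adding the selectattr filter are my own choices, not part of the answer above:
- name: check files for net.ipv4.ip_forward = 0
  lineinfile:
    name: "{{ item.path }}/sysctl.conf"
    line: "net.ipv4.ip_forward = 0"
    state: present
  check_mode: yes
  register: conf
  ignore_errors: yes                  # keep auditing the remaining directories
  with_items: "{{ subdirs.files }}"
  loop_control:
    label: "{{ item.path }}"          # one short line per item instead of the full dict

- name: report only the directories that would change
  debug:
    msg: "NEEDS ATTENTION: {{ item.item.path }}"
  loop: "{{ conf.results | selectattr('changed') | list }}"
  loop_control:
    label: "{{ item.item.path }}"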

Ansible async task collecting results: could not find job

I'm trying to fire-and-forget some tasks and collect results after that. Here's my playbook:
---
- hosts: node2
  gather_facts: yes
  tasks:
    - name: 'Some long script no 1 on node2'
      shell: "time sleep $[ ( $RANDOM % 20 ) + 20 ]s"
      async: 40
      poll: 0
      register: script1
    - name: 'Another long script no 2 on node2'
      shell: "time sleep $[ ( $RANDOM % 20 ) + 20 ]s"
      async: 40
      poll: 0
      register: script2

- hosts: node2
  tasks:
    - name: "Collect results"
      async_status:
        jid: loop_item.ansible_job_id
      loop:
        - script1
        - script2
      loop_control:
        loop_var: loop_item
      register: async_poll_results
      until: async_poll_results.finished
      retries: 30
When I run it, I receive the following error:
PLAY [node2] ****************************************************************************************************************************
TASK [Gathering Facts] ******************************************************************************************************************
ok: [hostname]
TASK [Some long script no 1 on node2] ***************************************************************************************************
changed: [hostname] => {"ansible_job_id": "814448842231.125544", "changed": true, "finished": 0, "results_file": "/home/external.kamil.sacharczuk/.ansible_async/814448842231.125544", "started": 1}
TASK [Another long script no 2 on node2] ************************************************************************************************
changed: [hostname] => {"ansible_job_id": "586999441005.125616", "changed": true, "finished": 0, "results_file": "/home/external.kamil.sacharczuk/.ansible_async/586999441005.125616", "started": 1}
PLAY [node2] ****************************************************************************************************************************
TASK [Gathering Facts] ******************************************************************************************************************
ok: [hostname]
TASK [Collect results] ******************************************************************************************************************
failed: [hostname] (item=script1) => {"ansible_job_id": "loop_item.ansible_job_id", "attempts": 1, "changed": false, "finished": 1, "loop_item": "script1", "msg": "could not find job", "started": 1}
failed: [hostname] (item=script2) => {"ansible_job_id": "loop_item.ansible_job_id", "attempts": 1, "changed": false, "finished": 1, "loop_item": "script2", "msg": "could not find job", "started": 1}
to retry, use: --limit #xxxxxxxxx
PLAY RECAP ******************************************************************************************************************************
hostname : ok=4 changed=2 unreachable=0 failed=1
I don't really know why I receive this "could not find job". I tried to run the "collect" task locally first; then I figured out that the job results are stored on node2, so I ran it there. I tried with and without gathering facts. I also tried using
hostvars['hostname'][loop_item][ansible_job_id]
but this gave me the same error as above.
Any help would be much appreciated!
PS. I am running ansible 2.6.1
If someone got here searching for this error message, there is another reason for 'could not find job': if the job was run with become, so should the async_status be. If you try async_status for a become job without adding become to the async_status task, it will fail with this message.
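A minimal sketch of that become case (the command and task names are illustrative):
- name: Start a long-running job with elevated privileges
  shell: "sleep 30 && touch /root/marker"
  become: yes
  async: 120
  poll: 0
  register: long_job

- name: Poll the job status
  async_status:
    jid: "{{ long_job.ansible_job_id }}"
  become: yes        # without this the status file is looked up for the wrong user
  register: job_result
  until: job_result.finished
  retries: 30
  delay: 5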
Please try it as below, with proper quoting:
- hosts: localhost
  gather_facts: false
  tasks:
    - name: 'Some long script no 1 on node2'
      shell: "time sleep $[ ( $RANDOM % 20 ) + 20 ]s"
      async: 40
      poll: 0
      register: script1
    - name: 'Another long script no 2 on node2'
      shell: "time sleep $[ ( $RANDOM % 20 ) + 20 ]s"
      async: 40
      poll: 0
      register: script2

- hosts: localhost
  tasks:
    - name: "Collect results"
      async_status:
        jid: "{{ loop_item.ansible_job_id }}"
      loop:
        - "{{ script1 }}"
        - "{{ script2 }}"
      loop_control:
        loop_var: loop_item
      register: async_poll_results
      until: async_poll_results.finished
      retries: 30

Ansible, loop, register, and stdout

I have a playbook that looks like this:
- hosts: host1
  gather_facts: false
  tasks:
    - name: "Loop"
      command: "echo {{ item }}"
      with_items: [ 0, 2, 4, 6, 8, 10 ]
      register: hello
    - debug: "msg={{ hello.results }}"
Everything works correctly, and the output is returned, but there is tons and tons of output. It turns out that this:
- debug: "msg={{ hello.results.1.stdout }}"
does exactly what I want -- just grab the stdout from the command -- but only for one of the six times through the loop.
What I really want/need to do is this:
- debug: "msg={{ hello.results.*.stdout }}"
where it goes into the hello structure, accesses the results entry, goes to each member of that array, and pulls out the stdout value.
Is this possible?
UPDATE
- hosts: host1
  gather_facts: false
  tasks:
    - name: "Loop"
      command: "echo {{ item }}"
      with_items: [ 0, 2, 4, 6, 8, 10 ]
      register: hello
    - debug:
        msg: "{{item.stdout}}"
      with_items: "{{hello.results}}"
is no less verbose than my original example.
TASK [debug] *******************************************************************
ok: [host1] => (item={'_ansible_parsed': True, 'stderr_lines': [], u'cmd': [u'echo', u'0'], u'end': u'2018-01-02 20:53:08.916774', '_ansible_no_log': False, u'stdout': u'0', '_ansible_item_result': True, u'changed': True, 'item': 0, u'delta': u'0:00:00.002137', u'stderr': u'', u'rc': 0, u'invocation': {u'module_args': {u'warn': True, u'executable': None, u'_uses_shell': False, u'_raw_params': u'echo 0', u'removes': None, u'creates': None, u'chdir': None, u'stdin': None}}, 'stdout_lines': [u'0'], u'start': u'2018-01-02 20:53:08.914637', 'failed': False}) => {
"item": {
"changed": true,
"cmd": [
"echo",
"0"
],
"delta": "0:00:00.002137",
"end": "2018-01-02 20:53:08.916774",
"failed": false,
"invocation": {
"module_args": {
"_raw_params": "echo 0",
"_uses_shell": false,
"chdir": null,
"creates": null,
"executable": null,
"removes": null,
"stdin": null,
"warn": true
}
},
"item": 0,
"rc": 0,
"start": "2018-01-02 20:53:08.914637",
"stderr": "",
"stderr_lines": [],
"stdout": "0",
"stdout_lines": [
"0"
]
},
"msg": "0"
}
I get 6 copies of the above construct.
It feels like I'm close but I'm still doing something wrong. I see "msg": "0" at the bottom, which is what I want. I just don't want the rest of it.
Solution:
- debug: "msg={{ hello.results | map(attribute='stdout') | join('\n') }}"
Remark:
By default, Ansible will print visible \n two-character sequences instead of wrapping the lines, so either use a callback plugin for human-readable output (example) or verify the method with:
- copy:
    content: "{{ hello.results | map(attribute='stdout') | join('\n') }}"
    dest: ./result.txt
and check the contents of the result.txt.
I have used the keyword loop to get stdout from all iterations of the previous loop:
loop: "{{ hello | json_query('results[*].stdout') }}"
I find json_query easiest to use in such register-loop situations. Official documentation can be found here ==> json-query-filter
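For context, a minimal sketch of that loop inside a full debug task, reusing the hello register from the question; note the json_query filter needs the jmespath Python library installed on the controller:
- debug:
    msg: "{{ item }}"
  loop: "{{ hello | json_query('results[*].stdout') }}"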
Sure. The ansible website has documentation that explains how to use register in a loop. You just need to iterate over the hello.results array, as in:
- debug:
    msg: "{{item.stdout}}"
  with_items: "{{hello.results}}"
What about:
- debug: "msg={{ item.stdout }}"
with_items: "{{ hello.results }}"
I think this construct works well enough for my needs.
- hosts: localhost
  gather_facts: false
  vars:
    stuff: [ 0,2,4,6,8,10 ]
  tasks:
    - name: "Loop"
      command: "echo {{ item }}"
      with_items: "{{ stuff }}"
      register: hello
    - debug: "var=hello.results.{{item}}.stdout"
      with_sequence: "0-{{stuff|length - 1}}"
I was looking at a similar problem and was confused by getting lots of output when I was expecting a relatively small msg or var from debug:. It turns out most of that output was the 'label' with which Ansible prefixes each of those small outputs. It being a few years after this question was originally asked, I've been using loop rather than with_items; loop also has a label: option in loop_control:. So, in my case, for a similar problem - getting any /etc/passwd entries for users 'alice' or 'bob':
- hosts: all
  gather_facts: false
  serial: 1 # output easier to read when grouped by host
  tasks:
    - name: Look for users in /etc/passwd
      command: grep {{ item }} /etc/passwd
      register: res
      ignore_errors: true
      loop:
        - alice
        - bob
    - debug:
        msg: "{{ item.stdout_lines }}"
      when: not item.failed
      loop: "{{ res.results }}"
      loop_control:
        label: "{{ item.item }}"
