Ansible - how are tasks executed on the host machine? Manually executing a failed task on the host to see the output

One of my Ansible tasks is failing without an error trace. The only message is that there was a non-zero return code (caused by set -e?) when running the Bitbucket DIY backup script. This was working earlier but started failing after some changes (enabling IMDSv2) on the EC2 Bitbucket server.
It was recommended that I manually run the task on the host to see if I can get some output, but I find myself needing to recreate the entire directory of shell scripts on the host machine to do that.
So I have two questions:
Is there a better way to approach this?
How can I run the task on the host machine from the point of failure? Some tasks before it were successful; does Ansible create a directory on the host for running tasks, so that I could continue from there?
Sorry, I am really new to Ansible; I tried going through the docs but couldn't find a way to debug this properly.
The debug log from Ansible doesn't give any helpful output:
<ip-addr> (1, b'\\r\\n{"changed": true, "stdout": "", "stderr": "", "rc": 22, "cmd": ["/apps/bitbucket/atlassian-bitbucket-diy-backup/bitbucket.diy-backup.sh"], "start": "2023-02-16 22:31:18.187412", "end": "2023-02-16 22:31:18.280648", "delta": "0:00:00.093236", "failed": true, "msg": "non-zero return code", "invocation": {"module_args": {"_raw_params": "/apps/bitbucket/atlassian-bitbucket-diy-backup/bitbucket.diy-backup.sh", "_uses_shell": false, "warn": false, "stdin_add_newline": true, "strip_empty_ends": true, "argv": null, "chdir": null, "executable": null, "creates": null, "removes": null, "stdin": null}}}\\r\\n', b'Shared connection to ip-addr closed.\\r\\n')
<ip-addr> Failed to connect to the host via ssh: Shared connection to ip-addr closed.
<ip-addr> ESTABLISH SSH CONNECTION FOR USER: ubuntu
<ip-addr> SSH: EXEC ssh -C -o ControlMaster=auto -o ControlPersist=60s -o StrictHostKeyChecking=no -o KbdInteractiveAuthentication=no -o PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey -o PasswordAuthentication=no -o 'User="ubuntu"' -o ConnectTimeout=10 -o 'ControlPath="/runner/cp/72bdb94d8e"' ip-addr '/bin/sh -c '"'"'rm -f -r /home/ubuntu/.ansible/tmp/ansible-tmp-1676547077.8146338-376-91399646558620/ > /dev/null 2>&1 && sleep 0'"'"''
<ip-addr> (0, b'', b'')
fatal: [ip-addr]: FAILED! => {
"changed": true,
"cmd": [
"/apps/bitbucket/atlassian-bitbucket-diy-backup/bitbucket.diy-backup.sh"
],
"delta": "0:00:00.093236",
"end": "2023-02-16 22:31:18.280648",
"invocation": {
"module_args": {
"_raw_params": "/apps/bitbucket/atlassian-bitbucket-diy-backup/bitbucket.diy-backup.sh",
"_uses_shell": false,
"argv": null,
"chdir": null,
"creates": null,
"executable": null,
"removes": null,
"stdin": null,
"stdin_add_newline": true,
"strip_empty_ends": true,
"warn": false
}
},
"msg": "non-zero return code",
"rc": 22,
"start": "2023-02-16 22:31:18.187412",
"stderr": "",
"stderr_lines": [],
"stdout": "",
"stdout_lines": []
}
PLAY RECAP *********************************************************************
ip-addr : ok=13 changed=2 unreachable=0 failed=1 skipped=0 rescued=0 ignored=0
So I tried manually running the script on the host machine, but I'm not sure I did it right. I expected an error trace from the Bitbucket DIY backup script, so I added some echo statements to see how far it gets, and it doesn't output anything after this point (it should ideally print "debug"):
set -e
echo "start" # manually added
SCRIPT_DIR=$(dirname "$0")
source "${SCRIPT_DIR}/utils.sh"
echo "debug" #manually added

Related

sshfs with ansible does not give the same result as running it manually on the host

I am working on backups for my server, using sshfs. When backing up a folder, the backup server asks for a password. This is what my task (handler) looks like:
- name: Mount backup folder
  become: yes
  expect:
    command: "sshfs -o allow_other,default_permissions {{ backup_server.user }}#{{ backup_server.host }}:/ /mnt/backup"
    echo: yes
    responses:
      (.*)password(.*): "{{ backup_server.password }}"
      (.*)Are you sure you want to continue(.*): "yes"
  listen: mount-backup-folder
It runs and produces this output:
changed: [prod1.com] => {
"changed": true,
"cmd": "sshfs -o allow_other,default_permissions user#hostname.com:/ /mnt/backup",
"delta": "0:00:00.455753",
"end": "2021-01-26 14:57:34.482440",
"invocation": {
"module_args": {
"chdir": null,
"command": "sshfs -o allow_other,default_permissions user#hostname.com:/ /mnt/backup",
"creates": null,
"echo": true,
"removes": null,
"responses": {
"(.*)Are you sure you want to continue(.*)": "yes",
"(.*)password(.*)": "password"
},
"timeout": 30
}
},
"rc": 0,
"start": "2021-01-26 14:57:34.026687",
"stdout": "user#hostname.com's password: ",
"stdout_lines": [
"user#hostname.com's password: "
]
}
But when I go to the server the folder is not synced with the backup server. BUT when I run the command manually:
sshfs -o allow_other,default_permissions user#hostname.com:/ /mnt/backup
The backup DOES work. Does anybody know how this is possible?
I suspect sshfs was killed by SIGHUP. I know nothing about Ansible, so I don't know whether it has an official way to ignore SIGHUP. As a workaround you can write it like this:
expect:
  command: bash -c "trap '' HUP; sshfs -o ..."
I installed sshfs and verified this bash -c "trap ..." workaround with Expect (spawn -ignore HUP ...) and sexpect (spawn -nohup ...). I believe it'd also work with Ansible (seems like its expect module uses Python's pexpect).
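For reference, a sketch of that workaround applied to the handler from the question (same options and variables as above, untested; pexpect presumably starts the command without a shell, hence the explicit bash -c wrapper):

- name: Mount backup folder
  become: yes
  expect:
    # ignore SIGHUP before starting sshfs so the mount survives the session ending
    command: bash -c "trap '' HUP; sshfs -o allow_other,default_permissions {{ backup_server.user }}#{{ backup_server.host }}:/ /mnt/backup"
    echo: yes
    responses:
      (.*)password(.*): "{{ backup_server.password }}"
      (.*)Are you sure you want to continue(.*): "yes"
  listen: mount-backup-folder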

Using Ansible nxos_config, I want to delete a file from bootflash: and send a 'y', but I'm getting an error

I wonder if you can assist please?
I'm writing a simple Ansible play to delete a file from a Nexus 3K's bootflash.
When I issue the command locally:
N3K# del bootflash:1.txt
Do you want to delete "/1.txt" ? (yes/no/abort) [y]
therefore I'm sending a 'y' in the playbook:
---
- name: Upgrading Nexus
  connection: network_cli
  hosts: n3k
  vars:
    to_delete: '*.txt'
  tasks:
    - name: delete a file
      nxos_config:
        commands:
          - del bootflash:1.txt
          - echo 'y'
When I run this play, I get:
TASK [delete a file] **********************************************************************************************************************************************************************
task path: /etc/ansible/brrrr.yml:8
<el-cagcc00-01mnl03> ESTABLISH LOCAL CONNECTION FOR USER: root
<el-cagcc00-01mnl03> EXEC /bin/sh -c '( umask 77 && mkdir -p "` echo /root/.ansible/tmp/ansible-local-25838fJQR1e/ansible-tmp-1573503060.87-17981903721769 `" && echo ansible-tmp-1573503060.87-17981903721769="` echo /root/.ansible/tmp/ansible-local-25838fJQR1e/ansible-tmp-1573503060.87-17981903721769 `" ) && sleep 0'
Using module file /usr/lib/python2.7/site-packages/ansible/modules/network/nxos/nxos_config.py
<el-cagcc00-01mnl03> PUT /root/.ansible/tmp/ansible-local-25838fJQR1e/tmpxuqVCz TO /root/.ansible/tmp/ansible-local-25838fJQR1e/ansible-tmp-1573503060.87-17981903721769/AnsiballZ_nxos_config.py
<el-cagcc00-01mnl03> EXEC /bin/sh -c 'chmod u+x /root/.ansible/tmp/ansible-local-25838fJQR1e/ansible-tmp-1573503060.87-17981903721769/ /root/.ansible/tmp/ansible-local-25838fJQR1e/ansible-tmp-1573503060.87-17981903721769/AnsiballZ_nxos_config.py && sleep 0'
<el-cagcc00-01mnl03> EXEC /bin/sh -c '/usr/bin/python /root/.ansible/tmp/ansible-local-25838fJQR1e/ansible-tmp-1573503060.87-17981903721769/AnsiballZ_nxos_config.py && sleep 0'
<el-cagcc00-01mnl03> EXEC /bin/sh -c 'rm -f -r /root/.ansible/tmp/ansible-local-25838fJQR1e/ansible-tmp-1573503060.87-17981903721769/ > /dev/null 2>&1 && sleep 0'
The full traceback is:
WARNING: The below traceback may *not* be related to the actual failure.
File "/tmp/ansible_nxos_config_payload_i5IR9V/ansible_nxos_config_payload.zip/ansible/module_utils/network/nxos/nxos.py", line 187, in load_config
resp = connection.edit_config(config, replace=replace)
File "/tmp/ansible_nxos_config_payload_i5IR9V/ansible_nxos_config_payload.zip/ansible/module_utils/connection.py", line 186, in __rpc__
raise ConnectionError(to_text(msg, errors='surrogate_then_replace'), code=code)
fatal: [el-cagcc00-01mnl03]: FAILED! => {
"changed": false,
"invocation": {
"module_args": {
"after": null,
"auth_pass": null,
"authorize": null,
"backup": false,
"backup_options": null,
"before": null,
"commands": [
"del bootflash:1.txt",
"echo 'y'"
],
"defaults": false,
"diff_against": null,
"diff_ignore_lines": null,
"host": null,
"intended_config": null,
"lines": [
"del bootflash:1.txt",
"echo 'y'"
],
"match": "line",
"parents": null,
"password": null,
"port": null,
"provider": null,
"replace": "line",
"replace_src": null,
"running_config": null,
"save_when": "never",
"src": null,
"ssh_keyfile": null,
"timeout": null,
"transport": null,
"use_ssl": null,
"username": null,
"validate_certs": null
}
},
"msg": "timeout value 30 seconds reached while trying to send command: del bootflash:1.txt"
}
PLAY RECAP ********************************************************************************************************************************************************************************
el-cagcc00-01mnl03 : ok=1 changed=0 unreachable=0 failed=1 skipped=0 rescued=0 ignored=0
Please advise
Thanx in advance
Try this
tasks:
  - name: delete a file
    nxos_command:
      commands:
        - delete bootflash:///1.txt no prompt
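If that no prompt form isn't accepted on your image, nxos_command also lets a command be given as a dict with command, prompt and answer keys, which should match the confirmation shown in the question; a sketch, not verified on a 3K:

tasks:
  - name: delete a file, answering the confirmation prompt
    nxos_command:
      commands:
        - command: del bootflash:1.txt
          # regex matched against the device prompt shown above
          prompt: 'yes/no/abort'
          answer: 'y'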

Print out the task on which a host has failed at the end of a playbook?

I am running the deployment concurrently on a number of hosts. As can be expected, the output moves quickly during the run and it is hard to track at what state each task ends. When I get to the end of the playbook I can see which hosts have failed, which is great. However, I need to scroll through pages and pages of output to find out on which task a certain host failed.
Is there a way to have a print out at the end of the playbook saying for example:
"Host 1 failed on Task 1/cmd"
Don't know if this fits your issue exactly, but you can help yourself with a little exception handling like this:
---
- hosts: localhost
  any_errors_fatal: true
  tasks:
    - block:
        - name: "Fail a command"
          shell: |
            rm someNonExistentFile
      rescue:
        - debug:
            msg: "{{ ansible_failed_result }}"
        #- fail:
        #    msg: "Playbook run failed. Aborting..."
        # uncomment the fail section to actually fail the deployment after writing the error message
The variable ansible_failed_result contains something like this:
TASK [debug] ************************************************************************************************************************************************************************************************
ok: [localhost] => {
"msg": {
"changed": true,
"cmd": "rm someNonExistentFile\n",
"delta": "0:00:00.036509",
"end": "2019-10-31 12:06:09.579806",
"failed": true,
"invocation": {
"module_args": {
"_raw_params": "rm someNonExistentFile\n",
"_uses_shell": true,
"argv": null,
"chdir": null,
"creates": null,
"executable": null,
"removes": null,
"stdin": null,
"stdin_add_newline": true,
"strip_empty_ends": true,
"warn": true
}
},
"msg": "non-zero return code",
"rc": 1,
"start": "2019-10-31 12:06:09.543297",
"stderr": "rm: cannot remove ‘someNonExistentFile’: No such file or directory",
"stderr_lines": [
"rm: cannot remove ‘someNonExistentFile’: No such file or directory"
],
"stdout": "",
"stdout_lines": [],
"warnings": [
"Consider using the file module with state=absent rather than running 'rm'. If you need to use command because file is insufficient you can add 'warn: false' to this command task or set 'command_warnings=False' in ansible.cfg to get rid of this message."
]
}
}
I mostly use stderr when applicable. Otherwise I use "{{ ansible_failed_result | to_nice_json }}".
hth
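To get a message in exactly the "Host 1 failed on Task 1/cmd" shape asked for, the rescue section can also use ansible_failed_task, which holds the task object that triggered the rescue; a small variation of the debug above (a sketch):

rescue:
  - debug:
      # prints host, failing task name and the failure message
      msg: "{{ inventory_hostname }} failed on task '{{ ansible_failed_task.name }}': {{ ansible_failed_result.msg | default('') }}"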

sshfs is not mounting the directory using ansible playbook

I wrote an sshfs Ansible playbook to mount a remote directory from a server.
When I execute the same command manually in the shell it works (the remote directory contents are visible). But when I try it with the Ansible playbook, the remote directory is not mounted as expected.
user_allow_other -> added this line to /etc/fuse.conf
Added the lines below to /etc/ssh/ssh_config:
SendEnv LANG LC_*
HashKnownHosts yes
GSSAPIAuthentication yes
GSSAPIDelegateCredentials no
StrictHostKeyChecking no
Even without these additions, running the command manually works.
But the Ansible playbook does not mount the remote directory, although the playbook run is reported as successful.
fuse.yml:
---
- hosts: server
  become: yes
  tasks:
    - name: Mount Media Directory
      shell: echo root123 | sshfs -o password_stdin,reconnect,nonempty,allow_other,idmap=user stack#10.1.1.1:/home/stack /mnt/server1
root#stack-VirtualBox:~/playbook# ansible-playbook fusessh.yml -vvv
<10.1.1.1> ESTABLISH SSH CONNECTION FOR USER: stack
<10.1.1.1> SSH: EXEC sshpass -d10 ssh -C -o ControlMaster=auto -o ControlPersist=60s -o StrictHostKeyChecking=no -o 'IdentityFile="/home/stack/.ssh/id_rsa"' -o User=stack -o ConnectTimeout=10 -o ControlPath=/root/.ansible/cp/88bbb1646b 10.1.1.1 '/bin/sh -c '"'"'rm -f -r /tmp/stack/ansible/ansible-tmp-1568065019.557693-124891815649027/ > /dev/null 2>&1 && sleep 0'"'"''
<10.1.1.1> (0, b'', b'')
changed: [10.1.1.1] => {
"changed": true,
"cmd": "echo root123 | sshfs -o password_stdin,reconnect,nonempty,allow_other,idmap=user stack#10.1.1.1:/home/stack /mnt/server1",
"delta": "0:00:00.548757",
"end": "2019-09-09 15:37:00.579023",
"invocation": {
"module_args": {
"_raw_params": "echo root123 | sshfs -o password_stdin,reconnect,nonempty,allow_other,idmap=user stack#10.1.1.1:/home/stack /mnt/server1",
"_uses_shell": true,
"argv": null,
"chdir": null,
"creates": null,
"executable": null,
"removes": null,
"stdin": null,
"warn": true
}
},
"rc": 0,
"start": "2019-09-09 15:37:00.030266",
"stderr": "",
"stderr_lines": [],
"stdout": "",
"stdout_lines": []
}
META: ran handlers
META: ran handlers
PLAY RECAP ************************************************************************************************
10.1.1.1 : ok=2 changed=1 unreachable=0 failed=0
The sshfs action is performed on the remote host instead of locally. The reason it works when run manually is that the sshfs mount is performed in your local shell, not on the remote server. I modified your playbook by adding local_action; I have tested it and it works fine.
---
- hosts: server
  become: yes
  tasks:
    - name: Mount Media Directory
      local_action: shell echo root123 | sshfs -o password_stdin,reconnect,nonempty,allow_other,idmap=user stack#10.1.1.1:/home/stack /mnt/server1
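As a side note, local_action is shorthand for delegating the task to the control node; the same play can be written with delegate_to, which reads a bit more explicitly (same command as above, just restructured):

---
- hosts: server
  become: yes
  tasks:
    - name: Mount Media Directory
      # run the sshfs mount on the control node instead of the managed host
      shell: echo root123 | sshfs -o password_stdin,reconnect,nonempty,allow_other,idmap=user stack#10.1.1.1:/home/stack /mnt/server1
      delegate_to: localhost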

Getting Memory Error in Ansible when using command task in playbook

- hosts: all
  ignore_errors: yes
  tasks:
    - name: Install BKUP
      command: yes | var/tocopy/Client/install
error message:
Traceback (most recent call last):
  File "/tmp/ansible_HXcBpN/ansible_modlib.zip/ansible/module_utils/basic.py", line 2817, in run_command
    stdout += self._read_from_pipes(rpipes, rfds, cmd.stdout)
MemoryError
fatal: []: FAILED! => {
"changed": false,
"cmd": "yes '|' var/tocopy/Client/install",
"invocation": {
"module_args": {
"_raw_params": "yes | var/tocopy/Client/install",
"_uses_shell": false,
"chdir": null,
"creates": null,
"executable": null,
"removes": null,
"stdin": null,
"warn": true
}
},
"msg": "",
"rc": 257 } ...ignoring META: ran handlers META: ran handlers
The task
command: yes | var/tocopy/Client/install
never terminates, because yes, as its man page states,
yes - output a string repeatedly until killed
never gets killed. Note that the command module does not go through a shell, so the pipe is passed to yes as a literal argument (the log shows the actual command as yes '|' var/tocopy/Client/install) and your install script never even starts. The memory error is a follow-on error, because the endless output is buffered and eats up all your memory.
So use a command that terminates, and run it through the shell module so the pipe is actually interpreted:
shell: echo y | var/tocopy/Client/install
If you need to feed the string y to your script, it is better to use the expect module.
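For completeness, a minimal sketch of that expect-based variant; the prompt regex is an assumption, since the question doesn't show what the installer actually asks (the expect module also requires pexpect on the target):

- name: Install BKUP
  expect:
    command: var/tocopy/Client/install
    responses:
      # hypothetical prompt pattern - replace with the installer's real question
      '(.*)\[y/n\](.*)': "y"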
