Tracing to see where Ansible hangs - ansible

My Ansible task hangs. I use -vvvv, but I still can't see any useful information.
<coffee-and-sugar.club> ESTABLISH SSH CONNECTION FOR USER: root
<coffee-and-sugar.club> SSH: EXEC ssh -vvv -C -o ControlMaster=auto -o ControlPersist=60s -o KbdInteractiveAuthentication=no -o PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey -o PasswordAuthentication=no -o 'User="root"' -o ConnectTimeout=10 -o ControlPath=/home/guettli/.ansible/cp/544631aae4 -tt coffee-and-sugar.club '/bin/sh -c '"'"'/usr/bin/python3 /root/.ansible/tmp/ansible-tmp-1608394831.3465264-205483640933119/AnsiballZ_pip.py && sleep 0'"'"''
What can I do to see what is going on?
Is there a way to enable tracing (like set -x in a shell script)?

You can execute the Python script on the remote server by hand. In my case this revealed the root cause.
Example:
ssh root@remote
# /usr/bin/python3 /root/.ansible/tmp/ansible-tmp-1608394831.3465264-205483640933119/AnsiballZ_pip.py
The authenticity of host 'github.com (140.82.121.3)' can't be established.
RSA key fingerprint is SHA256:nThbg6kXUpJWGl7E1IGOCspRomTxdCARLviKw6E5SY8.
Are you sure you want to continue connecting (yes/no/[fingerprint])?
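The manual run above can also be wrapped in a time limit, so that a stalled module gives back control along with whatever it printed so far. This is a minimal sketch and not part of Ansible: run_with_timeout is a hypothetical helper name, and the 30 second default is an arbitrary assumption. Programs that write their prompts directly to the terminal bypass the capture, so an empty result does not prove that nothing was asked.

```python
import subprocess
import sys


def run_with_timeout(cmd, timeout_s=30):
    """Run cmd (hypothetical helper) and return (status, captured_output).

    status is "timeout" if the command did not finish within timeout_s,
    otherwise "exit <returncode>".
    """
    try:
        proc = subprocess.run(
            cmd,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,  # merge stderr so prompts on stderr are kept
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired as exc:
        # Partial output may be None, bytes, or str depending on the platform.
        partial = exc.output or b""
        if isinstance(partial, bytes):
            partial = partial.decode(errors="replace")
        return "timeout", partial
    return f"exit {proc.returncode}", proc.stdout.decode(errors="replace")
```

On the remote host it could be called with the interpreter and the AnsiballZ script path taken from the -vvvv output; a "timeout" status together with the tail of the captured text points at where the module is waiting.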

Related

How do I fix the Broken pipe error during an Ansible play

I am getting the error below while running my Ansible play. It was working perfectly until a couple of days ago, and the error suddenly started happening for this particular host. I don't know whether some configuration changed on this server. Any idea what could be wrong?
The same play works fine for other environments, such as Prod.
Command
ansible-playbook -i my-inventory my-main.yml --tags=copyRepo -e my_release_version=5.0.0-4 -e target_env=preprod --ask-become-pass
I am able to ssh as well
server1 | success >> {
"changed": false,
"ping": "pong"
}
Error
<server1> ESTABLISH CONNECTION FOR USER: user1
<server1> REMOTE_MODULE file state=directory path=/opt/tomcat/releases/Release5.0.0-4/advancederrorsearch/app
<server1> EXEC ssh -C -tt -vvv -o ControlMaster=auto -o ControlPersist=60s -o ControlPath="/home/user1/.ansible/cp/ansible-ssh-%h-%p-%r" -o KbdInteractiveAuthentication=no -o PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey -o PasswordAuthentication=no -o ConnectTimeout=10 server1 /bin/sh -c 'mkdir -p /tmp/ansible-tmp-1586344775.71-121508053477718 && chmod a+rx /tmp/ansible-tmp-1586344775.71-121508053477718 && echo /tmp/ansible-tmp-1586344775.71-121508053477718'
<server1> PUT /tmp/tmp1enuT2 TO /tmp/ansible-tmp-1586344775.71-121508053477718/file
<server1> EXEC ssh -C -tt -vvv -o ControlMaster=auto -o ControlPersist=60s -o ControlPath="/home/user1/.ansible/cp/ansible-ssh-%h-%p-%r" -o KbdInteractiveAuthentication=no -o PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey -o PasswordAuthentication=no -o ConnectTimeout=10 server1 /bin/sh -c 'chmod a+r /tmp/ansible-tmp-1586344775.71-121508053477718/file'
<server1> EXEC ssh -C -tt -vvv -o ControlMaster=auto -o ControlPersist=60s -o ControlPath="/home/user1/.ansible/cp/ansible-ssh-%h-%p-%r" -o KbdInteractiveAuthentication=no -o PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey -o PasswordAuthentication=no -o ConnectTimeout=10 server1 /bin/sh -c 'sudo -k && sudo -H -S -p "[sudo via ansible, key=othmymdswpmqvimfnuimdtsuqtdboprm] password: " -u tomcat /bin/sh -c '"'"'echo BECOME-SUCCESS-othmymdswpmqvimfnuimdtsuqtdboprm; LANG=en_US.UTF-8 LC_CTYPE=en_US.UTF-8 /usr/bin/python /tmp/ansible-tmp-1586344775.71-121508053477718/file'"'"''
<server1> EXEC ssh -C -tt -vvv -o ControlMaster=auto -o ControlPersist=60s -o ControlPath="/home/user1/.ansible/cp/ansible-ssh-%h-%p-%r" -o KbdInteractiveAuthentication=no -o PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey -o PasswordAuthentication=no -o ConnectTimeout=10 server1 /bin/sh -c 'rm -rf /tmp/ansible-tmp-1586344775.71-121508053477718/ >/dev/null 2>&1'
failed: [server1] => (item=advancederrorsearch) => {"failed": true, "item": "advancederrorsearch", "parsed": false}
BECOME-SUCCESS-othmymdswpmqvimfnuimdtsuqtdboprm
couldn't set locale correctly
couldn't set locale correctly
debug1: mux_client_request_session: master session id: 2
debug3: mux_client_read_packet: read header failed: Broken pipe
debug2: Received exit status from master 0
Shared connection to server1 closed.

Ansible hangs on action in handler, but works fine with action in task (reloading pf)

I'm attempting to reload pf as part of a role to provision a FreeBSD server after copying a new pf.conf to the system. When I do this step independently as a task in its own playbook, it works flawlessly. However, when I have exactly the same action as a handler, Ansible always hangs during the execution of that handler.
The play that succeeds:
- hosts: tag_Name_web  # all ec2 instances tagged with web
  gather_facts: True
  vars:
    ansible_python_interpreter: /usr/local/bin/python2.7
    ansible_become_pass: xxx
  tasks:
    - name: copy pf.conf
      copy:
        src: pf.template
        dest: /etc/pf.conf
      become: yes
      become_method: su
    - name: reload pf
      shell: /sbin/pfctl -f /etc/pf.conf
      become: yes
      become_method: su
    - name: echo
      shell: echo "test"
      become: yes
      become_method: su
(I included the echo as a test, as I thought it might be succeeding because the reload was the last thing the play was doing, but it works fine).
The handlers file, in which the pf reload fails, is:
# handlers file for jail_host
- name: Start iocage
  command: service iocage start
- name: Reload sshd
  service: name=sshd state=reloaded
- name: Reload pf
  shell: "/sbin/pfctl -f /etc/pf.conf"
The handler definitely gets called, and it starts to work, and then it just hangs. (When I run pfctl -sa on the system, it shows me that the new pf.conf was actually reloaded. So it's working, it's just never returning and therefore making the rest of the ansible run not happen).
Below is the debug output of the handler running, but I don't see any errors that I can make sense of. There is no timeout as far as I can tell; I've let it run for 30 minutes before I Ctrl-C.
RUNNING HANDLER [JoergFiedler.freebsd-jail-host : Reload pf] *******************
Using module file /usr/local/lib/python2.7/site-packages/ansible/modules/core/commands/command.py
<54.244.77.100> ESTABLISH SSH CONNECTION FOR USER: ec2-user
<54.244.77.100> SSH: EXEC ssh -C -o ControlMaster=auto -o ControlPersist=60s -o StrictHostKeyChecking=no -o 'IdentityFile="/usr/local/etc/ansible/xxx_aws.pem"' -o KbdInteractiveAuthentication=no -o PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey -o PasswordAuthentication=no -o User=ec2-user -o ConnectTimeout=10 -o ControlPath=/root/.ansible/cp/ansible-ssh-%h-%p-%r 54.244.77.100 '/bin/sh -c '"'"'( umask 77 && mkdir -p "` echo ~/.ansible/tmp/ansible-tmp-1487698172.0-93173364920700 `" && echo ansible-tmp-1487698172.0-93173364920700="` echo ~/.ansible/tmp/ansible-tmp-1487698172.0-93173364920700 `" ) && sleep 0'"'"''
<54.244.77.100> PUT /tmp/tmpBrFVdu TO /home/ec2-user/.ansible/tmp/ansible-tmp-1487698172.0-93173364920700/command.py
<54.244.77.100> SSH: EXEC sftp -b - -C -o ControlMaster=auto -o ControlPersist=60s -o StrictHostKeyChecking=no -o 'IdentityFile="/usr/local/etc/ansible/xxx_aws.pem"' -o KbdInteractiveAuthentication=no -o PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey -o PasswordAuthentication=no -o User=ec2-user -o ConnectTimeout=10 -o ControlPath=/root/.ansible/cp/ansible-ssh-%h-%p-%r '[54.244.77.100]'
<54.244.77.100> ESTABLISH SSH CONNECTION FOR USER: ec2-user
<54.244.77.100> SSH: EXEC ssh -C -o ControlMaster=auto -o ControlPersist=60s -o StrictHostKeyChecking=no -o 'IdentityFile="/usr/local/etc/ansible/xxx_aws.pem"' -o KbdInteractiveAuthentication=no -o PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey -o PasswordAuthentication=no -o User=ec2-user -o ConnectTimeout=10 -o ControlPath=/root/.ansible/cp/ansible-ssh-%h-%p-%r 54.244.77.100 '/bin/sh -c '"'"'chmod u+x /home/ec2-user/.ansible/tmp/ansible-tmp-1487698172.0-93173364920700/ /home/ec2-user/.ansible/tmp/ansible-tmp-1487698172.0-93173364920700/command.py && sleep 0'"'"''
<54.244.77.100> ESTABLISH SSH CONNECTION FOR USER: ec2-user
<54.244.77.100> SSH: EXEC ssh -C -o ControlMaster=auto -o ControlPersist=60s -o StrictHostKeyChecking=no -o 'IdentityFile="/usr/local/etc/ansible/xxx_aws.pem"' -o KbdInteractiveAuthentication=no -o PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey -o PasswordAuthentication=no -o User=ec2-user -o ConnectTimeout=10 -o ControlPath=/root/.ansible/cp/ansible-ssh-%h-%p-%r -tt 54.244.77.100 '/bin/sh -c '"'"'su root -c '"'"'"'"'"'"'"'"'/bin/sh -c '"'"'"'"'"'"'"'"'"'"'"'"'"'"'"'"'"'"'"'"'"'"'"'"'"'"'echo BECOME-SUCCESS-cntrcxqxlwicicvwtinmaadrnzzzujfp; /usr/local/bin/python2.7 /home/ec2-user/.ansible/tmp/ansible-tmp-1487698172.0-93173364920700/command.py; rm -rf "/home/ec2-user/.ansible/tmp/ansible-tmp-1487698172.0-93173364920700/" > /dev/null 2>&1'"'"'"'"'"'"'"'"'"'"'"'"'"'"'"'"'"'"'"'"'"'"'"'"'"'"''"'"'"'"'"'"'"'"' && sleep 0'"'"''
I've also tried a lot of other ways of reloading pf, such as using the service module or using command: service pf reload, and they all have exactly the same effect. I've also attempted to make the handler async, with
- name: Reload pf
  shell: "/sbin/pfctl -f /etc/pf.conf"
  async: 1
  poll: 0
with no change.
Does anyone have an idea as to why my role with the handler fails, while a straightforward play with tasks succeeds? And more importantly, how can I get the handler to work properly?
Thanks in advance!
(I should note I'm using Ansible 2.2.1).
This seems to be more an issue with PF than with Ansible. Try your playbook again, but this time use only this in your pf.rules:
pass all
You can also test by logging in to the instance and running:
/sbin/pfctl -Fa -f /etc/pf.conf.all
where /etc/pf.conf.all contains pass all. It should not log you out, and your current session should remain active.
What is probably happening is that your pf rules drop or flush existing connections when they are applied, so the SSH connection that Ansible uses hangs.
Maybe you need the following in your handler(s)?
become: yes
become_method: su

ansible unable to connect to centos

When I try connecting to a CentOS server, I get the following error:
boby@hon-pc-01:~/www/ansible $ ansible centos -vvv -i hosts -a "uname -a"
Using /home/boby/www/ansible/ansible.cfg as config file
<root@209.236.74.192:3333> ESTABLISH SSH CONNECTION FOR USER: root
<root@209.236.74.192:3333> SSH: EXEC ssh -C -q -o ControlMaster=auto -o ControlPersist=60s -o KbdInteractiveAuthentication=no -o PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey -o PasswordAuthentication=no -o User=root -o ConnectTimeout=10 -o ControlPath=/home/boby/.ansible/cp/ansible-ssh-%h-%p-%r -tt root@209.236.74.192:3333 'mkdir -p "$( echo $HOME/.ansible/tmp/ansible-tmp-1484629049.5-55764328572466 )" && echo "$( echo $HOME/.ansible/tmp/ansible-tmp-1484629049.5-55764328572466 )"'
root@209.236.74.192:3333 | UNREACHABLE! => {
"changed": false,
"msg": "ERROR! SSH encountered an unknown error during the connection. We recommend you re-run the command using -vvvv, which will enable SSH debugging output to help diagnose the issue",
"unreachable": true
}
boby@hon-pc-01:~/www/ansible $
I am able to connect to the Debian server without any issue:
boby@hon-pc-01:~/www/ansible $ ansible ubuntu -vvv -i hosts -a "uname -a"
Using /home/boby/www/ansible/ansible.cfg as config file
<vm705n> ESTABLISH SSH CONNECTION FOR USER: root
<vm705n> SSH: EXEC ssh -C -q -o ControlMaster=auto -o ControlPersist=60s -o Port=3333 -o KbdInteractiveAuthentication=no -o PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey -o PasswordAuthentication=no -o User=root -o ConnectTimeout=10 -o ControlPath=/home/boby/.ansible/cp/ansible-ssh-%h-%p-%r -tt vm705n 'mkdir -p "$( echo $HOME/.ansible/tmp/ansible-tmp-1484629067.62-202068262196976 )" && echo "$( echo $HOME/.ansible/tmp/ansible-tmp-1484629067.62-202068262196976 )"'
<vm705n> PUT /tmp/tmpWzw_nH TO /root/.ansible/tmp/ansible-tmp-1484629067.62-202068262196976/command
<vm705n> SSH: EXEC sftp -b - -C -o ControlMaster=auto -o ControlPersist=60s -o Port=3333 -o KbdInteractiveAuthentication=no -o PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey -o PasswordAuthentication=no -o User=root -o ConnectTimeout=10 -o ControlPath=/home/boby/.ansible/cp/ansible-ssh-%h-%p-%r '[vm705n]'
<vm705n> ESTABLISH SSH CONNECTION FOR USER: root
<vm705n> SSH: EXEC ssh -C -q -o ControlMaster=auto -o ControlPersist=60s -o Port=3333 -o KbdInteractiveAuthentication=no -o PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey -o PasswordAuthentication=no -o User=root -o ConnectTimeout=10 -o ControlPath=/home/boby/.ansible/cp/ansible-ssh-%h-%p-%r -tt vm705n 'LANG=en_IN LC_ALL=en_IN LC_MESSAGES=en_IN /usr/bin/python /root/.ansible/tmp/ansible-tmp-1484629067.62-202068262196976/command; rm -rf "/root/.ansible/tmp/ansible-tmp-1484629067.62-202068262196976/" > /dev/null 2>&1'
vm705n | SUCCESS | rc=0 >>
Linux hon-vpn 3.16.0-4-amd64 #1 SMP Debian 3.16.7-ckt25-2 (2016-04-08) x86_64 GNU/Linux
boby@hon-pc-01:~/www/ansible $
Here is my hosts file
boby@hon-pc-01:~/www/ansible $ cat hosts
[ubuntu]
vm705n:3333
[centos]
root@209.236.74.192:3333
boby@hon-pc-01:~/www/ansible $
Any idea why it is not working for the CentOS 6 server?
EDIT
I got it fixed. The problem was the root@ prefix in the hosts file. For some reason, the SSH command did not pick up port 3333 when root@ was present in the hosts file.
The problem was in the hosts file.
boby@hon-pc-01:~/www/ansible $ cat hosts
[ubuntu]
vm705n:3333
[centos]
root@209.236.74.192:3333
boby@hon-pc-01:~/www/ansible $
I replaced root@209.236.74.192:3333 with 209.236.74.192:3333 and it started working.
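A quick way to catch this kind of entry before a run is to scan the inventory for host lines that carry a login name in front of the address. The sketch below is only an illustration: check_inventory is a hypothetical helper, it handles just the simple INI layout shown above, and the suggested rewrite relies on the ansible_user inventory variable (an addition here, the fix above simply dropped the user prefix).

```python
import re

# Matches "user@rest" at the start of a host entry (hypothetical lint rule).
USER_PREFIX = re.compile(r"^(?P<user>[^@\s\[]+)@(?P<rest>\S+)$")


def check_inventory(text):
    """Return (line_number, host_entry, suggested_replacement) tuples
    for every host entry that embeds a user name before the address."""
    problems = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        entry = line.strip()
        # Skip blanks, comments, and [group] headers.
        if not entry or entry.startswith(("#", ";", "[")):
            continue
        host = entry.split()[0]
        match = USER_PREFIX.match(host)
        if match:
            suggestion = f"{match['rest']} ansible_user={match['user']}"
            problems.append((lineno, host, suggestion))
    return problems


hosts_file = """[ubuntu]
vm705n:3333
[centos]
root@209.236.74.192:3333
"""
for lineno, host, suggestion in check_inventory(hosts_file):
    print(f"line {lineno}: {host} -> {suggestion}")
```

Only the [centos] entry is reported; vm705n:3333 passes because it has no user prefix.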

Ansible not picking up proxy settings

I am trying to run an Ansible job on a remote host. But for that to happen, I need to go through a proxy.
Proxy server is: 142.133.134.161
Proxy port is: 1088
My playbook is simple for now:
---
- hosts: LAB1
  tasks:
    - name: Copy file
      template: src=/tmp/file1 dest=/tmp/file1
My environment file is:
[LAB1]
10.169.99.189
10.169.99.190
My ansible.cfg file is:
Host 10.169.99.*
ProxyCommand nc -x 142.133.134.161:1088 %h %p
But when I run a job, it says "Connection timed out":
[root@vm1 ANSIBLE]# ansible -i /root/ANSIBLE/env/target LAB1 -m ping
10.169.99.190 | FAILED => SSH Error: ssh: connect to host 10.169.99.190 port 22: Connection timed out
while connecting to 10.169.99.190:22
It is sometimes useful to re-run the command using -vvvv, which prints SSH debug output to help diagnose the issue.
10.169.99.189 | FAILED => SSH Error: ssh: connect to host 10.169.99.189 port 22: Connection timed out
while connecting to 10.169.99.189:22
It is sometimes useful to re-run the command using -vvvv, which prints SSH debug output to help diagnose the issue.
When I run this in debug mode:
[root@vm1 ANSIBLE]# ansible -i /root/ANSIBLE/env/target LAB1 -m ping -vvvvv
<10.169.99.190> ESTABLISH CONNECTION FOR USER: msdp
<10.169.99.190> REMOTE_MODULE ping
<10.169.99.189> ESTABLISH CONNECTION FOR USER: msdp
<10.169.99.189> REMOTE_MODULE ping
<10.169.99.190> EXEC sshpass -d8 ssh -C -tt -vvv -o ControlMaster=auto -o ControlPersist=60s -o ControlPath="/root/.ansible/cp/ansible-ssh-%h-%p-%r" -o StrictHostKeyChecking=no -o GSSAPIAuthentication=no -o PubkeyAuthentication=no -o User=msdp -o ConnectTimeout=10 10.169.99.190 /bin/sh -c 'mkdir -p $HOME/.ansible/tmp/ansible-tmp-1473612082.62-116308097993503 && echo $HOME/.ansible/tmp/ansible-tmp-1473612082.62-116308097993503'
<10.169.99.189> EXEC sshpass -d9 ssh -C -tt -vvv -o ControlMaster=auto -o ControlPersist=60s -o ControlPath="/root/.ansible/cp/ansible-ssh-%h-%p-%r" -o StrictHostKeyChecking=no -o GSSAPIAuthentication=no -o PubkeyAuthentication=no -o User=msdp -o ConnectTimeout=10 10.169.99.189 /bin/sh -c 'mkdir -p $HOME/.ansible/tmp/ansible-tmp-1473612082.63-269107268980760 && echo $HOME/.ansible/tmp/ansible-tmp-1473612082.63-269107268980760'
10.169.99.189 | FAILED => SSH Error: ssh: connect to host 10.169.99.189 port 22: Connection timed out
while connecting to 10.169.99.189:22
It is sometimes useful to re-run the command using -vvvv, which prints SSH debug output to help diagnose the issue.
10.169.99.190 | FAILED => SSH Error: ssh: connect to host 10.169.99.190 port 22: Connection timed out
while connecting to 10.169.99.190:22
It is sometimes useful to re-run the command using -vvvv, which prints SSH debug output to help diagnose the issue.
This does not indicate that it is using the Proxy. Is that the issue here?
Assuming your ProxyCommand syntax is correct and you want to keep it in ansible.cfg, add it as an option to ssh_args in the [ssh_connection] section of the file:
[ssh_connection]
ssh_args = -o ForwardAgent=yes -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -o IdentitiesOnly=yes -o ControlMaster=auto -o ControlPersist=60s -o ProxyCommand="nc -x 142.133.134.161:1088 %h %p"
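The difference between the two files can be checked without running Ansible, since ansible.cfg is an INI-style file while the Host/ProxyCommand lines from the question are OpenSSH client config syntax. The sketch below uses Python's standard configparser as a stand-in for that check; read_ssh_args is a hypothetical helper, and the shortened ssh_args value is an abbreviation of the one above, not a recommendation.

```python
import configparser


def read_ssh_args(cfg_text):
    """Return the ssh_args value from INI text, or None if the text is not
    valid INI or has no [ssh_connection] ssh_args entry (hypothetical helper)."""
    # interpolation=None keeps %h and %p literal instead of treating
    # "%" as configparser interpolation syntax.
    parser = configparser.ConfigParser(interpolation=None)
    try:
        parser.read_string(cfg_text)
    except configparser.MissingSectionHeaderError:
        # Lines appear before any [section] header, as in the question's file.
        return None
    return parser.get("ssh_connection", "ssh_args", fallback=None)


question_cfg = "Host 10.169.99.*\nProxyCommand nc -x 142.133.134.161:1088 %h %p\n"
answer_cfg = (
    "[ssh_connection]\n"
    "ssh_args = -o ControlMaster=auto -o ControlPersist=60s "
    '-o ProxyCommand="nc -x 142.133.134.161:1088 %h %p"\n'
)

print(read_ssh_args(question_cfg))  # no usable ssh_args
print(read_ssh_args(answer_cfg))    # the ssh_args string including ProxyCommand
```

The question's file yields no ssh_args at all, which fits the debug output showing no ProxyCommand on the ssh command line.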

Implied SSH connection arguments in Ansible

When running the playbook given in this answer with -vvv, I get the following log:
<192.168.1.109> SSH: EXEC ssh -C -q -o PasswordAuthentication=yes
-o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null
-o ControlMaster=auto -o ControlPersist=60s
-o StrictHostKeyChecking=no -o KbdInteractiveAuthentication=no
-o PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey
-o PasswordAuthentication=no -o ConnectTimeout=120
-o ControlPath=/Users/techraf/.ansible/cp/ansible-ssh-%h-%p-%r 192.168.1.109
'/bin/sh -c '"'"'( umask 77 && mkdir -p "` echo $HOME/.ansible/tmp/ansible-tmp-1469485176.18-28678795304310 `" && echo ansible-tmp-1469485176.18-28678795304310="` echo $HOME/.ansible/tmp/ansible-tmp-1469485176.18-28678795304310 `" ) && sleep 0'"'"''
The first part of SSH arguments is taken from ansible.cfg present in the current directory (which is what I intended):
[ssh_connection]
ssh_args = -o PasswordAuthentication=yes -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -o ControlMaster=auto -o ControlPersist=60s
Where does the second part:
-o StrictHostKeyChecking=no -o KbdInteractiveAuthentication=no
-o PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey
-o PasswordAuthentication=no -o ConnectTimeout=120
come from?
My objective is to run this playbook using password authentication, yet the latter group of arguments prevents it.
I have checked the following are cleared/non-existent:
/usr/local/etc/ansible/ansible.cfg (I am running Homebrew-installed Ansible on OS X)
$ANSIBLE_CONFIG environment variable
.ansible.cfg (in the home directory)
I am running Ansible 2.1.0.0.
By the time the connection is being set up, Ansible doesn't think you have a password set, so it removes password authentication from the valid negotiation options. See the source for more detail, or ensure ansible_password is set on the host in question.
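The behaviour described here can be pictured with a toy model. This is not Ansible's code: auth_options is a hypothetical function, and the option list is copied from the second group of arguments in the log above, on the assumption that those are what gets appended when no connection password is known.

```python
def auth_options(password=None):
    """Toy model (not Ansible's implementation): extra ssh options that are
    appended when no connection password has been supplied."""
    if password:
        # A password is known, so password-based methods are left available.
        return []
    return [
        "-o", "KbdInteractiveAuthentication=no",
        "-o", "PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey",
        "-o", "PasswordAuthentication=no",
    ]


# Options taken from ssh_args in ansible.cfg, as in the question.
configured = ["-o", "PasswordAuthentication=yes"]

print(configured + auth_options())          # no password: restrictive options appended
print(configured + auth_options("secret"))  # password set: nothing appended
```

In this model, supplying a password (via ansible_password or an interactive prompt) is what keeps the restrictive options from being appended, regardless of what ssh_args contains.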
My objective is to run this playbook using password authentication, yet the latter group of arguments prevents it.
You need to add an additional parameter for Ansible to use password authentication:
-k, --ask-pass ask for connection password
Ansible will then prompt for your password once, then use that password for connecting to all servers in that run.
You generally should avoid using password auth for ssh. Not only is it annoying (you have to type the password in all the time), but it opens up your server to brute-force attacks; even if you block those using other means (e.g. fail2ban), it's still not a great idea. If you don't like having keys authenticate without any password, you can put a password on the keys and decrypt them on boot using an ssh agent.

Resources