Ansible: Test if SSH login possible without FATAL error?

I have a setup playbook that takes a freshly installed Linux instance, logs in as the default user (we'll call it user1), creates another user (we'll call it user2), then disables user1. Because only user1 can access the instance before this set of tasks is executed, the tasks live in a special playbook we have to remember to run on new instances. After that, all the common tasks are run by user2, because user1 no longer exists.
I want to combine the setup and common playbooks so we don't have to run the setup playbook manually anymore. I tried to create a task that determines which user exists on the instance, by attempting to log in via SSH as user1, so the original setup tasks could be made conditional. The problem is that if I try the SSH login for either user, Ansible exits with a FATAL error because it can't log in: user2 doesn't exist yet on new instances, and user1 has been disabled after the setup playbook executes.
I believe testing the login via SSH is the only way to determine externally what condition the instance is in. Is there a way to test SSH logins without getting a FATAL error to then execute tasks conditionally based on the results?

One approach would be to use the shell module via a local_action to run a simple ssh command as user1 and see whether it succeeds. Something along these lines:
- name: Test for user1
  local_action: shell ssh user1@{{ inventory_hostname }} "echo success"
  register: user1_enabled
  ignore_errors: true   # a failed login must not abort the play
Then you could use something like this in another task to see if it worked:
when: user1_enabled.stdout.find("success") != -1
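A hedged sketch of how the original setup tasks could then be gated on that probe (the task body below is illustrative, not from the original playbook):
- name: Create user2 while user1 still works
  user:
    name: user2
    state: present
  become: true
  when: user1_enabled.stdout.find("success") != -1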

With Ansible >= 2.5 it is possible to use the wait_for_connection module (https://docs.ansible.com/ansible/2.5/modules/wait_for_connection_module.html).
- name: Wait 600 seconds for target connection to become reachable/usable
  wait_for_connection:
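If you want the timing explicit, the module's optional parameters can be spelled out as well (600 seconds happens to be the default timeout; the values below are just an example):
- name: Wait 600 seconds for target connection to become reachable/usable
  wait_for_connection:
    delay: 10      # wait 10 seconds before the first check
    sleep: 5       # poll every 5 seconds
    timeout: 600   # give up after 10 minutes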

Related

Prompting for password during Ansible playbook running

I am attempting to write an Ansible playbook to send out to new colleagues that would automate a few of the tasks they need to do after getting a new laptop (installing standard programs, connecting to OpenVPN, etc.). I am stuck on how to get them to input their password at the time of starting the VPN service. If I use the --ask-pass (-k) option when starting the playbook run, the password will likely time out before the playbook gets to starting the service. The passwords use a 2FA scheme where the user enters a PIN and then a token-generated second half, so they expire after a set period. I thought of vars_prompt but am unsure where it would be passed in and whether it would work. I am using something like this:
- name: start and enable openvpn
  ansible.builtin.service:
    name: "{{ openvpn_service }}"
    state: started
    enabled: yes
but of course it times out as there isn't a prompt for a password. Happy for any advice or help.
You can ask for a password during the play with a specific task:
- name: "Ask for credentials"
  pause:
    prompt: "Enter VPN 2FA: "
    echo: no
  register: vpn_2fa   # register/variable names must start with a letter or underscore
Then, in the VPN connection task, you can use the result via vpn_2fa.user_input.
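One hedged way to consume that input, assuming your OpenVPN client config reads credentials from a file via auth-user-pass (the file path and the vpn_username variable below are illustrative assumptions):
- name: write the one-time VPN credentials where openvpn expects them
  ansible.builtin.copy:
    dest: /etc/openvpn/client/auth.txt   # assumed path referenced by auth-user-pass
    content: "{{ vpn_username }}\n{{ vpn_2fa.user_input }}\n"
    mode: "0600"
  no_log: true
  become: true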

Not able to switch user in ansible

We have a system where we have user A and user B. We can switch to user B from A using "sudo su" command only. Direct login to B user is not allowed.
Now, from the Ansible master, we can log in as user A (the ansible remote user) successfully. Our use case is that we have to run some commands as user B using Ansible, but we are failing to switch to user B and run those commands.
Our yml file looks like -
# Module to copy java to the target host.
- name: Copying Java jdk1.8.0_192
  remote_user: A
  become_user: B
  become: true
  become_method: su
  copy:
    src: /etc/ansible/jboss7-cluster/raw_template/jdk1.8.0_192.zip
    dest: "{{ java_install_dir }}"
Any inputs?
In your case, I believe the become_method should be the default sudo. Have you tried using that? If so, what is the result? Can you copy/paste the result here?
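For reference, here is the task from the question with only the become method changed (a sketch, assuming user A is allowed to sudo to B):
- name: Copying Java jdk1.8.0_192
  remote_user: A
  become: true
  become_user: B
  become_method: sudo
  copy:
    src: /etc/ansible/jboss7-cluster/raw_template/jdk1.8.0_192.zip
    dest: "{{ java_install_dir }}"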
Also, can you try to run an ad hoc command against the host, and post the result here?
Something like this:
ansible -i inventory.ini -u A --become --become-user B -m ping myhost
And one more thing: note that there are some restrictions when using become to switch to a non-privileged user:
"In addition to the additional means of doing this securely, Ansible 2.1 also makes it harder to unknowingly do this insecurely. Whereas in Ansible 2.0.x and below, Ansible will silently allow the insecure behaviour if it was unable to find another way to share the files with the unprivileged user, in Ansible 2.1 and above Ansible defaults to issuing an error if it can’t do this securely. If you can’t make any of the changes above to resolve the problem, and you decide that the machine you’re running on is secure enough for the modules you want to run there to be world readable, you can turn on allow_world_readable_tmpfiles in the ansible.cfg file. Setting allow_world_readable_tmpfiles will change this from an error into a warning and allow the task to run as it did prior to 2.1."
And just as a side note: please avoid using sudo su; use sudo -i, or at least sudo su -, instead. These populate the environment correctly, unlike plain sudo su. For a fun read about why you want this, see here.

How to switch out of the root account during server setup?

I need to automate the deployment of some remote Debian servers. These servers come with only the root account. I wish to make it such that the only time I ever need to login as root is during the set up process. Subsequent logins will be done using a regular user account, which will be created during the set up process.
However during the set up process, I need to set PermitRootLogin no and PasswordAuthentication no in /etc/ssh/sshd_config. Then I will be doing a service sshd restart. This will stop the ssh connection because ansible had logged into the server using the root account.
My question is: How do I make ansible ssh into the root account, create a regular user account, set PermitRootLogin no and PasswordAuthentication no, then ssh into the server using the regular user account and do the remaining set up tasks?
It is entirely possible that my set-up process is flawed. I will appreciate suggestions.
You can actually manage the entire setup process with Ansible, without requiring manual configuration prerequisites.
Interestingly, you can change ansible_user and ansible_password on the fly using set_fact. Tasks that run after the set_fact will use the new credentials:
- name: "Switch remote user on the fly"
hosts: my_new_hosts
vars:
reg_ansible_user: "regular_user"
reg_ansible_password: "secret_pw"
gather_facts: false
become: false
tasks:
- name: "(before) check login user"
command: whoami
register: result_before
- debug: msg="(before) whoami={{ result_before.stdout }}"
- name: "change ansible_user and ansible_password"
set_fact:
ansible_user: "{{ reg_ansible_user }}"
ansible_password: "{{ reg_ansible_password }}"
- name: "(after) check login user"
command: whoami
register: result_after
- debug: msg="(after) whoami={{ result_after.stdout }}"
Furthermore, you don't have to fully restart sshd to cause configuration changes to take effect, and existing SSH connections will stay open. Per sshd(8) manpage:
sshd rereads its configuration file when it receives a hangup signal, SIGHUP....
So, your setup playbook could be something like:
- log in initially with the root account
- create the regular user and set their password or configure authorized_keys
- configure sudoers to allow the regular user to execute commands as root
- use set_fact to switch to that account for the rest of the playbook (remember to use become: true on tasks after this one, since you have switched from root to the regular user; you might even run a test sudo command before locking root out)
- change the sshd configuration
- execute kill -HUP <sshd_pid> (see the sketch after this list)
- verify by setting ansible_user back to root; fail if the login works
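A minimal sketch of the sshd steps from the list above (these run after the set_fact switch, so they need become: true; the pid-file path is the usual Debian default and an assumption here):
- name: disable root login and password authentication
  lineinfile:
    path: /etc/ssh/sshd_config
    regexp: "^#?{{ item.key }} "
    line: "{{ item.key }} {{ item.value }}"
  loop:
    - { key: "PermitRootLogin", value: "no" }
    - { key: "PasswordAuthentication", value: "no" }
  become: true

- name: reload sshd config with SIGHUP (existing connections stay open)
  shell: kill -HUP "$(cat /var/run/sshd.pid)"
  become: true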
You probably just want to make a standard user account and add it to sudoers. You could then run Ansible with the standard user, and if you need a command to run as root, you just prefix the command with sudo.
I wrote an article about setting up a deploy user
http://www.altmake.com/2013/03/06/secure-lamp-setup-on-amazon-linux-ami/
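A minimal sketch of that standard-user-plus-sudoers setup (the user name, key path, and passwordless-sudo policy are assumptions for illustration):
- name: create a deploy user with sudo rights
  hosts: new_servers
  remote_user: root
  tasks:
    - name: create the deploy user
      user:
        name: deploy
        shell: /bin/bash
    - name: install an SSH public key for the deploy user
      authorized_key:
        user: deploy
        key: "{{ lookup('file', '/path/to/deploy_id_rsa.pub') }}"   # key path is an assumption
    - name: allow deploy to run sudo without a password
      copy:
        dest: /etc/sudoers.d/deploy
        content: "deploy ALL=(ALL) NOPASSWD:ALL\n"
        mode: "0440"
        validate: "visudo -cf %s"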

Ansible wait_for for connecting machine to actually login

In my working environment, virtual machines are created, and only after creation is the login access information added to them; there can be delays, so having my Ansible script just wait for SSH to be available is not enough. I actually need to check whether Ansible can get into the remote machine via SSH.
Here is my old script which fails me:
- name: wait for instances to listen on port:22
  wait_for:
    state: started
    host: "{{ item }}"
    port: 22
  with_items: myservers
How can I rewrite this task snippet so that the local machine waits until it can actually SSH into the remote machines (not just until SSH is listening on the remote, but until it can actually authenticate)?
This is somewhat ugly, but given your needs it might work:
- local_action: command ssh myuser@{{ inventory_hostname }} exit
  register: log_output
  until: log_output.stdout.find("Last login") > -1
  retries: 10
  delay: 5
The first line would cause your ansible host to try to ssh into the target host and immediately issue an "exit" to return control back to ansible. Any output from that command gets stored in the log_output variable. The until clause will check the output for the string 'Last login' (you may want to change this to something else depending on your environment), and Ansible will retry this task up to 10 times with a 5 second delay between attempts.
Bruce P's answer was close to what I needed, but my ssh doesn't print any banner when running a command, so checking stdout is problematic.
- local_action: command ssh "{{ hostname }}" exit
register: ssh_test
until: ssh_test.rc == 0
retries: 25
delay: 5
So instead I use the return code to check for success
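A variant of the same probe with non-interactive OpenSSH options, so a stray password prompt or a slow host cannot hang it (the flags are standard OpenSSH options, not part of the original answer):
- local_action: command ssh -o BatchMode=yes -o ConnectTimeout=5 "{{ hostname }}" exit
  register: ssh_test
  until: ssh_test.rc == 0
  retries: 25
  delay: 5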
As long as your Ansible user is already installed on the image you are using to create the new server instance, the wait_for command works well.
If that is not the case, then you need to poll the system that adds that user to the newly created instance to know when you should continue - of course, that system will have to have something to poll against...
The (very ugly) alternative is to put a static pause in your script that will wait the appropriate amount of time between the instance being created and the user being added like so:
- pause: seconds=1
Try not to though, static pauses are a bad way of solving this issue.

Create and use group without restart

I have a task, that creates a group.
- name: add user to docker group
  user: name=USERNAME groups=docker append=yes
  sudo: true
In another playbook I need to run a command that relies on having the new group permission. Unfortunately this does not work, because the new group is only picked up after I log out and log in again.
I have tried some stuff like:
su -l USERNAME
or
newgrp docker; newgrp
But nothing worked. Is there any chance to force Ansible to reconnect to the host and do a re-login? A reboot would be the last option.
You can use an (ansible.builtin.)meta: reset_connection task:
- name: add user to docker group
  ansible.builtin.user:
    name: USERNAME
    groups: docker
    append: yes

- name: reset ssh connection to allow user changes to affect ansible user
  ansible.builtin.meta: reset_connection
Note that you cannot use a conditional to run the task only when the ansible.builtin.user task reported a change, as the “reset_connection task does not support when conditional”; see #27565.
The reset_connection meta task was added in Ansible 2.3, but remained a bit buggy in releases prior to v2.5.8; see #27520.
For Ansible 2 I created a Galaxy role: https://galaxy.ansible.com/udondan/ssh-reconnect/
Usage:
- name: add user to docker group
  user: name=USERNAME groups=docker append=yes
  sudo: true
  notify:
    - Kill all ssh connections
If you immediately need the new group you can either call the module yourself:
- name: Kill own ssh connections
  ssh-reconnect: all=True
Or alternatively fire the handlers when required
- meta: flush_handlers
For Ansible < 1.9 see this answer:
Do you use ssh control sockets? If you have ControlMaster activated in your ssh config, this would explain the behavior. Ansible re-connects for every task, so the user should have the correct role assigned on the next task. Though when you use ssh session sharing, Ansible would of course re-use the open ssh connection and therefore result in not logging in again.
You can deactivate the session sharing in your ansible.cfg:
[ssh_connection]
ssh_args = -S "none"
Since session sharing is a good thing to speed up Ansible plays, there is an alternative. Run a task which kills all ssh connections for your current user.
- name: add user to docker group
  user: name=USERNAME groups=docker append=yes
  sudo: true
  register: user_task

- name: Kill open ssh sessions
  shell: "ps -ef | grep sshd | grep `whoami` | awk '{print \"kill -9\", $2}' | sh"
  when: user_task | changed
  failed_when: false
This will force Ansible to re-login at the next task.
Another option I've found would be to use async: to queue up killing sshd in the background, without relying on an open connection. It feels incredibly hacky, but it seems to work reliably in both Ansible 1.9 and 2.0.
- name: Kill SSH
  shell: sleep 1; pkill -u {{ ansible_ssh_user }} sshd
  async: 3
  poll: 2
Pause for 1 second, then kill sshd. Start checking for the job to be finished after 2 seconds, maximum allowed time is 3 seconds. In my limited testing, it seems to solve the problem of refreshing the current user's groups with only a minimal delay.
Try removing the control-socket folder during the play; it works on my side (I don't know if it's the cleanest solution). Oddly, meta: reset_connection is not working with Ansible 2.4.
- name: reset ssh connection
  local_action:
    module: file
    path: "~/.ansible/cp"
    state: absent
