IBM Cloud Private Installation: etcd not starting

I have been following this guide https://www.ibm.com/support/knowledgecenter/SSBS6K_2.1.0/installing/install_containers_CE.html to install ICP on 2 nodes.
Everything goes well until I get the following error:
TASK [master : Waiting for Etcd to start] ***************************************************************************************************************************************************************************************************
fatal: [localhost]: FAILED! => {"changed": false, "elapsed": 600, "failed": true, "msg": "The Etcd component failed to start. For more details, see https://ibm.biz/etcd-fails."}
I can confirm that there are no firewall issues in the ports.
I tried running the installation with the -vvv flag, but it is still not clear what the issue could be.
Any help would be appreciated.

You may want to check whether the configuration is set up properly, using the steps described on GitHub.
Sometimes the cloud-config is wrong!
If that does not help, you can try this issue:
https://github.com/coreos/etcd/issues/4308

I had the same issue.
I solved it with the help of this thread: https://github.com/kubernetes/kubernetes/issues/54542.
You just need to turn swap off with "swapoff -a".
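As a rough sketch, the same fix could be expressed as an Ansible task run against each node before starting the installer. The task name and the check on the swap fact are illustrative assumptions and are not part of the ICP installer. Note that swapoff -a only lasts until the next reboot unless the swap entry is also removed from /etc/fstab.

```yaml
# Sketch only: assumes fact gathering is enabled so ansible_swaptotal_mb is defined.
- name: Turn swap off on the node (does not persist across reboots)
  command: swapoff -a
  become: true
  when: ansible_swaptotal_mb > 0
```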

Related

libssh vs paramiko - ios_facts are different when running Ansible playbooks

I have a playbook I am writing that automates the install of firmware to our fleet of C2960Xs.
I recently moved the location of the Ansible server I am using from my homelab to a VM in Azure. We have security rules on our sites to only allow remote connection on a specific non-default port. After reading, I discovered that means I need to use libssh and not paramiko for remote commands.
I have a specific task I am running, and when running it returns:
fatal: [HOSTNAME]: FAILED! => {
"msg": "The conditional check 'ansible_net_filesystems_info['flash:'].spacefree_kb >
firmware_image_size' failed. The error was: error while evaluating
conditional (ansible_net_filesystems_info['flash:'].spacefree_kb >
firmware_image_size): 'ansible_net_filesystems_info' is undefined" }
I then compared the 'before' and 'after' of the facts being stored at the beginning of the playbook. I found that with Paramiko I get far more detail than with the facts gathered via libssh.
One of the missing facts is the variable ansible_net_filesystems_info, which my playbook references.
Is there a workaround for this process? The idea being to verify there is free space on the switch before moving an archive to the switch for unzipping.
According to the cisco.ios.ios_facts documentation, you can specify the types of facts you want to pull. I added the line: gather_subset: all
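A minimal sketch of how that might look in the playbook, followed by the free-space check from the question. The task names are made up for illustration, and the comparison assumes that firmware_image_size is expressed in kilobytes, which the question does not state.

```yaml
# Gather every fact subset explicitly instead of relying on the default set.
- name: Gather all IOS facts
  cisco.ios.ios_facts:
    gather_subset: all

# Assumption: firmware_image_size is defined elsewhere and is in KB.
- name: Verify there is enough free space on flash
  assert:
    that:
      - ansible_net_filesystems_info['flash:'].spacefree_kb > firmware_image_size
    fail_msg: "Not enough free space on flash: for the firmware image"
```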

Ansible Chocolatey failing mysteriously?

I am trying to run an ansible playbook on an Azure VM, but I am running into a strange problem. Attempting to install any software (attempted git, sysinternals, nscp) just doesn't fire.
win_chocolatey:
  name: git
  state: present
This does not even trigger an install attempt. There is nothing in the logs other than an attempt to list the software, which reports that it is not present (because said software is not installed).
win_chocolatey:
  name: git
  state: absent
This works perfectly fine after manually installing git. I have tried installing the package manually using the command win_chocolatey would use (according to the docs), as the exact same user the playbook uses (which has admin rights), and it works.
I've also tried forcing the admin account with become (even though it already runs as admin), but it made no difference.
-vvvv is not even showing an install attempt either:
TASK [Download and install chocolatey packages] **************************************************************
task path: /usr/user/clouddrive/windows-vm/create-vm-windows.yml:162
Using module file /opt/ansible/local/lib/python2.7/site-packages/ansible/modules/windows/win_chocolatey.ps1
<my.ip.address.here> ESTABLISH WINRM CONNECTION FOR USER: AzureAdministrator on PORT 5986 TO my.ip.address.here
checking if winrm_host my.ip.address.here is an IPv6 address
EXEC (via pipeline wrapper)
failed: [my.ip.address.here] (item={u'choco_name': u'git', u'choco_state': u'present'}) => {
    "changed": false,
    "command": "C:\\ProgramData\\chocolatey\\bin\\choco.exe list --local-only --exact --limit-output git",
    "item": {
        "choco_name": "git",
        "choco_state": "present"
    },
    "msg": "Error checking installation status for the package 'git'",
    "rc": 2,
    "stderr": "",
    "stderr_lines": [],
    "stdout": "",
    "stdout_lines": []
}
Am I missing something? The docs (https://docs.ansible.com/ansible/latest/modules/win_chocolatey_module.html#examples) say that even something basic like:
- name: Install git
  win_chocolatey:
    name: git
should install the package. I've tried it with state present, with no state, and with every other state; only absent works.
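Enhanced exit codes were added in Chocolatey 0.10.12, which the release notes list as a breaking change. With the feature enabled, choco list exits with code 2 when no matching package is found, and the module treats that as an error (hence "rc": 2 in your output).
chocolatey.org/docs/release-notes
Ansible changes are already being made to fix this (see github.com/chocolatey/choco/issues/1758), but for now you can disable the feature, as described in the release notes: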
choco feature disable --name="'useEnhancedExitCodes'"
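If you would rather keep the workaround inside the playbook, something along these lines may work. This is a sketch that assumes an Ansible version shipping the win_chocolatey_feature module (2.7 or later, as far as I know); the task names are illustrative.

```yaml
# Turn the feature off first so the module's "choco list" check exits with 0 again.
- name: Disable Chocolatey enhanced exit codes
  win_chocolatey_feature:
    name: useEnhancedExitCodes
    state: disabled

- name: Install git
  win_chocolatey:
    name: git
    state: present
```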

using 'supervisorctl' in ansible playbook; Error "Failed to find executable supervisorctl"

Overview: I'm trying to install supervisor and run a program process from within an Ansible playbook.
I'm able to install supervisor, but when I try to use supervisorctl to run a simple program, Ansible is unable to find the supervisorctl executable.
This is the portion of the code that fails:
- supervisorctl:
    name=program:CAT
    state=started
    config=/etc/supervisor/supervisord.conf
with the resulting error:
TASK [supervisorctl] ***********************************************************
fatal: [localhost]: FAILED! => {"changed": false, "failed": true, "msg": "Failed to find required executable supervisorctl"}
However, when I run the simple command in my terminal, it works:
supervisord -c /etc/supervisord.conf
and I can view the program running by going into 'supervisorctl' in the terminal and typing 'status':
$ supervisorctl
CAT STOPPED Dec 27 04:12 PM
supervisor>
Can anyone point me to what/where my error most likely is?
I would guess the error message is suggesting I did not install supervisor correctly, but the fact that I can do these things outside of the playbook makes me think I did. I tried specifying the path to the 'supervisorctl' executable, but I don't think that's legal syntax in the playbook because it creates other errors.
*Worth noting, I'm in a virtualenv that runs python2.7
I realized that the documentation lists a parameter that lets me specify the path to the supervisorctl executable, and that worked! (In that I now have a different error.)
I modified the task above to look like this:
- supervisorctl:
    name=program:CAT
    state=started
    config=/etc/supervisor/supervisord.conf
    supervisorctl_path=/usr/bin/supervisorctl

Ansible and Fedora23 - "firewalld required for this module"

I'm trying to set up firewalld through Ansible on my Fedora 23 server from my Fedora client (yes, I like Fedora :D).
However, each time I try to execute a playbook that includes firewalld tasks (example: firewalld: service=https permanent=true state=enabled), the playbook execution fails with the following message:
failed: [w.x.y.z] => {"failed": true, "parsed": false}
failed=True msg='firewalld required for this module'
I have firewalld up and running on the remote server :
# firewall-cmd --version
0.3.14.2
On my computer :
$ ansible --version
ansible 1.9.4
configured module search path = None
Does anyone know what could be causing this?
Thank you !
--
EDIT: At this line in the Ansible source code, the firewall library does not seem to be imported (the import fails, which produces the error saying that firewalld is missing). This library exists for Python 3 but not for Python 2, which is what Ansible uses.
$ locate firewall
[...]
/usr/lib/python3.4/site-packages/firewall
[...]
I will continue to search, but if someone has an idea...
I found the explanation and the solution:
Following my edit, I installed python-firewall, which provides the Python 2 bindings for firewalld. But the execution still failed because cockpit was missing.
So I had to install cockpit too...
Long story short, this is what I did on the remote machine:
# dnf install python-firewall cockpit -y
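The same fix might also be written as playbook tasks placed before the firewalld task. This is only a sketch: it assumes the dnf module is available and usable in your Ansible version, and the task names are made up for illustration.

```yaml
# Install the Python 2 bindings that the firewalld module imports, plus cockpit.
- name: Install python-firewall and cockpit
  dnf:
    name: "{{ item }}"
    state: present
  with_items:
    - python-firewall
    - cockpit

# The firewalld task from the question, written in YAML dictionary form.
- name: Allow HTTPS permanently
  firewalld:
    service: https
    permanent: true
    state: enabled
```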

win_get_url fails to get the file from the url to a remote windows server 2012 node

I need to copy a file from a Jenkins server to a remote Windows Server 2012 machine using win_get_url.
My playbook looks as follows:
- hosts: windows_ip
  tasks:
    - name: Deploy to windows
      win_get_url:
        url: 'http://(jenkins_server_ip)/jenkins/view/Trunk/job/router/lastSuccessfulBuild/artifact/router/conf/router-service-context.xml'
        dest: 'D:\router'
However it gives the following error:
fatal: [windows_ip]: FAILED! => {"changed": false, "failed": true, "msg": "Error downloading http://(jenkins_server_ip)/jenkins/view/Trunk/job/router/lastSuccessfulBuild/artifact/router/conf/router-service-context.xml to D:\router Exception calling \"DownloadFile\" with \"2\" argument(s): \"An exception occurred during a WebClient request.\""}
What is the issue here?
For anyone else who comes across this issue, the problem is that the underlying PowerShell script will not create the destination directory if it does not exist.
I'd troubleshoot this by looking at the exact script. First, make sure Ansible leaves its script on the target node by running the following on the control node:
export ANSIBLE_KEEP_REMOTE_FILES=1
Re-run your playbook, and then log on to the Windows box. Ansible's files will be in C:\users\\appdata\local\temp\ansiblexxxxx
Run/debug the script locally to figure out what's happening.
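Since the cause is the missing destination directory, one way to avoid the error is to create the directory first. The following is a sketch rather than a tested fix: the task names are illustrative, and the file name in dest is taken from the URL on the assumption that you want to keep it.

```yaml
# Create D:\router up front, because win_get_url will not create it.
- name: Ensure the destination directory exists
  win_file:
    path: 'D:\router'
    state: directory

# Point dest at a full file path inside the directory created above.
- name: Deploy to windows
  win_get_url:
    url: 'http://(jenkins_server_ip)/jenkins/view/Trunk/job/router/lastSuccessfulBuild/artifact/router/conf/router-service-context.xml'
    dest: 'D:\router\router-service-context.xml'
```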
