Worker nodes not available - ibm-cloud-private

I have set up and installed IBM Cloud Private CE with two Ubuntu images in VirtualBox. I can SSH into both images and from each one SSH into the other. The ICP dashboard shows only one active node; I was expecting two.
I explicitly ran the following command (as the root user on the master node):
docker run -e LICENSE=accept --net=host \
-v "$(pwd)":/installer/cluster \
ibmcom/cfc-installer install -l \
192.168.27.101
The result of this command seemed to be a successful addition of the worker node:
PLAY RECAP *********************************************************************
192.168.27.101 : ok=45 changed=11 unreachable=0 failed=0
But the worker node still isn't showing in the dashboard.
What should I be checking to ensure the worker node works with the master node?

If you're using Vagrant to configure IBM Cloud Private, I'd highly recommend trying https://github.com/IBM/deploy-ibm-cloud-private
The project uses a Vagrantfile to configure a master/proxy node and then provisions two workers within the image using LXD. You'll get better density and performance on your laptop with this configuration than with two full VirtualBox images (one for the master/proxy, one for the worker).
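For example, standing the project up looks roughly like this (treat these commands as assumptions on my part; the project's README is the authoritative source):
git clone https://github.com/IBM/deploy-ibm-cloud-private.git
cd deploy-ibm-cloud-private
vagrant up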

You can check on your worker node with the following steps:
Check the cluster node status:
kubectl get nodes to check the status of the newly added worker node.
If it's NotReady, check the kubelet log for error messages about why kubelet is not running properly:
ICP 2.1
systemctl status kubelet
ICP 1.2
docker ps -a | grep kubelet to get the kubelet container ID, then
docker logs <kubelet_containerid>
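For example, to run these checks in one pass on the master node (the journalctl command is a standard systemd tool but is my assumption, not something the ICP docs prescribe):
kubectl get nodes
# ICP 2.1: kubelet runs as a systemd service
systemctl status kubelet
journalctl -u kubelet --no-pager | tail -n 50
# ICP 1.2: kubelet runs as a container
docker ps -a | grep kubelet          # note the kubelet container ID
docker logs <kubelet_containerid>    # substitute the ID from the previous command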

Run this to get kubectl working:
ln -sf /opt/kubernetes/hyperkube /usr/local/bin/kubectl
Run the command below on the master node to identify any failed pods in the setup. It lists details of the pods running in the environment:
kubectl -n kube-system get pods -o wide
To restart any failed ICP pods:
txt="0/";ns="kube-system";type="pods"; kubectl -n $ns get $type | grep "$txt" | awk '{ print $1 }' | xargs kubectl -n $ns delete $type
Now run:
kubectl get nodes
Then check the cluster info and see whether kubectl is pointing at https://localhost:8080 or at https://masternodeip:8001:
kubectl cluster-info
If you get no output, log in to https://masternodeip:8443 with the admin login, copy the "Configure client" CLI settings by clicking on admin in the top panel, paste them on your master node, and run kubectl cluster-info again.
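For reference, those "Configure client" settings are plain kubectl config commands of roughly this shape (the cluster name, master IP, and token below are placeholders and assumptions, not values from this cluster; use whatever the console gives you):
kubectl config set-cluster mycluster.icp --server=https://<masternodeip>:8001 --insecure-skip-tls-verify=true
kubectl config set-credentials admin --token=<token copied from the console>
kubectl config set-context mycluster.icp-context --cluster=mycluster.icp --user=admin --namespace=default
kubectl config use-context mycluster.icp-context
kubectl cluster-info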

how to enable port forwarding with microk8s

I'm playing around with microk8s and I simply want to run an Apache server and navigate to its default page on the same machine. I'm on a Mac with an ARM M1 chip:
microk8s kubectl run test-pod --image=ubuntu/apache2:2.4-20.04_beta --port=80
~ $ microk8s kubectl get pods
NAME READY STATUS RESTARTS AGE
test-pod 1/1 Running 0 8m43s
then I try to enable the forward:
◼ ~ $ microk8s kubectl port-forward test-pod :80
Forwarding from 127.0.0.1:37551 -> 80
but:
◼ ~ $ wget http://localhost:37551
--2022-12-24 18:54:37-- http://localhost:37551/
Resolving localhost (localhost)... 127.0.0.1, ::1
Connecting to localhost (localhost)|127.0.0.1|:8080... failed: Connection refused.
Connecting to localhost (localhost)|::1|:8080... failed: Connection refused.
the logs look OK:
◼ ~ $ microk8s kubectl logs test-pod
AH00558: apache2: Could not reliably determine the server's fully qualified domain name, using 10.1.254.96. Set the 'ServerName' directive globally to suppress this message
The dashboard proxy does work fine and I can navigate to it:
◼ ~ $ microk8s dashboard-proxy
Checking if Dashboard is running.
Dashboard will be available at https://192.168.64.2:10443
Answering myself:
I should use the IP that Multipass assigns to the guest machine. This is not Docker :)
For some reason I haven't figured out, as asked here, forwarding from the guest does not work properly on Mac. I should open a shell in the guest and forward from there; that way, it works. See the answer on the linked post.
Hope this will spare some time for future Mac users.
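A minimal sketch of that workaround, assuming the default Multipass VM name microk8s-vm and an arbitrary local port 8080 (both are assumptions for illustration):
multipass shell microk8s-vm
# inside the guest, bind the forward to all interfaces so the Mac host can reach it
microk8s kubectl port-forward test-pod 8080:80 --address 0.0.0.0
# then, from the Mac host, use the guest IP shown by `multipass list` (192.168.64.2 in the dashboard-proxy output above)
wget http://192.168.64.2:8080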

How to get access to Spark shell from Kubernetes?

I've used the helm chart to deploy Spark to Kubernetes in GCE. According to the default configuration in values.yaml, Spark is deployed to the path /opt/spark. I've checked that Spark has deployed successfully by running kubectl --namespace=my-namespace get pods -l "release=spark". There is 1 master and 3 workers running.
However, when I tried to check the Spark version by executing spark-submit --version from the Google Cloud console, it returned -bash: spark-submit: command not found.
I've navigated to the /opt directory and the spark folder is missing. What should I do to be able to open a Spark shell terminal and execute Spark commands?
You can verify by checking the service:
kubectl get services -n <namespace>
You can port-forward a particular service and try running it locally to check:
kubectl port-forward svc/<service name> <external port>:<internal port or spark running port>
Locally you can then run a Spark terminal and it will be connected to Spark running on the GCE instance.
If you check the helm chart documentation, there are also options for the UI; you can do the same to access the UI via port-forward.
Access via SSH inside the pod
kubectl exec -it <spark pod name> -- /bin/bash
Here you can directly run Spark commands: spark-submit --version
Access UI
Access the UI via port-forwarding if you have enabled the UI in the helm chart.
kubectl port-forward svc/<spark service name> <external port>:<internal port or spark running port>
External Load balancer
This particular helm chart also creates an external load balancer; you can get the external IP using
kubectl get svc -n <namespace>
Access Shell
If you want to connect via the LB IP & port:
./bin/spark-shell --conf spark.cassandra.connection.host=<Load balancer IP> --conf spark.cassandra.connection.native.port=<Port>
Or create the connection using port-forward:
kubectl port-forward svc/<spark service name> <external(local) port>:<internal port or spark running port>
./bin/spark-shell --conf spark.cassandra.connection.host=localhost --conf spark.cassandra.connection.native.port=<local Port>
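As a concrete sketch with the placeholders filled in (the namespace my-namespace comes from the question; the service name spark-master-svc and port 7077 are assumptions, so check kubectl get svc for the real values):
kubectl get svc -n my-namespace
kubectl port-forward svc/spark-master-svc 7077:7077 -n my-namespace &
./bin/spark-shell --master spark://localhost:7077
# note: the Spark workers must be able to reach your local driver for jobs to actually run,
# so for anything beyond a quick check, running the shell inside a pod (next answer) is simpler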
One way would be to log in to the pod and then run Spark commands.
List the pods:
kubectl --namespace=my-namespace get pods -l "release=spark"
Now, log in to the pod using the following command:
kubectl exec -it <pod-id> -- /bin/bash
Now you should be inside the pod and can run Spark commands:
spark-submit --version
Ref: https://kubernetes.io/docs/tasks/debug-application-cluster/get-shell-running-container/#getting-a-shell-to-a-container
Hope this helps.
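For example, assuming the list above showed a master pod named spark-master-0 (a hypothetical name; use whatever pod name your output shows):
kubectl --namespace=my-namespace exec -it spark-master-0 -- /bin/bash
spark-submit --version
exit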
This worked for me.
spark-shell --master k8s://localhost:32217
My spark master is a LoadBalancer exposed at localhost:32217

IBM Cloud Private monitoring gets 502 bad gateway

The following containers are not starting after installing IBM Cloud Private. I had previously installed ICP without a Management node and was doing a new install after having done an 'uninstall', and I did restart the Docker service on all nodes.
Installed a second time with a Management node defined, Master/Proxy on a single node, and two Worker nodes.
Selecting the menu option Platform / Monitoring gets a 502 Bad Gateway.
Event messages from deployed containers
Deployment - monitoring-prometheus
TYPE SOURCE COUNT REASON MESSAGE
Warning default-scheduler 2113 FailedScheduling
No nodes are available that match all of the following predicates:: MatchNodeSelector (3), NoVolumeNodeConflict (4).
Deployment - monitoring-grafana
TYPE SOURCE COUNT REASON MESSAGE
Warning default-scheduler 2097 FailedScheduling
No nodes are available that match all of the following predicates:: MatchNodeSelector (3), NoVolumeNodeConflict (4).
Deployment - rootkit-annotator
TYPE SOURCE COUNT REASON MESSAGE
Normal kubelet 169.53.226.142 125 Pulled
Container image "ibmcom/rootkit-annotator:20171011" already present on machine
Normal kubelet 169.53.226.142 125 Created
Created container
Normal kubelet 169.53.226.142 125 Started
Started container
Warning kubelet 169.53.226.142 2770 BackOff
Back-off restarting failed container
Warning kubelet 169.53.226.142 2770 FailedSync
Error syncing pod
The management console sometimes displays a 502 Bad Gateway Error after installation or rebooting the master node. If you recently installed IBM Cloud Private, wait a few minutes and reload the page.
If you rebooted the master node, take the following steps:
Configure the kubectl command line interface. See Accessing your IBM Cloud Private cluster by using the kubectl CLI.
Obtain the IP addresses of the icp-ds pods. Run the following command:
kubectl get pods -o wide -n kube-system | grep "icp-ds"
The output resembles the following text:
icp-ds-0 1/1 Running 0 1d 10.1.231.171 10.10.25.134
In this example, 10.1.231.171 is the IP address of the pod.
In high availability (HA) environments, an icp-ds pod exists for each master node.
From the master node, ping the icp-ds pods. Check the IP address for each icp-ds pod by running the following command for each IP address:
ping 10.1.231.171
If the output resembles the following text, you must delete the pod:
connect: Invalid argument
Delete each pod that you cannot reach:
kubectl delete pods icp-ds-0 -n kube-system
In this example, icp-ds-0 is the name of the unresponsive pod.
In HA installations, you might have to delete the pod for each master node.
Obtain the IP address of the replacement pod or pods. Run the following command:
kubectl get pods -o wide -n kube-system | grep "icp-ds"
The output resembles the following text:
icp-ds-0 1/1 Running 0 1d 10.1.231.172 10.10.2
Ping the pods again. Check the IP address for each icp-ds pod by running the following command for each IP address:
ping 10.1.231.172
If you can reach all icp-ds pods, you can access the IBM Cloud Private management console when that pod enters the available state.
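The same check can be scripted; a minimal sketch, assuming the kube-system namespace and the icp-ds naming used above (the read fields follow the column order of kubectl get pods -o wide):
kubectl get pods -o wide -n kube-system | grep "icp-ds" | while read name ready status restarts age ip node rest; do
  ping -c 1 -W 2 "$ip" > /dev/null 2>&1 || kubectl delete pod "$name" -n kube-system
done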

cannot connect to Minikube on MacOS

I installed Minikube as instructed here: https://github.com/kubernetes/minikube/releases
and started it with a simple minikube start command.
But the next step, which is as simple as kubectl get pods --all-namespaces fails with
Unable to connect to the server: dial tcp 192.168.99.100:8443: i/o timeout
What did I miss?
I ran into the same issue on my Mac; basically I uninstalled both Minikube and kubectl and reinstalled them as follows:
Installed Minikube.
curl -Lo minikube https://storage.googleapis.com/minikube/releases/v0.8.0/minikube-darwin-amd64 && chmod +x minikube && sudo mv minikube /usr/local/bin/
Installed Kubectl.
curl -Lo kubectl http://storage.googleapis.com/kubernetes-release/release/v1.3.0/bin/darwin/amd64/kubectl && chmod +x kubectl && sudo mv kubectl /usr/local/bin/
Start a cluster, run the command:
minikube start
Minikube will also create a “minikube” context, and set it to default in kubectl. To switch back to this context later, run this command:
kubectl config use-context minikube
Now to get the list of all pods run the command:
kubectl get pods --all-namespaces
Now you should be able to get the list of pods. Also make sure that you don't have a firewall within your network that blocks the connections.
I faced a similar issue on Windows 7 when I changed work environments. As you said, it works fine at home but not at the office; there is a high chance it is caused by a firewall policy, which means TLS verification cannot pass.
Instead of wasting time on troubleshooting (sometimes there is nothing you can do if you cannot turn off the firewall), if you just want to test a local Minikube cluster, I would suggest disabling TLS verification.
This is what I have done:
# How to disable minikube TLS verification
## disable TLS verification
$ VBoxManage controlvm minikube natpf1 k8s-apiserver,tcp,127.0.0.1,8443,,8443
$ VBoxManage controlvm minikube natpf1 k8s-dashboard,tcp,127.0.0.1,30000,,30000
$ kubectl config set-cluster minikube-vpn --server=https://127.0.0.1:8443 --insecure-skip-tls-verify
$ kubectl config set-context minikube-vpn --cluster=minikube-vpn --user=minikube
$ kubectl config use-context minikube-vpn
## test kubectl
$ kubectl get pods
## enable local docker client
$ VBoxManage controlvm minikube natpf1 k8s-docker,tcp,127.0.0.1,2374,,2376
$ eval $(minikube docker-env)
$ unset DOCKER_TLS_VERIFY
$ export DOCKER_HOST="tcp://127.0.0.1:2374"
$ alias docker='docker --tls'
## test local docker client
$ docker ps
## test minikube dashboard
$ curl http://127.0.0.1:30000
I also made a small script for this, for your reference.
Hope it is helpful.
You just need to restart Minikube. Sometimes I have this problem when my computer has been off for a while. I don't think you need to reinstall anything.
First verify you are in the correct context
$ kubectl config current-context
minikube
Check Minikube status (status should show "Running", mine below showed "Saved")
$ minikube status
minikube: Saved
cluster:
kubectl:
Restart minikube
$ minikube start
Starting local Kubernetes v1.8.0 cluster...
Starting VM...
Getting VM IP address...
Moving files into cluster...
Setting up certs...
Connecting to cluster...
Setting up kubeconfig...
Starting cluster components...
Kubectl is now configured to use the cluster.
Verify it is running (This is what you should see)
$ minikube status
minikube: Running
cluster: Running
kubectl: Correctly Configured: pointing to minikube-vm at 192.168.99.100
I had this issue when connected to Cisco AnyConnect VPN. Once I disconnected, minikube ran fine. Discussion on github here: https://github.com/kubernetes/minikube/issues/4540

Kubernetes Installation with Vagrant & CoreOS and insecure Docker registry

I have followed the steps at https://coreos.com/kubernetes/docs/latest/kubernetes-on-vagrant.html to launch a multi-node Kubernetes cluster using Vagrant and CoreOS.
But, I could not find a way to set an insecure Docker registry for that environment.
To be more specific, when I run
kubectl run api4docker --image=myhost:5000/api4docker:latest --replicas=2 --port=8080
on this setup, it tries to pull the image as if from a secure registry, but it is an insecure one.
I appreciate any suggestions.
This is how I solved the issue for now. I will add more later if I can automate it in the Vagrantfile (a rough sketch of such a provisioning script is at the end of this answer).
cd ./coreos-kubernetes/multi-node/vagrant
vagrant ssh w1 (and repeat these steps for w2, w3, etc.)
cd /etc/systemd/system/docker.service.d
sudo vi 50-insecure-registry.conf
Add the lines below to this file:
[Service]
Environment=DOCKER_OPTS='--insecure-registry="<your-registry-host>/24"'
After adding this file, we need to restart the Docker service on this worker:
sudo systemctl stop docker
sudo systemctl daemon-reload
sudo systemctl start docker
sudo systemctl status docker
now, docker pull should work on this worker.
docker pull <your-registry-host>:5000/api4docker
Let's try to deploy our application on the Kubernetes cluster one more time.
Log out from the workers and come back to your host.
$ kubectl run api4docker --image=<your-registry-host>:5000/api4docker:latest --replicas=2 --port=8080 --env="SPRING_PROFILES_ACTIVE=production"
When you get the pods, you should see the status Running:
$ kubectl get pods
NAME READY STATUS RESTARTS AGE
api4docker-2839975483-9muv5 1/1 Running 0 8s
api4docker-2839975483-lbiny 1/1 Running 0 8s
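As noted at the top of this answer, here is a rough sketch of automating the per-worker steps as a shell script that a Vagrantfile shell provisioner could run (untested; the registry host stays a placeholder):
# runs as root when invoked as a Vagrant shell provisioner, so no sudo is needed
mkdir -p /etc/systemd/system/docker.service.d
cat > /etc/systemd/system/docker.service.d/50-insecure-registry.conf <<'EOF'
[Service]
Environment=DOCKER_OPTS='--insecure-registry="<your-registry-host>/24"'
EOF
systemctl daemon-reload
systemctl restart docker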
