After successful install of CAM, the "cam-mongo" pod went down - ibm-cloud-private

After a successful deployment of CAM (it was up and running for a couple of days), the "cam-mongo" microservice suddenly went down. Checking the pod with the two commands below shows "Error syncing pod":
1) kubectl describe pods -n services
Warning BackOff 3s (x3 over 18s) kubelet, 9.109.191.126 Back-off restarting failed container
Warning FailedSync 3s (x3 over 18s) kubelet, 9.109.191.126 Error syncing pod
This information alone doesn't tell you what went wrong or how to fix it.
2) kubectl -n services logs cam-mongo-5c89fcccbd-r2hv4 -p (the -p option grabs the logs from the previously running container)
The above command shows the following:
exception in initAndListen: 98 Unable to lock file: /data/db/mongod.lock Resource temporarily unavailable. Is a mongod instance already running?, terminating

Conclusion:
While starting the container inside the "cam-mongo" pod, mongod was unable to take the existing /data/db/mongod.lock file, so the pod never comes up and you cannot access CAM.

After further analysis I resolved the issue as follows:
1) Spin up a container and mount the cam-mongo volume inside it.
To do this I used the pod creation YAML below, which mounts the PV where /data/db/ lives.
kind: Pod
apiVersion: v1
metadata:
  name: mongo-troubleshoot-pod
spec:
  volumes:
    - name: cam-mongo-pv
      persistentVolumeClaim:
        claimName: cam-mongo-pv
  containers:
    - name: mongo-troubleshoot
      image: nginx
      ports:
        - containerPort: 80
          name: "http-server"
      volumeMounts:
        - mountPath: "/data/db"
          name: cam-mongo-pv
Run: kubectl -n services create -f ./mongo-troubleshoot-pod.yaml
2) Use "docker exec -it /bin/bash " (look for it from "kubectl -n services describe po/mongo-troubleshoot-pod-xxxxx" info)
cd /data/db
rm mongod.lock
rm WiredTiger.lock
3) Delete the troubleshooting pod you created.
4) Delete the corrupted cam-mongo pod so its deployment recreates it:
kubectl -n services delete pod cam-mongo-5c89fcccbd-r2hv4
It fixed the issue.
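As an aside, the same cleanup can be done without docker exec, since kubectl exec resolves the pod to its container for you. A minimal sketch, assuming the pod name from the manifest above:
kubectl -n services exec -it mongo-troubleshoot-pod -- /bin/bash -c "rm -f /data/db/mongod.lock /data/db/WiredTiger.lock"
This skips the hunt for the container ID in step 2.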

Related

Minikube Accessing Images Added With Registry Addon

I've followed the instructions outlined on this page and pushed a local image to a local 3-node Minikube cluster with the registry add-on enabled and the cluster started with the insecure-registry flag, but I get the following error when I try to create a Pod with the image:
Normal Pulling 9m18s (x4 over 10m) kubelet Pulling image "192.168.99.100:5000/myapp:v1"
Warning Failed 9m18s (x4 over 10m) kubelet Failed to pull image "192.168.99.100:5000/myapp:v1": rpc error: code = Unknown desc = Error response from daemon: Get "https://192.168.99.100:5000/v2/": http: server gave HTTP response to HTTPS client
Any advice on resolving this would be greatly appreciated.
My Minikube (v1.23.2) is on macOS (Big Sur 11.6) using the VirtualBox driver, in a three-node cluster. My Docker Desktop version is 20.10.8.
These are the steps I followed:
Get my cluster’s VMs’ IP range - 192.168.99.0/24
Added the following entry to my Docker Desktop config:
"insecure-registries": [
  "192.168.99.0/24"
]
Started Minikube with insecure registries flag:
$ minikube start --insecure-registry="192.168.99.0/24"
Enabled the registry addon:
$ minikube addons enable registry
Tagged the image I want to push:
$ docker tag docker.io/library/myapp:v1 $(minikube ip):5000/myapp:v1
Pushed the image:
$ docker push $(minikube ip):5000/myapp:v1
The push works ok - when I exec onto the registry Pod I can see the image in the filesystem. However, when I try to create a Pod using the image, I get the error mentioned above.
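One quick way to confirm the registry itself is reachable over plain HTTP from the host is to hit the Registry v2 API's catalog endpoint (a sketch; substitute your own cluster IP):
$ curl http://$(minikube ip):5000/v2/_catalog
This should list the repositories you pushed. If that works but pulls still fail, the problem is on the nodes' container runtime side rather than the registry.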
My Pod manifest is:
apiVersion: v1
kind: Pod
metadata:
  creationTimestamp: null
  labels:
    run: myapp
  name: myapp
spec:
  containers:
  - image: 192.168.99.100:5000/myapp:v1
    name: myapp
    imagePullPolicy: IfNotPresent
    resources: {}
  dnsPolicy: ClusterFirst
  restartPolicy: Always
The issue was resolved by deleting the cluster and recreating it using the insecure-registry flag from the start - originally I had created the cluster, stopped it, and then started it again with the insecure-registry flag. For some reason this didn't work, but starting it for the first time with the flag did.
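If you hit the same situation, the recreate sequence is roughly this (a sketch for minikube v1.23 with the VirtualBox driver; adjust the subnet and node count to your setup):
$ minikube delete
$ minikube start --driver=virtualbox --nodes=3 --insecure-registry="192.168.99.0/24"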
If you're going to be creating clusters with the registry addon a lot, it might be worth adding the flag permanently to your config. Replace the IP with your cluster's subnet:
$ minikube config set insecure-registry "192.168.99.0/24"

How can I diagnose why a k8s pod keeps restarting?

I deployed Elasticsearch to minikube with the config file below:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: elasticsearch
spec:
  replicas: 1
  selector:
    matchLabels:
      name: elasticsearch
  template:
    metadata:
      labels:
        name: elasticsearch
    spec:
      containers:
      - name: elasticsearch
        image: elasticsearch:7.10.1
        ports:
        - containerPort: 9200
        - containerPort: 9300
I run the command kubectl apply -f es.yml to deploy the elasticsearch cluster.
$ kubectl get pod
NAME READY STATUS RESTARTS AGE
elasticsearch-fb9b44948-bchh2 1/1 Running 5 6m23s
The elasticsearch pod keeps restarting every few minutes. When I run the kubectl describe pod command, I can see these events:
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 7m11s default-scheduler Successfully assigned default/elasticsearch-fb9b44948-bchh2 to minikube
Normal Pulled 3m18s (x5 over 7m11s) kubelet Container image "elasticsearch:7.10.1" already present on machine
Normal Created 3m18s (x5 over 7m11s) kubelet Created container elasticsearch
Normal Started 3m18s (x5 over 7m10s) kubelet Started container elasticsearch
Warning BackOff 103s (x11 over 5m56s) kubelet Back-off restarting failed container
The last event is Back-off restarting failed container, but I don't know why it restarts the pod. Is there any way to check why it keeps restarting?
The first step (kubectl describe pod) you've already done. As a next step I suggest checking the container logs: kubectl logs <pod_name>. 99% of the time you'll get the reason from the logs in this case (I bet on a bootstrap check failure).
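For this pod that would be, for example:
$ kubectl logs elasticsearch-fb9b44948-bchh2
and, since the container has already been restarted, --previous shows the logs of the crashed instance:
$ kubectl logs elasticsearch-fb9b44948-bchh2 --previous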
When neither describe pod nor the logs say anything about the error, I get into the container with exec: kubectl exec -it <pod_name> -c <container_name> -- sh. With this you get a shell inside the container (provided there IS a shell binary in it) and can investigate the problem manually. Note that to keep a failing container alive you may need to change command and args to something like this:
command:
- /bin/sh
- -c
args:
- cat /dev/stdout
Be sure to disable probes when doing this. A container may restart if its liveness probe fails; you will see that in kubectl describe pod if it happens. Since your snippet doesn't have any probes specified, you can skip this.
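A common alternative to the cat /dev/stdout trick above is simply holding the container open with sleep (a sketch with the same effect):
command: ["/bin/sh", "-c", "sleep 3600"]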
Checking the pod's logs with kubectl logs <podname> gives a clue about what went wrong:
ERROR: [2] bootstrap checks failed
[1]: max virtual memory areas vm.max_map_count [65530] is too low, increase to at least [262144]
[2]: the default discovery settings are unsuitable for production use; at least one of [discovery.seed_hosts, discovery.seed_providers, cluster.initial_master_nodes] must be configured
ERROR: Elasticsearch did not exit normally - check the logs at /usr/share/elasticsearch/logs/docker-cluster.log
Check this post for a solution
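For reference, the usual fixes for those two bootstrap checks on a single-node minikube are raising vm.max_map_count on the node and switching Elasticsearch to single-node discovery. A sketch, not tested against this exact cluster:
$ minikube ssh
$ sudo sysctl -w vm.max_map_count=262144
and in the container spec of the Deployment:
env:
- name: discovery.type
  value: single-node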

Kubernetes + Minikube - How to see all stdout output?

I'm running a Ruby app on Kubernetes with Minikube.
However, whenever I look at the logs I don't see the output I would have seen in my terminal when running the app locally.
I presume it's because it only shows stderr?
What can I do to see all types of console logs (e.g. from puts or raise)?
From looking around, is this something to do with it running in detached mode? See the related Python issue: Logs in Kubernetes Pod not showing up
Thanks.
As requested - here is the deployment.yaml
apiVersion: apps/v1beta1
kind: Deployment
metadata:
  name: sample
spec:
  replicas: 1
  template:
    metadata:
      labels:
        app: sample
    spec:
      containers:
      - name: sample
        image: someregistry
        imagePullPolicy: Always
        command: ["/bin/sh","-c"]
        args: ["bundle exec rake sample:default --trace"]
        envFrom:
        - configMapRef:
            name: sample
        - secretRef:
            name: sample
        ports:
        - containerPort: 3000
      imagePullSecrets:
      - name: regsecret
As shown in this article, kubectl logs <pod-name> should show you stdout and stderr for a pod deployed in minikube.
By default in Kubernetes, Docker is configured to write a container's stdout and stderr to a file under /var/log/containers on the host system
Kubernetes adds:
There are two types of system components: those that run in a container and those that do not run in a container.
For example:
The Kubernetes scheduler and kube-proxy run in a container.
The kubelet and container runtime, for example Docker, do not run in containers.
And:
On machines with systemd, the kubelet and container runtime write to journald.
If systemd is not present, they write to .log files in the /var/log directory.
Similarly to the container logs, system component logs in the /var/log directory should be rotated.
In Kubernetes clusters brought up by the kube-up.sh script, those logs are configured to be rotated by the logrotate tool daily or once the size exceeds 100MB.
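Since the container logs live as plain files on the host, you can also look at them directly on a minikube VM if kubectl logs ever seems incomplete (a sketch, assuming the Docker runtime):
$ minikube ssh
$ ls -l /var/log/containers/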
I presume it's because it only shows stderr?
Not really, unless something specific is disabled in your container or pod spec. I assume you are using Docker, and the default there is to capture both stdout and stderr; that's what you see when you do kubectl logs <pod-name>.
What can I do to see all types of console logs (e.g. from puts or raise)?
You should see them in the container logs. It would help to post your pod or deployment definition.
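For example, to stream stdout and stderr together as they are written:
$ kubectl logs -f <pod-name>
One Ruby-specific gotcha worth checking: when stdout is not a TTY (as in a container), Ruby block-buffers it, so puts output can lag until the buffer flushes. Setting $stdout.sync = true early in the app forces line-by-line output.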

Not able to pull image from local docker registry when tried to create a POD using yaml file

I'm not able to pull an image from my local docker registry when I try to create a POD using yaml file. Here are the diagnostic steps taken:
kubectl describe pod <pod name>
error msg: Error while pulling image: Get http://localhost:5001/v1/repositories/hello_world_application/images: dial tcp [::1]:5001: getsockopt: connection refused
docker images
REPOSITORY               TAG      IMAGE ID       CREATED        SIZE
img_app                  latest   259dc6d7b6fe   18 hours ago   689.3 MB
localhost:5001/img_app   latest   259dc6d7b6fe   18 hours ago   689.3 MB
docker.io/python         2.7      ca388cdb5ac1   8 days ago     676.1 MB
netstat -anp | grep 5001
tcp6 0 0 :::5001 :::* LISTEN 31971/registry
Here's my Pod configuration (my-pod.yaml):
apiVersion: v1
kind: Pod
metadata:
  name: my-pod-1
  labels:
    name: my-pod-label
spec:
  containers:
  - name: my-application
    image: localhost:5001/img_app:latest
    imagePullPolicy: IfNotPresent
    ports:
    - containerPort: 5000
      name: my-pod-label
Reading through your debugging output, it seems that you have the images lying on your local docker host without a registry the cluster node can actually reach. Therefore using imagePullPolicy: Never in your POD config would allow you to use the local images from the docker daemon without the detour through an additional registry, as sketched below.
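Concretely, that would mean changing the container spec to something like this (a sketch; with Never the kubelet only ever uses images already present in the node's local image cache):
containers:
- name: my-application
  image: localhost:5001/img_app:latest
  imagePullPolicy: Never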

Kubernetes Ingress Controller on Vagrant

Is there anything special about running ingress controllers on Kubernetes CoreOS Vagrant Multi-Machine? I followed the example but when I run kubectl -f I do not get an address.
Example:
http://kubernetes.io/v1.1/docs/user-guide/ingress.html#single-service-ingress
Setup:
https://coreos.com/kubernetes/docs/latest/kubernetes-on-vagrant.html
I looked at networking in kubernetes. Everything looks like it should run without further configuration.
My goal is to create a local testing environment before I build out a production platform. I'm thinking there's something about how they set up their VirtualBox networking. I'm about to dive into the CoreOS cloud-config but thought I would ask first.
UPDATE
Yes I'm running an ingress controller.
https://github.com/kubernetes/contrib/blob/master/Ingress/controllers/nginx-alpha/rc.yaml
It runs without giving an error. It's just that when I run kubectl -f I do not get an address. I'm thinking it's one of two things:
I have to do something extra in networking for CoreOS-Kubernetes vagrant multi-node.
It's running right, but I'm pointing my localhost to the wrong IP. I'm using a 172.17.4.x IP; I also have 10.0.0.x. I can access services through 172.17.4.x using a NodePort, but I can't get to my Ingress.
Here is the code:
apiVersion: v1
kind: ReplicationController
metadata:
  name: nginx-ingress
  labels:
    app: nginx-ingress
spec:
  replicas: 1
  selector:
    app: nginx-ingress
  template:
    metadata:
      labels:
        app: nginx-ingress
    spec:
      containers:
      - image: gcr.io/google_containers/nginx-ingress:0.1
        imagePullPolicy: Always
        name: nginx
        ports:
        - containerPort: 80
          hostPort: 80
Update 2
Output of commands:
kubectl get pods
NAME READY STATUS RESTARTS AGE
echoheaders-kkja7 1/1 Running 0 24m
nginx-ingress-2wwnk 1/1 Running 0 25m
kubectl logs nginx-ingress-2wwnk --previous
Pod "nginx-ingress-2wwnk" in namespace "default": previous terminated container "nginx" not found
kubectl exec nginx-ingress-2wwnk -- cat /etc/nginx/nginx.conf
events {
worker_connections 1024;
}
http {
}
I'm running an echoheaders service on NodePort. When I type the node IP and port on my browser, I get that just fine.
I restarted all nodes in virtualbox too.
With a lot of help from the Kubernetes IRC and Slack, I fixed this a while back. If I remember correctly, I had the ingress controller listening on a port that was already in use, I think by Vagrant. These commands really help:
kubectl get pod <nginx-ingress pod> -o json
kubectl exec <nginx-ingress pod> -- cat /etc/nginx/nginx.conf
kubectl get pods -o wide
kubectl logs <nginx-ingress pod> --previous
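If you suspect the same kind of hostPort clash, checking what is already bound on the node narrows it down quickly (a sketch; run on the CoreOS node itself):
$ sudo netstat -tlnp | grep ':80 '
Anything already holding port 80 there will stop the hostPort: 80 mapping in the controller spec from working.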
