I am using Kubernetes (v1.23.13) with containerd and the Flannel CNI. The Kubernetes cluster was created on an Ubuntu 18 VM (VMware ESXi), and Windows Server runs on another VM. I followed the link below to add the Windows node (Windows Server 2019) to the cluster. The Windows node joined the cluster, but the Windows kube-proxy DaemonSet pod deployment has failed.
Link https://web.archive.org/web/20220530090758/https://kubernetes.io/docs/tasks/administer-cluster/kubeadm/adding-windows-nodes/
Error: Normal Created (x5 over ) kubelet Created container kube-proxy
Normal Pulled (x5 over ) kubelet Container image "sigwindowstools/kube-proxy:v1.23.13-nanoserver" already present on machine
Warning Failed kubelet Error: failed to create containerd task: hcsshim::CreateComputeSystem kube-proxy: The directory name is invalid.
(extra info: {"Owner":"containerd-shim-runhcs-v1.exe","SchemaVersion":{"Major":2,"Minor":1},"Container":{"GuestOs":{"HostName":"kube-proxy-windows-hq7bb"},"Storage":{"Layers":[{"Id":"e30f10e1-6696-5df6-af3f-156a372bce4e","Path":"C:\ProgramData\containerd\root\io.containerd.snapshotter.v1.windows\snapshots\19"},{"Id":"8aa59a8b-78d3-5efe-a3d9-660bd52fd6ce","Path":"C:\ProgramData\containerd\root\io.containerd.snapshotter.v1.windows\snapshots\18"},{"Id":"f222f973-9869-5b65-a546-cb8ae78a32b9","Path":"C:\ProgramData\containerd\root\io.containerd.snapshotter.v1.windows\snapshots\17"},{"Id":"133385ae-6df6-509b-b342-bc46338b3df4","Path":"C:\ProgramData\containerd\root\io.containerd.snapshotter.v1.windows\snapshots\16"},{"Id":"f6f9524c-e3f0-5be2-978d-7e09e0b21299","Path":"C:\ProgramData\containerd\root\io.containerd.snapshotter.v1.windows\snapshots\15"},{"Id":"0d9d58e6-47b6-5091-a552-7cc2027ca06f","Path":"C:\ProgramData\containerd\root\io.containerd.snapshotter.v1.windows\snapshots\14"},{"Id":"6715ca06-295b-5fba-9224-795ca5af71b9","Path":"C:\ProgramData\containerd\root\io.containerd.snapshotter.v1.windows\snapshots\13"},{"Id":"75e64a3b-69a5-52cf-b39f-ee05718eb1e2","Path":"C:\ProgramData\containerd\root\io.containerd.snapshotter.v1.windows\snapshots\12"},{"Id":"8698c4b4-b092-57c6-b1eb-0a7ca14fcf4e","Path":"C:\ProgramData\containerd\root\io.containerd.snapshotter.v1.windows\snapshots\11"},{"Id":"7c9a6fb7-2ca8-5ef7-bbfe-cabbff23cfa4","Path":"C:\ProgramData\containerd\root\io.containerd.snapshotter.v1.windows\snapshots\10"},{"Id":"a10d4ad8-f2b1-5fd6-993f-7aa642762865","Path":"C:\ProgramData\containerd\root\io.containerd.snapshotter.v1.windows\snapshots\9"}],"Path":"\\?\Volume{64336318-a64f-436e-869c-55f9f8e4ea62}\"},"MappedDirectories":[{"HostPath":"c:\","ContainerPath":"c:\host"},{"HostPath":"c:\var\lib\kubelet\pods\1cd0c333-3cd0-4c90-9d22-884ea73e8b69\containers\kube-proxy\0e58a001","ContainerPath":"c:\dev\termination-log"},{"HostPath":"c:\var\lib\kubelet\pods\1cd0c333-3cd0-4c90-9d22-884ea73e8b69\volumes\kubernetes.io~configmap\kube-proxy","ContainerPath":"c:\var\lib\kube-proxy","ReadOnly":true},{"HostPath":"c:\var\lib\kubelet\pods\1cd0c333-3cd0-4c90-9d22-884ea73e8b69\volumes\kubernetes.io~configmap\kube-proxy-windows","ContainerPath":"c:\var\lib\kube-proxy-windows","ReadOnly":true},{"HostPath":"c:\var\lib\kubelet\pods\1cd0c333-3cd0-4c90-9d22-884ea73e8b69\volumes\kubernetes.io~projected\kube-api-access-4zs46","ContainerPath":"c:\var\run\secrets\kubernetes.io\serviceaccount","ReadOnly":true},{"HostPath":"c:\var\lib\kubelet\pods\1cd0c333-3cd0-4c90-9d22-884ea73e8b69\etc-hosts","ContainerPath":"C:\Windows\System32\drivers\etc\hosts"}],"MappedPipes":[{"ContainerPipeName":"rancher_wins","HostPath":"\\.\pipe\rancher_wins"}],"Networking":{"Namespace":"4a4d0354-251a-4750-8251-51ae42707db2"}},"ShouldTerminateOnLastHandleClosed":true}): unknown
Warning BackOff (x23 over ) kubelet Back-off restarting failed container
kubectl get pods -A
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system coredns-64897985d-2mkd5 1/1 Running 0 19h
kube-system coredns-64897985d-qhhbz 1/1 Running 0 19h
kube-system etcd-scspa2658542001 1/1 Running 2 19h
kube-system kube-apiserver-scspa2658542001 1/1 Running 8 (3h4m ago) 19h
kube-system kube-controller-manager-scspa2658542001 1/1 Running 54 (126m ago) 19h
kube-system kube-flannel-ds-hjw8s 1/1 Running 14 (18h ago) 19h
kube-system kube-flannel-ds-windows-amd64-xfhjl 0/1 ImagePullBackOff 0 29m
kube-system kube-proxy-windows-hq7bb 0/1 CrashLoopBackOff 10 (<invalid> ago) 29m
kube-system kube-proxy-wx2x9 1/1 Running 0 19h
kube-system kube-scheduler-scspa2658542001 1/1 Running 92 (153m ago) 19h
From this issue, it seems Windows nodes with Flannel have known issues that have been solved with different workarounds.
As mentioned in the issue, they have written a guide to get Windows nodes working properly. Follow this doc for the installation guide and requirements.
Attaching a troubleshooting blog and an issue for the CrashLoopBackOff.
I had a similar error, failed to create containerd task: hcsshim::CreateComputeSystem, with Flannel on k8s v1.24. The cause was that Windows OS patches had not been applied. You must have the patch related to KB4489899 applied.
https://github.com/kubernetes-sigs/sig-windows-tools/blob/master/guides/guide-for-adding-windows-node.md#before-you-begin
I am new to Kubernetes; my question is related to Google Cloud Platform.
Given a scenario where we need to restart a Kubernetes cluster, and we have some Spring Boot services: each Spring Boot service is its own JVM and runs as an independent process. Once Kubernetes is restarted, I need help understanding what kind of script or mechanism to use to restart all the Spring Boot services in Kubernetes. Thank you; I appreciate all your inputs.
I am not sure if I fully understood your question, but I think the best approach for you would be to pack your Spring Boot app into a Docker container and then use it on GKE.
A good guide to packing your Spring Boot application into a container can be found in the Codelabs tutorial.
Once you have your application in a container, you can reference it in a Deployment or StatefulSet configuration file and deploy it in your cluster.
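As a minimal sketch, assuming you have already pushed your image to a registry (the image name gcr.io/your-project/spring-app:latest, the Deployment name, and port 8080 are only placeholders), you could create and expose such a Deployment imperatively:
# Hypothetical image and port -- adjust to your own registry and application
$ kubectl create deployment spring-app --image=gcr.io/your-project/spring-app:latest --replicas=3
$ kubectl expose deployment spring-app --port=80 --target-port=8080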
As mentioned in the Deployment documentation:
A Deployment provides declarative updates for Pods and ReplicaSets.
You describe a desired state in a Deployment, and the Deployment Controller changes the actual state to the desired state at a controlled rate. You can define Deployments to create new ReplicaSets, or to remove existing Deployments and adopt all their resources with new Deployments.
In short, the Deployment controller ensures that your application is kept in your desired state.
For example, if you would like to restart your application, you could simply scale the Deployment down to 0 replicas and then back up to 5 replicas, as sketched below.
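A quick sketch, using a hypothetical Deployment name my-app:
$ kubectl scale deployment my-app --replicas=0
$ kubectl scale deployment my-app --replicas=5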
Also, as GKE runs on Google Compute Engine VMs, you can scale the number of cluster nodes as well.
Examples
Restarting Application
For my test I've used an Nginx container in a Deployment, but it should work similarly with your Spring Boot app container.
Let's say you have a 2-node cluster with an application running 5 replicas.
$ kubectl create deployment nginx --image=nginx --replicas=5
deployment.apps/nginx created
$ kubectl get pods -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
nginx-86c57db685-2x8tj 1/1 Running 0 2m45s 10.4.1.5 gke-cluster-1-default-pool-faec7b51-6kc3 <none> <none>
nginx-86c57db685-6lpfg 1/1 Running 0 2m45s 10.4.1.6 gke-cluster-1-default-pool-faec7b51-6kc3 <none> <none>
nginx-86c57db685-8lvqq 1/1 Running 0 2m45s 10.4.0.9 gke-cluster-1-default-pool-faec7b51-x07n <none> <none>
nginx-86c57db685-lq6l7 1/1 Running 0 2m45s 10.4.0.11 gke-cluster-1-default-pool-faec7b51-x07n <none> <none>
nginx-86c57db685-xn7fn 1/1 Running 0 2m45s 10.4.0.10 gke-cluster-1-default-pool-faec7b51-x07n <none> <none>
Now suppose you need to change some environment variables inside your application using a ConfigMap. To apply this change you can just use a rollout restart. It restarts your application and provides the updated data from the ConfigMap.
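As a small sketch, a hypothetical ConfigMap holding such a variable could be created first (the name app-config and key LOG_LEVEL are placeholders; your Deployment would also need to reference it via env or envFrom):
$ kubectl create configmap app-config --from-literal=LOG_LEVEL=debug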
$ kubectl rollout restart deployment nginx
deployment.apps/nginx restarted
$ kubectl get po -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
nginx-6c98778485-2k98b 1/1 Running 0 6s 10.4.0.13 gke-cluster-1-default-pool-faec7b51-x07n <none> <none>
nginx-6c98778485-96qx7 1/1 Running 0 6s 10.4.1.7 gke-cluster-1-default-pool-faec7b51-6kc3 <none> <none>
nginx-6c98778485-qb89l 1/1 Running 0 6s 10.4.0.12 gke-cluster-1-default-pool-faec7b51-x07n <none> <none>
nginx-6c98778485-qqs97 1/1 Running 0 4s 10.4.1.8 gke-cluster-1-default-pool-faec7b51-6kc3 <none> <none>
nginx-6c98778485-skbwv 1/1 Running 0 4s 10.4.0.14 gke-cluster-1-default-pool-faec7b51-x07n <none> <none>
nginx-86c57db685-2x8tj 0/1 Terminating 0 4m38s 10.4.1.5 gke-cluster-1-default-pool-faec7b51-6kc3 <none> <none>
nginx-86c57db685-6lpfg 0/1 Terminating 0 4m38s <none> gke-cluster-1-default-pool-faec7b51-6kc3 <none> <none>
nginx-86c57db685-8lvqq 0/1 Terminating 0 4m38s 10.4.0.9 gke-cluster-1-default-pool-faec7b51-x07n <none> <none>
nginx-86c57db685-xn7fn 0/1 Terminating 0 4m38s 10.4.0.10 gke-cluster-1-default-pool-faec7b51-x07n <none> <none>
Draining a node to perform node operations
Another example is when you need to do something with your VMs. You can do this by draining the node.
You can use kubectl drain to safely evict all of your pods from a node before you perform maintenance on the node (e.g. kernel upgrade, hardware maintenance, etc.). Safe evictions allow the pod's containers to gracefully terminate and will respect the PodDisruptionBudgets you have specified.
So it will reschedule all pods from this node onto other nodes, for example:
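A minimal sketch of draining and then uncordoning one of the nodes from the pod listing above (exact flags may vary slightly between kubectl versions):
$ kubectl drain gke-cluster-1-default-pool-faec7b51-6kc3 --ignore-daemonsets
# ... perform the node maintenance ...
$ kubectl uncordon gke-cluster-1-default-pool-faec7b51-6kc3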
Restarting Cluster
Keep in mind that GKE is managed by Google, and you cannot restart a single machine on its own, as it is managed by a Managed Instance Group.
You can SSH to each node and change some settings, but when you scale the node pool to 0 and back up, you will get new machines matching your requirements, with new external IPs.
$ kubectl get nodes -o wide
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
gke-cluster-1-default-pool-faec7b51-6kc3 Ready <none> 3d1h v1.17.14-gke.1600 10.128.0.25 34.XX.176.56 Container-Optimized OS from Google 4.19.150+ docker://19.3.6
gke-cluster-1-default-pool-faec7b51-x07n Ready <none> 3d1h v1.17.14-gke.1600 10.128.0.24 23.XXX.50.249 Container-Optimized OS from Google 4.19.150+ docker://19.3.6
$ gcloud container clusters resize cluster-1 --node-pool default-pool \
> --num-nodes 0 \
> --zone us-central1-c
Pool [default-pool] for [cluster-1] will be resized to 0.
$ kubectl get nodes -o wide
No resources found
$ gcloud container clusters resize cluster-1 --node-pool default-pool --num-nodes 2 --zone us-central1-c
Pool [default-pool] for [cluster-1] will be resized to 2.
Do you want to continue (Y/n)? y
$ kubectl get nodes -o wide
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
gke-cluster-1-default-pool-faec7b51-n5hm Ready <none> 68s v1.17.14-gke.1600 10.128.0.26 23.XXX.50.249 Container-Optimized OS from Google 4.19.150+ docker://19.3.6
gke-cluster-1-default-pool-faec7b51-xx01 Ready <none> 74s v1.17.14-gke.1600 10.128.0.27 35.XXX.135.41 Container-Optimized OS from Google 4.19.150+ docker://19.3.6
Conclusion
When you are using GKE you are using pre-defined nodes managed by Google, and those nodes are automatically upgraded (some security features, etc.). Because of that, changing node capacity is easy.
When you pack your application into a container and use it in a Deployment, your application will be handled by the Deployment controller, which will try to keep the desired state at all times.
As mentioned in the Service documentation:
In Kubernetes, a Service is an abstraction which defines a logical set of Pods and a policy by which to access them
The Service will still be visible in your cluster even if you scale your cluster down to 0 nodes, as it is an abstraction. You don't have to restart it. However, if you change some static Service configuration (like the port), you would need to recreate the Service with the new configuration, for example as sketched below.
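A minimal sketch, assuming a Service named nginx already exists for the Deployment above (the port numbers are only placeholders):
$ kubectl delete service nginx
$ kubectl expose deployment nginx --port=8080 --target-port=80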
Useful links
Migrating workloads to different machine types
Auto-repairing nodes
I used docker commit to save my work on a Jupyter notebook in Docker, but my computer crashed. When I try to run the Docker container, I can't open the notebook as it was at the time of the latest commit.
The following bash commands yield:
docker ps -a
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
7f1b4d6a811f iess:latest "/bin/su -l pycbc -c…" 5 minutes ago Exited (1) 5 minutes ago iess
0dcd955ad0b6 4028090df24a "/bin/su -l pycbc -c…" 14 minutes ago Exited (1) 11 minutes ago vibrant_minsky
d7d76573d511 4028090df24a "/bin/su -l pycbc -c…" 2 days ago Up 32 minutes 8888/tcp relaxed_cartwright
docker images
REPOSITORY TAG IMAGE ID CREATED SIZE
iess latest 6e07932643eb 9 hours ago 4.6GB
<none> <none> dd24d1257a5c 10 hours ago 4.6GB
<none> <none> 0bdaeb277ab9 10 hours ago 4.6GB
<none> <none> 911e848f8167 12 hours ago 4.6GB
<none> <none> de16c7fce855 20 hours ago 4.6GB
<none> <none> 147ed70ecf70 21 hours ago 4.6GB
<none> <none> 792f3f87b8ee 21 hours ago 4.6GB
<none> <none> 79cbcc4abc27 21 hours ago 4.6GB
<none> <none> 9abe343a42b1 21 hours ago 4.6GB
<none> <none> aea2324b9902 44 hours ago 4.6GB
<none> <none> 760e78217518 2 days ago 4.6GB
I am not a Docker expert (very much a newbie), but I want to start the last container on the list (d7d76573d511) with the image at the top of the list (iess:latest, created 9 hours ago).
If you want to use the new image iess:latest (6e07932643eb), you need to stop the container d7d76573d511; they are both trying to use the same port, which is causing the new one to crash. To stop the container, use docker stop <CONTAINER ID>. For example:
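A minimal sketch based on the docker ps -a output above (the container named iess is the one created from iess:latest):
docker stop d7d76573d511
docker start iess
docker logs iess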
I am new to Minikube. We deployed Minikube version v1.0.1 on AWS EC2. CoreDNS is showing CrashLoopBackOff.
kubectl get -n kube-system pods
NAME READY STATUS RESTARTS AGE
coredns-6765558d84-6bpff 0/1 CrashLoopBackOff 531 44h
coredns-6765558d84-9mqz6 0/1 CrashLoopBackOff 531 44h
The logs of these pods are showing:
2019-05-22T06:29:40.959Z [FATAL] plugin/loop: Loop (127.0.0.1:53726 -> :53) detected for zone ".", see https://coredns.io/plugins/loop#troubleshooting. Query: "HINFO 1771143215983809104.2668792180170228628."
I did try to remove the word loop from the CoreDNS config file, as per another Stack Overflow ticket. CoreDNS started working, but the proxy stopped!
I am trying to set up cluster logging following the link below:
http://kubernetes.io/v1.0/docs/getting-started-guides/logging-elasticsearch.html
My config-default.sh:
# Optional: Enable node logging.
ENABLE_NODE_LOGGING=true
LOGGING_DESTINATION=${LOGGING_DESTINATION:-elasticsearch}
# Optional: When set to true, Elasticsearch and Kibana will be setup as part of the cluster bring up.
ENABLE_CLUSTER_LOGGING=true
ELASTICSEARCH_LOGGING_REPLICAS=${ELASTICSEARCH_LOGGING_REPLICAS:-1}
Command
$ sudo kubectl get pods --namespace=kube-system
NAME READY STATUS RESTARTS AGE
kube-dns-v9-epplg 4/4 Running 0 20h
kube-ui-v3-i4von 1/1 Running 0 18h
As you can see, I enabled logging and set the logging destination to elasticsearch. I don't see elasticsearch-logging, fluentd-elasticsearch, or kibana-logging when I run get pods. It seems like the replication controllers, services, or pods were not created. Do I need to do anything else to bring up Elasticsearch and Kibana?
Where are you starting your cluster? I tried to reproduce this on GCE using both the 1.0.7 release and from HEAD and wasn't able to.
Using the 1.0.7 release:
$ kubectl get pods --namespace=kube-system
NAME READY STATUS RESTARTS AGE
elasticsearch-logging-v1-6x82b 1/1 Running 0 3m
elasticsearch-logging-v1-s4bj5 1/1 Running 0 3m
fluentd-elasticsearch-kubernetes-minion-ijpr 1/1 Running 0 1m
fluentd-elasticsearch-kubernetes-minion-nrya 1/1 Running 0 2m
fluentd-elasticsearch-kubernetes-minion-ppls 1/1 Running 0 1m
fluentd-elasticsearch-kubernetes-minion-sy4x 1/1 Running 0 2m
kibana-logging-v1-6qka9 1/1 Running 0 3m
kube-dns-v8-9hyzm 4/4 Running 0 3m
kube-ui-v1-11r3b 1/1 Running 0 3m
monitoring-heapster-v6-4uzam 1/1 Running 1 3m
monitoring-influx-grafana-v1-euc3a 2/2 Running 0 3m
From head:
$ kubectl get pods --namespace=kube-system
NAME READY STATUS RESTARTS AGE
elasticsearch-logging-v1-9gqs8 1/1 Running 0 3m
elasticsearch-logging-v1-edb97 1/1 Running 0 3m
etcd-server-events-kubernetes-master 1/1 Running 0 3m
etcd-server-kubernetes-master 1/1 Running 0 3m
fluentd-elasticsearch-kubernetes-master 1/1 Running 0 2m
fluentd-elasticsearch-kubernetes-minion-6id6 1/1 Running 0 1m
fluentd-elasticsearch-kubernetes-minion-n25a 1/1 Running 0 1m
fluentd-elasticsearch-kubernetes-minion-x4wa 1/1 Running 0 1m
heapster-v10-ek03n 1/1 Running 0 3m
kibana-logging-v1-ybsad 1/1 Running 0 3m
kube-apiserver-kubernetes-master 1/1 Running 0 3m
kube-controller-manager-kubernetes-master 1/1 Running 0 3m
kube-dns-v9-dkmad 4/4 Running 0 3m
kube-scheduler-kubernetes-master 1/1 Running 0 3m
kube-ui-v3-mt7nw 1/1 Running 0 3m
l7-lb-controller-b56yf 2/2 Running 0 3m
monitoring-influxdb-grafana-v2-lxufh 2/2 Running 0 3m
The only thing I changed in config-default.sh is the KUBE_LOGGING_DESTINATION variable from gcp to elasticsearch:
$ git diff cluster/gce/config-default.sh
diff --git a/cluster/gce/config-default.sh b/cluster/gce/config-default.sh
index fd31820..2e37ebc 100755
--- a/cluster/gce/config-default.sh
+++ b/cluster/gce/config-default.sh
@@ -58,7 +58,7 @@ ENABLE_CLUSTER_MONITORING="${KUBE_ENABLE_CLUSTER_MONITORING:-googleinfluxdb}"
# Optional: Enable node logging.
ENABLE_NODE_LOGGING="${KUBE_ENABLE_NODE_LOGGING:-true}"
-LOGGING_DESTINATION="${KUBE_LOGGING_DESTINATION:-gcp}" # options: elasticsearch, gcp
+LOGGING_DESTINATION="${KUBE_LOGGING_DESTINATION:-elasticsearch}" # options: elasticsearch, gcp
# Optional: When set to true, Elasticsearch and Kibana will be setup as part of the cluster bring up.
ENABLE_CLUSTER_LOGGING="${KUBE_ENABLE_CLUSTER_LOGGING:-true}"